Python MediaWiki Bot Framework

Pywikibot

The Pywikibot framework is a Python library that interfaces with the API of MediaWiki version 1.27 or higher.

Also included are various general function scripts that can be adapted for different tasks.

For further information about the library, excluding scripts, see the full code documentation.

Quick start

git clone https://gerrit.wikimedia.org/r/pywikibot/core.git
cd core
git submodule update --init
pip install -r requirements.txt
python pwb.py <script_name>

Or, to install from PyPI (excluding scripts):

pip install pywikibot
pwb <script_name>

Our installation guide has more details for advanced usage.

Basic Usage

If you wish to write your own script it’s very easy to get started:

import pywikibot
site = pywikibot.Site('en', 'wikipedia')  # The site we want to run our bot on
page = pywikibot.Page(site, 'Wikipedia:Sandbox')
page.text = page.text.replace('foo', 'bar')
page.save('Replacing "foo" with "bar"')  # Saves the page

Wikibase Usage

Wikibase is the flexible knowledge base software that drives Wikidata. A sample Pywikibot script for getting data from Wikibase:

import pywikibot
site = pywikibot.Site('wikipedia:en')
repo = site.data_repository()  # the Wikibase repository for given site
page = repo.page_from_repository('Q91')  # create a local page for the given item
item = pywikibot.ItemPage(repo, 'Q91')  # a repository item
data = item.get()  # get all item data from repository for this item

Script example

Pywikibot provides bot classes to develop your own script easily:

import pywikibot
from pywikibot import pagegenerators
from pywikibot.bot import ExistingPageBot

class MyBot(ExistingPageBot):

    update_options = {
        'text': 'This is a test text',
        'summary': 'Bot: a bot test edit with Pywikibot.'
    }

    def treat_page(self):
        """Load the given page, do some changes, and save it."""
        text = self.current_page.text
        text += '\n' + self.opt.text
        self.put_current(text, summary=self.opt.summary)

def main(*args: str) -> None:
    """Parse command line arguments and invoke bot."""
    options = {}
    gen_factory = pagegenerators.GeneratorFactory()
    # Option parsing
    local_args = pywikibot.handle_args(args)  # global options
    local_args = gen_factory.handle_args(local_args)  # generators options
    for arg in local_args:
        opt, sep, value = arg.partition(':')
        if opt in ('-summary', '-text'):
            options[opt[1:]] = value
    MyBot(generator=gen_factory.getCombinedGenerator(), **options).run()

if __name__ == '__main__':
    main()
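
The option loop in main() relies on str.partition to split each argument at its first colon. A minimal standalone sketch of that step, runnable without Pywikibot; parse_options is a hypothetical helper name used only for illustration and is not part of the framework:

```python
def parse_options(local_args, known=('-summary', '-text')):
    """Turn '-name:value' arguments into a dict, as the loop in main() does."""
    options = {}
    for arg in local_args:
        # partition splits on the first ':' only, so values may contain colons
        opt, sep, value = arg.partition(':')
        if opt in known:
            options[opt[1:]] = value  # drop the leading '-'
    return options


print(parse_options(['-summary:Bot: test edit', '-text:Hello', '-unknown:x']))
# {'summary': 'Bot: test edit', 'text': 'Hello'}
```

Arguments that are not in the known tuple are silently skipped here, which matches the behaviour of the loop in the script example.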

For more documentation on Pywikibot see our docs.

Roadmap

Current release

Improvements
  • Python 3.13 is supported

  • Update tools._unidata._category_cf from Unicodedata version 15.1.0

  • Timestamp.nowutc() and Timestamp.utcnow() were added (T337748)

  • Remove content parameter of proofreadpage.IndexPage.page_gen method. (T358635)

  • Backport itertools.batched from Python 3.13 to backports.batched

  • A copy button was added to the sphinx documentation.

  • Make languages_by_size dynamic (T78396). The property is only available for family.WikimediaFamily families. The wikimedia_sites.py maintenance script was removed.

  • Add config.base_dir to scripts search path with pwb wrapper (T324287)

  • pywikibot.WbTime.equal_instant was added (T325248)

  • revisions parameter of xmlreader.XmlDump was introduced to specify parsing method (T340804)

  • Pass global -nolog argument into bot script from wrapper (T328900)

  • Add site.APISite.ratelimit() method and tools.collections.RateLimit NamedTuple (T304808)

  • L10N and i18n updates

  • Add pagegenerators.PagePilePageGenerator (T353086)
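
The batching behaviour behind the itertools.batched backport listed above can be sketched in a few lines. This is an illustrative stand-in written for this page, without the strict option, and is not the code shipped in backports.batched:

```python
from itertools import islice


def batched(iterable, n):
    """Yield successive tuples of up to n items; the last one may be shorter."""
    if n < 1:
        raise ValueError('n must be at least one')
    iterator = iter(iterable)
    while True:
        batch = tuple(islice(iterator, n))
        if not batch:
            return
        yield batch


print(list(batched('ABCDEFG', 3)))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G',)]
```

Grouping items this way is useful when an API accepts only a limited number of titles or IDs per request.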

Bugfixes
  • Timestamp.now() and Timestamp.fromtimestamp() also return a Timestamp object with Python 3.7

  • Populate pywikibot.MediaInfo._content with expected attributes when loaded (T357608)

  • Raise exceptions.APIError if the same error comes twice within data.api.Request.submit loop (T357870)

  • Only delegate site methods to public family.Family methods which have code as first parameter.

  • Use str instead of repr for several messages with family.Family objects (T356782)

  • Add hy to special languages in textlib.TimeStripper (T356175)

  • Pass login token when using action=login (T309898)

  • Detect range blocks with pywikibot.User.is_blocked (T301282)

  • Use only end tags in ElementTree.iterparse in xmlreader module (T354095)

  • Suppress error in cosmetic_changes.CosmeticChangesToolkit.cleanUpLinks (T337045)

  • pywikibot.input_choice validates default parameter (T353097)

  • Remove typing imports from user-config.py file (T352965)

Breaking changes and code cleanups
  • Cache directory was renamed from apicache-py3 to apicache due to timestamp changes. (T337748) Warning: Do not use Pywikibot 9+ together with Pywikibot 3.0.20181203 and below.

  • Raise TypeError instead of AttributeError in Site.randompages() if redirects parameter is invalid.

  • A RuntimeError will be raised if a family.Family subclass has an __init__ initializer method. family.Family.__post_init__ classmethod can be used instead.

  • InteractiveReplace was moved from bot to bot_choice module

  • userinterfaces.transliteration.transliterator was renamed to Transliterator

  • pywikibot.BaseSite and pywikibot.APISite were dropped. pywikibot.Site has to be used to create a site object.

  • next parameter of userinterfaces.transliteration.Transliterator.transliterate was renamed to succ

  • type parameter of site.APISite.protectedpages() was renamed to protect_type

  • all parameter of site.APISite.namespace() was renamed to all_ns

  • filter parameter of date.dh was renamed to filter_func

  • dict parameter of data.api.OptionSet was renamed to data

  • setuptools package is no longer mandatory but required for tests (T340640, T347052, T354515)

  • root attribute of xmlreader.XmlDump was removed

  • tools.Version class was removed; use classes from packaging.version instead (T340640)

  • packaging package is mandatory; importlib_metadata package is required for Python 3.7 (T340640)

  • SelfCallMixin, SelfCallDict and SelfCallString of tools module were removed

  • Calling site.BaseSite.sitename as a function is no longer supported

  • config.register_family_file() function was removed

  • require PyMySQL >= 1.0.0 if necessary

  • keys() and items() methods of data.api.Request give a view instead of a list (T310953)

  • SequenceOutputter.format_list() was removed in favour of tools.formatter.SequenceOutputter.out property

  • output parameter of bot_choice.OutputProxyOption (i.e. OutputOption instance) without out property is no longer supported

  • OutputOption.output() method was removed

  • ContextOption.output_range() and HighlightContextOption.output_range() methods were removed

  • page.url2unicode() function was removed in favour of tools.chars.url2string

  • iterables of tools.itertools.intersect_generators must not be given as a single list or tuple; either consecutive iterables must be used or ‘*’ to unpack

  • allow_duplicates parameter of tools.itertools.intersect_generators must be given as keyword argument

  • Infinite rotating file handler with config.logfilesize of -1 is no longer supported

  • Throttle.multiplydelay attribute was removed

  • Python 3.6 support was dropped (T347026)
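
The switch of keys() and items() to views follows the behaviour of built-in dictionaries. A sketch using a plain dict as a stand-in (an assumption for illustration; this is not the data.api.Request class itself) shows what code written for lists has to adjust:

```python
# Plain dict standing in for request parameters (illustrative assumption).
params = {'action': 'query', 'prop': 'info'}

keys = params.keys()             # a view object, not a list
params['titles'] = 'Main Page'
print('titles' in keys)          # True: a view reflects later changes

# Views cannot be indexed; wrap them in list() where a list is needed.
try:
    keys[0]
except TypeError as err:
    print('not indexable:', type(err).__name__)

first = list(keys)[0]
print(first)                     # action
```

Code that only iterates over keys() or items() keeps working unchanged; only indexing, slicing or list methods need the explicit list() call.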

Deprecations

  • 9.0.0: The content parameter of proofreadpage.IndexPage.page_gen is deprecated and will be ignored (T358635)

  • 9.0.0: userinterfaces.transliteration.transliterator was renamed to Transliterator

  • 9.0.0: next parameter of userinterfaces.transliteration.transliterator.transliterate was renamed to succ

  • 9.0.0: type parameter of site.APISite.protectedpages() was renamed to protect_type

  • 9.0.0: all parameter of site.APISite.namespace() was renamed to all_ns

  • 9.0.0: filter parameter of date.dh was renamed to filter_func

  • 9.0.0: dict parameter of data.api.OptionSet was renamed to data

  • 9.0.0: pywikibot.version.get_toolforge_hostname() is deprecated without replacement

  • 9.0.0: allrevisions parameter of xmlreader.XmlDump is deprecated, use revisions instead (T340804)

  • 9.0.0: iteritems method of data.api.Request will be removed in favour of items

  • 9.0.0: SequenceOutputter.output() is deprecated in favour of tools.formatter.SequenceOutputter.out property

  • 9.0.0: nullcontext context manager and SimpleQueue queue of backports are deprecated

  • 8.4.0: modules_only_mode parameter of data.api.ParamInfo, its paraminfo_keys class attribute and its preloaded_modules property will be removed

  • 8.4.0: dropdelay and releasepid attributes of throttle.Throttle will be removed in favour of expiry class attribute

  • 8.2.0: tools.itertools.itergroup will be removed in favour of backports.batched

  • 8.2.0: normalize parameter of WbTime.toTimestr and WbTime.toWikibase will be removed

  • 8.1.0: Dependency of exceptions.NoSiteLinkError from exceptions.NoPageError will be removed

  • 8.1.0: exceptions.Server414Error is deprecated in favour of exceptions.Client414Error

  • 8.0.0: Timestamp.clone() method is deprecated in favour of Timestamp.replace() method.

  • 8.0.0: family.Family.maximum_GET_length method is deprecated in favour of config.maximum_GET_length (T325957)

  • 8.0.0: addOnly parameter of textlib.replaceLanguageLinks and textlib.replaceCategoryLinks is deprecated in favour of add_only

  • 8.0.0: textlib.TimeStripper regex attributes ptimeR, ptimeznR, pyearR, pmonthR, pdayR are deprecated in favour of patterns attribute which is a textlib.TimeStripperPatterns.

  • 8.0.0: textlib.TimeStripper groups attribute is deprecated in favour of textlib.TIMEGROUPS

  • 8.0.0: LoginManager.get_login_token was replaced by login.ClientLoginManager.site.tokens['login']

  • 8.0.0: data.api.LoginManager() is deprecated in favour of login.ClientLoginManager

  • 8.0.0: APISite.messages() method is deprecated in favour of userinfo['messages']

  • 8.0.0: Page.editTime() method is deprecated and should be replaced by Page.latest_revision.timestamp

Will be removed in Pywikibot 10
  • 7.7.0: tools.threading classes should no longer be imported from tools

  • 7.6.0: tools.itertools datatypes should no longer be imported from tools

  • 7.6.0: tools.collections datatypes should no longer be imported from tools

  • 7.5.0: textlib.tzoneFixedOffset class will be removed in favour of time.TZoneFixedOffset

  • 7.4.0: FilePage.usingPages() was renamed to using_pages()

  • 7.3.0: Old color escape sequences like \03{color} are deprecated in favour of new color format like <<color>>

  • 7.3.0: linktrail method of family.Family is deprecated; use APISite.linktrail() instead

  • 7.2.0: tb parameter of exception() function was renamed to exc_info

  • 7.2.0: XMLDumpOldPageGenerator is deprecated in favour of a content parameter of XMLDumpPageGenerator (T306134)

  • 7.2.0: RedirectPageBot and NoRedirectPageBot bot classes are deprecated in favour of use_redirects attribute

  • 7.2.0: tools.formatter.color_format is deprecated and will be removed

  • 7.1.0: Unused get_redirect parameter of Page.getOldVersion() will be removed

  • 7.0.0: User.isBlocked() method is renamed to is_blocked for consistency

  • 7.0.0: A boolean watch parameter in Page.save() is deprecated and will be desupported

  • 7.0.0: baserevid parameter of editSource(), editQualifier(), removeClaims(), removeSources(), remove_qualifiers() DataSite methods will be removed

  • 7.0.0: Values of APISite.allpages() parameter filterredir other than True, False and None are deprecated

  • 7.0.0: The i18n identifier ‘cosmetic_changes-append’ will be removed in favour of ‘pywikibot-cosmetic-changes’
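
For scripts that still contain old color escape sequences, the migration to the <<color>> format noted above can be sketched with a regular expression. convert_color_tags is a hypothetical helper written for this illustration, not a Pywikibot function, and it assumes the old markers are the control character '\03' followed by a lowercase color name in braces:

```python
import re

# Old markers: control character '\03' followed by '{color}' or '{fg;bg}'.
OLD_COLOR = re.compile('\03' + r'\{([a-z]+(?:;[a-z]+)?)\}')


def convert_color_tags(text):
    """Rewrite old-style color markers to the <<color>> form."""
    return OLD_COLOR.sub(r'<<\1>>', text)


old = '\03{lightred}Warning\03{default}: check this page'
print(convert_color_tags(old))
# <<lightred>>Warning<<default>>: check this page
```

Text without old markers passes through unchanged, so the helper can be applied to a whole message before it is printed.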

Release history

See https://github.com/wikimedia/pywikibot/blob/stable/HISTORY.rst

Contributing

Our code is maintained on Wikimedia’s Gerrit installation; learn how to get started.

Code of Conduct

The development of this software is covered by a Code of Conduct.


This version

9.0.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pywikibot-9.0.0.tar.gz (612.8 kB)

Uploaded Source

Built Distribution

pywikibot-9.0.0-py3-none-any.whl (713.8 kB)

Uploaded Python 3
