Python MediaWiki Bot Framework

Pywikibot

The Pywikibot framework is a Python library that interfaces with the MediaWiki API version 1.23 or higher.

Also included are various general-purpose scripts that can be adapted for different tasks.

For further information about the library (excluding scripts), see the full code documentation.

Quick start

pip install requests
git clone https://gerrit.wikimedia.org/r/pywikibot/core.git
cd core
git submodule update --init
python pwb.py script_name

Or to install using PyPI (excluding scripts)

pip install -U setuptools
pip install pywikibot
pwb <scriptname>

In addition, a MediaWiki markup parser is required. Please install one of the following:

pip install mwparserfromhell

or

pip install wikitextparser

Our installation guide has more details for advanced usage.

Basic Usage

If you wish to write your own script, it’s very easy to get started:

import pywikibot
site = pywikibot.Site('en', 'wikipedia')  # The site we want to run our bot on
page = pywikibot.Page(site, 'Wikipedia:Sandbox')
page.text = page.text.replace('foo', 'bar')
page.save('Replacing "foo" with "bar"')  # Saves the page

Wikibase Usage

Wikibase is a flexible knowledge base software that drives Wikidata. A sample pywikibot script for getting data from Wikibase:

import pywikibot
site = pywikibot.Site('wikipedia:en')
repo = site.data_repository()  # the Wikibase repository for given site
page = repo.page_from_repository('Q91')  # create a local page for the given item
item = pywikibot.ItemPage(repo, 'Q91')  # a repository item
data = item.get()  # get all item data from repository for this item

Script example

Pywikibot provides bot classes to develop your own script easily:

import pywikibot
from pywikibot import pagegenerators
from pywikibot.bot import ExistingPageBot

class MyBot(ExistingPageBot):

    update_options = {
        'text': 'This is a test text',
        'summary': 'Bot: a bot test edit with Pywikibot.'
    }

    def treat_page(self):
        """Load the given page, do some changes, and save it."""
        text = self.current_page.text
        text += '\n' + self.opt.text
        self.put_current(text, summary=self.opt.summary)

def main():
    """Parse command line arguments and invoke bot."""
    options = {}
    gen_factory = pagegenerators.GeneratorFactory()
    # Option parsing
    local_args = pywikibot.handle_args()  # global options
    local_args = gen_factory.handle_args(local_args)  # generators options
    for arg in local_args:
        opt, sep, value = arg.partition(':')
        if opt in ('-summary', '-text'):
            options[opt[1:]] = value
    MyBot(generator=gen_factory.getCombinedGenerator(), **options).run()

if __name__ == '__main__':
    main()
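
The option loop in main() relies on str.partition(':'), which splits on the first colon only, so a value that itself contains a colon (such as an edit summary starting with "Bot:") is kept whole. Below is a minimal standalone sketch of that parsing step. It does not import Pywikibot; the helper name parse_options and the sample arguments are assumptions made for illustration.

```python
def parse_options(local_args, allowed=('-summary', '-text')):
    """Collect '-name:value' arguments into a dict keyed by 'name'."""
    options = {}
    for arg in local_args:
        # partition() splits on the first ':' only; value is '' if there is none
        opt, sep, value = arg.partition(':')
        if opt in allowed:
            options[opt[1:]] = value  # drop the leading '-'
    return options


# Hypothetical sample arguments; '-unknown:x' is ignored by the filter.
sample = ['-summary:Bot: test edit', '-text:Hello', '-unknown:x']
print(parse_options(sample))
```

The resulting dict can be passed as keyword arguments to a bot class, as the script above does with **options.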

For more documentation on Pywikibot see our docs.

Required external programs

Pywikibot may require the following programs to function properly:

  • 7za: To extract 7z files

Roadmap

Current release 7.2.0

  • Make logging system consistent, add pywikibot.info() alias for pywikibot.output() (T85620)

  • L10N updates

  • Circumvent circular import in tools module (T306760)

  • Don’t fix html inside syntaxhighlight parts in fixes.py (T306723)

  • Make layer parameter optional in pywikibot.debug() (T85620)

  • Retry for internal_api_error_DBQueryTimeoutError errors due to T297708

  • Handle ParserError within xmlreader.XmlDump.parse() instead of raising an exception (T306134)

  • XMLDumpOldPageGenerator is deprecated in favour of a content parameter (T306134)

  • use_disambig BaseBot attribute was added to handle disambig skipping

  • Deprecate RedirectPageBot and NoRedirectPageBot in favour of use_redirects attribute

  • tools.formatter.color_format is deprecated and will be removed

  • A new and easier color format was implemented; colors can be used like:

    'this is a <<green>>colored<<default>> text'

  • Unused and unsupported xmlreader.XmlParserThread was removed

  • Use uppercased IP user titles (T306291)

  • Use pathlib to extract filename and file_package in pwb.py

  • Fix isbn messages in fixes.py (T306166)

  • Fix Page.revisions() with starttime (T109181)

  • Use stream_output for messages inside input_list_choice method (T305940)

  • Expand simulate query result (T305918)

  • Do not delete text when updating a Revision (T304786)

  • Re-enable scripts package version check with pwb wrapper (T305799)

  • Provide textlib.ignore_case() as a public method

  • Don’t try to upcast timestamp from global userinfo if global account does not exist (T305351)

  • Archived scripts were removed; create a Phabricator task to restore some (T223826)

  • Add Lexeme support for Lexicographical data (T189321, T305297)

  • Enable all parameters of APISite.imageusage() with FilePage.usingPages()

  • Don’t raise NoPageError with file_is_shared (T305182)

  • Fix URL of GoogleOCR

  • Handle ratelimit with purgepages() (T152597)

  • Add movesubpages parameter to Page.move() and APISite.movepage() (T57084)

  • Do not iterate over sys.modules (T304785)
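
To make the new color markup listed above more concrete, here is a rough sketch of how a string such as 'this is a <<green>>colored<<default>> text' could be split into colored fragments. This is not Pywikibot's implementation: the function name split_colors is hypothetical, and the sketch assumes that any word inside << >> is a color name.

```python
import re

# Assumption: a color tag is a single word wrapped in << and >>.
COLOR_TAG = re.compile(r'<<(\w+)>>')


def split_colors(text, default='default'):
    """Return (color, fragment) pairs for text that uses <<color>> tags."""
    # With one capture group, split() alternates: text, color, text, color, ...
    parts = COLOR_TAG.split(text)
    pairs = []
    color = default
    for index, part in enumerate(parts):
        if index % 2:       # odd positions hold the captured color names
            color = part
        elif part:          # even positions hold text; skip empty fragments
            pairs.append((color, part))
    return pairs


print(split_colors('this is a <<green>>colored<<default>> text'))
```

Each pair names the color that applies to the fragment that follows the tag, which is the information a terminal writer would need to emit the matching escape codes.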

Deprecations

  • Python 3.5 support will be dropped with Pywikibot 8 (T301908)

  • 7.2.0: XMLDumpOldPageGenerator is deprecated in favour of a content parameter (T306134)

  • 7.2.0: RedirectPageBot and NoRedirectPageBot bot classes are deprecated in favour of use_redirects attribute

  • 7.2.0: tools.formatter.color_format is deprecated and will be removed

  • 7.1.0: win32_unicode.py will be removed with Pywikibot 8

  • 7.1.0: Unused get_redirect parameter of Page.getOldVersion() will be removed

  • 7.1.0: APISite._simple_request() will be removed in favour of APISite.simple_request()

  • 7.0.0: The i18n identifier ‘cosmetic_changes-append’ will be removed in favour of ‘pywikibot-cosmetic-changes’

  • 7.0.0: User.isBlocked() method is renamed to is_blocked for consistency

  • 7.0.0: Require mysql >= 0.7.11 (T216741)

  • 7.0.0: Private BaseBot counters _treat_counter, _save_counter, _skip_counter will be removed in favour of collections.Counter counter attribute

  • 7.0.0: A boolean watch parameter in Page.save() is deprecated and will be desupported

  • 7.0.0: baserevid parameter of editSource(), editQualifier(), removeClaims(), removeSources(), remove_qualifiers() DataSite methods will be removed

  • 7.0.0: Values of APISite.allpages() parameter filterredir other than True, False and None are deprecated

  • 6.5.0: OutputOption.output() method will be removed in favour of OutputOption.out property

  • 6.5.0: Infinite rotating file handler with logfilecount of -1 is deprecated

  • 6.4.0: ‘allow_duplicates’ parameter of tools.intersect_generators as positional argument is deprecated, use keyword argument instead

  • 6.4.0: ‘iterables’ of tools.intersect_generators given as a list or tuple is deprecated, either use consecutive iterables or use ‘*’ to unpack

  • 6.2.0: outputter of OutputProxyOption without out property is deprecated

  • 6.2.0: ContextOption.output_range() and HighlightContextOption.output_range() are deprecated

  • 6.2.0: Error messages with ‘%’ style are deprecated in favour of str.format() style

  • 6.2.0: page.url2unicode() function is deprecated in favour of tools.chars.url2string()

  • 6.2.0: Throttle.multiplydelay attribute is deprecated

  • 6.2.0: SequenceOutputter.format_list() is deprecated in favour of ‘out’ property

  • 6.0.0: config.register_family_file() is deprecated

  • 5.5.0: APISite.redirectRegex() is deprecated in favour of APISite.redirect_regex() and will be removed with Pywikibot 8

  • 4.0.0: Revision.parent_id is deprecated in favour of Revision.parentid and will be removed with Pywikibot 8

  • 4.0.0: Revision.content_model is deprecated in favour of Revision.contentmodel and will be removed with Pywikibot 8
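
The two 6.4.0 entries for tools.intersect_generators above describe a change in calling convention: pass the iterables as consecutive arguments (or unpack a list with ‘*’) and give allow_duplicates by keyword. The sketch below shows only that call shape. The function intersect is a hypothetical stand-in written for this example, not Pywikibot's code, and its handling of duplicates is a simplification.

```python
def intersect(*iterables, allow_duplicates=False):
    """Hypothetical stand-in: yield items of the first iterable found in all."""
    if not iterables:
        return
    first, *rest = iterables
    others = [set(iterable) for iterable in rest]
    seen = set()
    for item in first:
        if item in seen and not allow_duplicates:
            continue  # already yielded once
        if all(item in other for other in others):
            seen.add(item)
            yield item


gens = [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 6]]

# Recommended style: unpack the list, pass the flag by keyword.
print(list(intersect(*gens)))
print(list(intersect([1, 1, 2], [1, 2], allow_duplicates=True)))
```

Because allow_duplicates follows *iterables in the signature, it can only be given by keyword, which matches the direction of the deprecation notes.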

Release history

See https://github.com/wikimedia/pywikibot/blob/stable/HISTORY.rst

Contributing

Our code is maintained on Wikimedia’s Gerrit installation; learn how to get started.

Code of Conduct

The development of this software is covered by a Code of Conduct.
