wikidump 0.1.3

Tools to manipulate and extract data from Wikipedia dumps

wikidump

Introduction

This module contains code for manipulating the Wikipedia dumps available from http://download.wikimedia.org/backup-index.html

Installation

This module is published on PyPI and can be installed with easy_install, for example:

easy_install wikidump

Alternatively, you can use pip:

pip install wikidump

I highly recommend using virtualenv to isolate the install environment.

For those on Ubuntu systems, a built package is available in a PPA. Please see the PPA for details on how to install from it.

Configuration

The first time the module is imported, a file ‘wikidump.cfg’ is created. Modify the paths in this file to point to your data.

  • scratch : where indices are stored (must be writeable)
  • xml_dumps : where the XML dumps are located (can be read-only)
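
The permission requirements above can be checked before running anything expensive. The following is a minimal sketch for a current Python 3 interpreter, not part of wikidump itself. It assumes that wikidump.cfg is an INI-style file readable by configparser and that the two keys live in a section named "paths"; both the file format and the section name are assumptions, so adjust them to match the file that was actually generated. The function name check_wikidump_paths is hypothetical.

```python
import configparser
import os


def check_wikidump_paths(cfg_path, section="paths"):
    """Return a list of problems with the configured directories.

    ASSUMPTION: the config is INI-style and the section name "paths"
    is hypothetical; neither is confirmed by the wikidump documentation.
    """
    # Disable interpolation so a '%' in a path is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(cfg_path):
        return ["cannot read %s" % cfg_path]
    if not parser.has_section(section):
        return ["missing section [%s]" % section]

    problems = []

    # scratch holds the indices, so it must exist and be writable.
    scratch = parser.get(section, "scratch", fallback=None)
    if not scratch or not os.path.isdir(scratch):
        problems.append("scratch is not a directory")
    elif not os.access(scratch, os.W_OK):
        problems.append("scratch is not writable")

    # xml_dumps only needs to be readable.
    xml_dumps = parser.get(section, "xml_dumps", fallback=None)
    if not xml_dumps or not os.path.isdir(xml_dumps):
        problems.append("xml_dumps is not a directory")
    elif not os.access(xml_dumps, os.R_OK):
        problems.append("xml_dumps is not readable")

    return problems
```

Calling check_wikidump_paths("wikidump.cfg") returns an empty list when both directories look usable, and a list of short messages otherwise.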

Usage

In addition to the Python modules, wikidump comes with a command-line tool for quick access to its functionality. Run "wikidump help" for a list of options.

News

0.1

Release date: 04-Aug-2010

  • Initial release of wikidump module

0.1.3

Release date: 10-Apr-2013

  • Rewrote CLI
 
File                         Type    Py Version  Uploaded on  Size
wikidump-0.1.3.tar.gz (md5)  Source              2013-04-10   16KB