Parker
======

Parker is a Python-based web spider for collecting specific data
across a set of configured sites.

Non-Python requirements:

- Redis - for task queuing and visit tracking
- libxml - for HTML parsing of pages

Installation
------------

Install using ``pip``::

    $ pip install parker

Configuration
-------------

To configure Parker, you will need to install the configuration
files in a suitable location for the user running Parker. To do
this, use the ``parker-config`` script. For example::

    $ parker-config ~/.parker

This installs the configuration files under ``~/.parker`` in your home
directory and prints the matching ``PARKER_CONFIG`` environment variable
for you to set in your ``.bashrc``.
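
As a rough sketch, assuming the configuration was installed to
``~/.parker`` as in the example above and that ``PARKER_CONFIG`` takes
that directory path, the line added to ``.bashrc`` might look like the
following. The exact line printed by ``parker-config`` may differ, so
prefer its output over this illustration::

    # Assumed form; copy the line printed by parker-config if it differs.
    export PARKER_CONFIG="$HOME/.parker"

Open a new shell, or run ``source ~/.bashrc``, for the variable to take
effect.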


News
====

0.4.0
----------

- Added handling for a ``PARKER_CONFIG`` environment variable, allowing
  users to specify where configuration files are loaded from.

- Added the ``parker-config`` script to install default configuration
  files to a passed location. It also prints an example
  ``PARKER_CONFIG`` environment variable to add to your profile files.

- Updated documentation to use proper reStructuredText files.

- Added a CHANGES file to track updates.
