
A Python wrapper for working with the Scrapyd API


Allows a Python application to talk to, and therefore control, the Scrapy daemon: Scrapyd.

Install

The easiest installation is via pip:

pip install python-scrapyd-api

Quick Usage

Please refer to the full documentation for more detailed usage, but to get you started:

>>> from scrapyd_api import ScrapydAPI
>>> scrapyd = ScrapydAPI('http://localhost:6800')

Add a project egg as a new version:

>>> egg = open('some_egg.egg', 'rb')
>>> scrapyd.add_version('project_name', 'version_name', egg)
# Returns the number of spiders in the project.
3
>>> egg.close()
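
Outside the interactive shell, a context manager is a safer way to handle the egg file; a minimal sketch, assuming the same project, version and egg names as above:

from scrapyd_api import ScrapydAPI

scrapyd = ScrapydAPI('http://localhost:6800')

# Open the egg in binary mode; the context manager closes it even if
# add_version raises.
with open('some_egg.egg', 'rb') as egg:
    spider_count = scrapyd.add_version('project_name', 'version_name', egg)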

Cancel a scheduled job:

>>> scrapyd.cancel('project_name', '14a6599ef67111e38a0e080027880ca6')
# Returns True if the request was met with an OK response.
True

Delete a project and all of its versions:

>>> scrapyd.delete_project('project_name')
# Returns True if the request was met with an OK response.
True

Delete a version of a project:

>>> scrapyd.delete_version('project_name', 'version_name')
# Returns True if the request was met with an OK response.
True

Request status of a job:

>>> scrapyd.job_status('project_name', '14a6599ef67111e38a0e080027880ca6')
# Returns 'running', 'pending', 'finished' or '' for an unknown state.
'running'

List all jobs registered:

>>> scrapyd.list_jobs('project_name')
# Returns a dict of running, finished and pending job lists.
{
    'pending': [
        {
            u'id': u'24c35...f12ae',
            u'spider': u'spider_name'
        },
    ],
    'running': [
        {
            u'id': u'14a65...b27ce',
            u'spider': u'spider_name',
            u'start_time': u'2014-06-17 22:45:31.975358'
        },
    ],
    'finished': [
        {
            u'id': u'34c23...b21ba',
            u'spider': u'spider_name',
            u'start_time': u'2014-06-17 22:45:31.975358',
            u'end_time': u'2014-06-23 14:01:18.209680'
        }
    ]
}
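
The returned lists combine naturally with cancel(); a minimal sketch, using a hypothetical project name, that cancels every pending job:

from scrapyd_api import ScrapydAPI

scrapyd = ScrapydAPI('http://localhost:6800')

# Cancel each job that is still waiting in this project's queue.
for job in scrapyd.list_jobs('project_name')['pending']:
    scrapyd.cancel('project_name', job['id'])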

List all projects registered:

>>> scrapyd.list_projects()
[u'ecom_project', u'estate_agent_project', u'car_project']

List all spiders available to a given project:

>>> scrapyd.list_spiders('project_name')
[u'raw_spider', u'js_enhanced_spider', u'selenium_spider']

List all versions registered to a given project:

>>> scrapyd.list_versions('project_name')
[u'345', u'346', u'347', u'348']
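
This pairs well with delete_version for pruning old deployments; a minimal sketch, assuming versions are listed oldest-first as in the output above:

from scrapyd_api import ScrapydAPI

scrapyd = ScrapydAPI('http://localhost:6800')

# Assumes list_versions() returns versions oldest-first, as in the
# example output above; deletes everything except the newest version.
versions = scrapyd.list_versions('project_name')
for version in versions[:-1]:
    scrapyd.delete_version('project_name', version)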

Schedule a job to run with a specific spider:

>>> scrapyd.schedule('project_name', 'spider_name')
# Returns the Scrapyd job id.
u'14a6599ef67111e38a0e080027880ca6'

Schedule a job to run while passing override settings:

>>> settings = {'DOWNLOAD_DELAY': 2}
>>> scrapyd.schedule('project_name', 'spider_name', settings=settings)
u'25b6588ef67333e38a0e080027880de7'

Schedule a job to run while passing extra attributes to spider initialisation:

>>> scrapyd.schedule('project_name', 'spider_name', extra_attribute='value')
# NB: 'project', 'spider' and 'settings' are reserved kwargs for this
# method and therefore these names should be avoided when trying to pass
# extra attributes to the spider init.
u'25b6588ef67333e38a0e080027880de7'
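
Putting schedule and job_status together, a minimal sketch (local Scrapyd instance, hypothetical project and spider names) that schedules a spider and polls until the job finishes:

import time

from scrapyd_api import ScrapydAPI

scrapyd = ScrapydAPI('http://localhost:6800')

# Schedule the spider, then poll its status every few seconds until
# Scrapyd reports the job as finished.
job_id = scrapyd.schedule('project_name', 'spider_name')
while scrapyd.job_status('project_name', job_id) != 'finished':
    time.sleep(5)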

Setting up the project to contribute code

Please see CONTRIBUTING.rst, which is also mirrored in the full documentation. It will guide you through our pull request guidelines, project setup and testing requirements.

License

2-clause BSD. See the full LICENSE.

History

0.2.0 (2015-01-14)

  • Added the new job_status method, which retrieves the status of a specific job from a project. See the docs for usage.

  • Increased and improved test coverage.

0.1.0 (2014-09-16)

  • First release on PyPI.

Download files

Source distribution: python-scrapyd-api-0.2.0.tar.gz (23.3 kB)

Built distribution: python_scrapyd_api-0.2.0-py2.py3-none-any.whl (10.3 kB, Python 2 and Python 3)
