A decorator for writing coroutine-like spider callbacks.

Project description

Scrapy Inline Requests

Requires Scrapy>=1.0 and supports Python 2.7+ and 3.4+.

Usage

The spider below shows a simple use case of scraping a page and following a few links:

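from scrapy import Spider, Request
from inline_requests import inline_requests

class MySpider(Spider):
    name = 'myspider'
    start_urls = ['http://httpbin.org/html']

    @inline_requests
    def parse(self, response):
        # Collect the URL of every response seen, starting with the first page.
        urls = [response.url]
        for i in range(10):
            # Yielding a Request pauses the callback; it resumes with the
            # response to that request as the value of the yield expression.
            next_resp = yield Request(response.urljoin('?page=%d' % i))
            urls.append(next_resp.url)
        # Yielding anything else (here, a dict item) is emitted as usual.
        yield {'urls': urls}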

See the examples/ directory for a more complex spider.
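
To illustrate the coroutine-like flow used above, here is a minimal standard-library sketch of how a generator callback can be driven: each yielded request is "downloaded" and its response is sent back into the generator. This is not the library's implementation. FakeRequest, FakeResponse, fake_download and drive are hypothetical names invented for this illustration, and no network access takes place.

```python
class FakeRequest:
    """Hypothetical stand-in for scrapy.Request; it only carries a URL."""
    def __init__(self, url):
        self.url = url


class FakeResponse:
    """Hypothetical stand-in for a Scrapy response; it only carries a URL."""
    def __init__(self, url):
        self.url = url


def fake_download(request):
    # Assumption: a real downloader would fetch the page; here we just echo the URL.
    return FakeResponse(request.url)


def drive(generator):
    """Run a generator callback to completion.

    Yielded FakeRequest objects are answered by sending a response back into
    the generator; any other yielded value is collected as an output item.
    """
    items = []
    try:
        value = next(generator)
        while True:
            if isinstance(value, FakeRequest):
                value = generator.send(fake_download(value))
            else:
                items.append(value)
                value = next(generator)
    except StopIteration:
        pass
    return items


def parse(response):
    # Same shape as the spider callback above, minus the Scrapy machinery.
    urls = [response.url]
    for i in range(3):
        next_resp = yield FakeRequest('%s?page=%d' % (response.url, i))
        urls.append(next_resp.url)
    yield {'urls': urls}


items = drive(parse(FakeResponse('http://httpbin.org/html')))
print(items)
```

The point of the sketch is only that the value of each yield expression is whatever the driver sends back, which is what lets the callback read like sequential code.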

Known Issues

  • Middlewares can drop or ignore non-200 responses, in which case the callback never resumes. To work around this, set the handle_httpstatus_all key in the request's meta so that every response reaches the callback. See the httperror middleware documentation.

  • High concurrency and large responses can cause higher memory usage.

  • This decorator assumes your method has the following signature: (self, response).

  • The decorated method must return a generator instance.
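
The last two constraints can be checked ahead of time with the standard library. The sketch below is an illustration only: check_callback is a hypothetical helper, not part of inline_requests, and nothing here implies the library performs these checks itself. It is also stricter than the rule above, since it requires a generator function, whereas a plain function that returns a generator would also satisfy "must return a generator instance". It assumes Python 3.

```python
import inspect


def check_callback(func):
    """Hypothetical helper: list the ways func breaks the constraints above."""
    problems = []
    # The decorator assumes exactly (self, response).
    params = list(inspect.signature(func).parameters)
    if params != ['self', 'response']:
        problems.append(
            'expected signature (self, response), got (%s)' % ', '.join(params)
        )
    # A function containing yield always returns a generator instance.
    if not inspect.isgeneratorfunction(func):
        problems.append('callback is not a generator function')
    return problems


class Good:
    def parse(self, response):
        yield {'url': response}


class Bad:
    def parse(self, response, extra):
        return [response]


print(check_callback(Good.parse))
print(check_callback(Bad.parse))
```

Good.parse passes both checks, while Bad.parse is reported for its extra parameter and for returning a list instead of yielding.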

History

0.3.0 (2016-06-24)

  • Backward incompatible change: Added more restrictions to the request object (no callback/errback).

  • Clean up callback/errback attributes before sending the request back to the generator.

  • Simplified example spider.

0.2.0 (2016-06-23)

  • Python 3 support.

0.1.2 (2016-05-22)

  • Scrapy API and documentation updates.

0.1.1 (2013-02-03)

  • Minor tweaks and fixes.

0.1.0 (2012-02-03)

  • First release on PyPI.

Download files

Source Distribution

scrapy-inline-requests-0.3.0.tar.gz (223.2 kB)

Uploaded Source

Built Distribution

scrapy_inline_requests-0.3.0-py2.py3-none-any.whl (9.6 kB)

Uploaded Python 2 Python 3
