
Client library to process URLs through Zyte API



Installation

pip install scrapy-zyte-api

This package requires Python 3.7+.

How to configure

In the settings.py of your Scrapy project, replace the default http and https download handlers in Scrapy’s DOWNLOAD_HANDLERS setting with the Zyte API download handler.

You also need to set ZYTE_API_KEY to your Zyte API key.

DOWNLOAD_HANDLERS = {
    "http": "scrapy_zyte_api.handler.ScrapyZyteAPIDownloadHandler",
    "https": "scrapy_zyte_api.handler.ScrapyZyteAPIDownloadHandler"
}

# Alternatively, define ZYTE_API_KEY as an environment variable instead.
ZYTE_API_KEY = "<your API key>"

Make sure to also install the asyncio-based Twisted reactor in settings.py:

TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
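
If you prefer to assemble these settings in code, for example when running a spider from a script, the configuration above could be collected in one place. The following is only a sketch: the zyte_api_settings function and the HANDLER and REACTOR constants are hypothetical names used for illustration and are not part of scrapy-zyte-api.

```python
import os

# Values copied from the settings shown above.
HANDLER = "scrapy_zyte_api.handler.ScrapyZyteAPIDownloadHandler"
REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"


def zyte_api_settings(api_key=None):
    """Return the settings described above as a plain dict (hypothetical helper)."""
    settings = {
        "DOWNLOAD_HANDLERS": {"http": HANDLER, "https": HANDLER},
        "TWISTED_REACTOR": REACTOR,
    }
    # Prefer an explicit key; otherwise fall back to the environment variable.
    key = api_key or os.environ.get("ZYTE_API_KEY")
    if key:
        settings["ZYTE_API_KEY"] = key
    return settings
```

The resulting dict should be usable wherever Scrapy accepts a settings mapping, such as CrawlerProcess(settings=zyte_api_settings()), assuming the reactor has not already been installed by other code.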

How to use

To download a request through Zyte API, set the zyte_api key in Request.meta to a dictionary of Zyte API parameters. The full list of parameters is provided in the Zyte API Specification.

yield scrapy.Request(
    "http://books.toscrape.com/",
    callback=self.parse,
    meta={
        "zyte_api": {
            "browserHtml": True,
            "geolocation": "US",
            "javascript": True,
            "echoData": {"something": True}
        }
    }
)
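
When many requests share the same Zyte API parameters, the meta dictionary can be built by a small helper rather than repeated in every request. This is a minimal sketch, not part of the package: zyte_api_meta and DEFAULT_ZYTE_API_PARAMS are hypothetical names, and the defaults simply reuse two parameters from the example above.

```python
# Hypothetical project-wide defaults, reusing parameters from the example above.
DEFAULT_ZYTE_API_PARAMS = {"browserHtml": True, "geolocation": "US"}


def zyte_api_meta(overrides=None, base_meta=None):
    """Build a Request.meta dict whose "zyte_api" key merges defaults and overrides."""
    # Per-request overrides win over the defaults; neither input is mutated.
    params = {**DEFAULT_ZYTE_API_PARAMS, **(overrides or {})}
    meta = dict(base_meta or {})
    meta["zyte_api"] = params
    return meta
```

Inside a spider callback, this could be used as scrapy.Request(url, callback=self.parse, meta=zyte_api_meta({"geolocation": "DE"})).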

