
Index of Packages Matching 'scrapy'

Package Weight* Description
python-scrapyd-api 2.0.1 9 A Python wrapper for working with the Scrapyd API
s01.scrapy 0.16.2 9 Package for buildout based scrapy spider development
Scrapy 1.3.3 9 A high-level Web Crawling and Web Scraping framework
scrapy-beautifulsoup 0.0.2 9 Simple Scrapy middleware to process non-well-formed HTML with BeautifulSoup
scrapy-heroku 0.7.1 9 Utilities for running scrapy on heroku
scrapy-itemagic 0.2.4 9 Scrapy item parsing tools.
scrapy-nimbus 0.5.3 9 scrapy extend
scrapy-qiniu 0.1.2 9 Scrapy pipeline extension for
scrapy-random-useragent 0.2 9 Scrapy Middleware to set a random User-Agent for every Request.
scrapy-redirect 0.1.0 9 Restrict authorized Scrapy redirections to the website start_urls
scrapy-redis 0.6.8 9 Redis-based components for Scrapy.
scrapy-statsd 1.0.0a1 9 Publish Scrapy stats to statsd
scrapy-venom 0.1 9 Generic classes to deal with data scraping using Scrapy
scrapy-wayback-machine 1.0.0 9 A Scrapy middleware for scraping Wayback Machine snapshots from
scrapy_model 0.1.5 9 Scrapy helper to create scrapers from models
scrapybox 0.1 9 A Scrapy GUI
scrapyd-mongodb 0.1.0 9 Scrapyd Queue Management with MongoDB
ScrapyElasticSearch 0.8.9 9 Scrapy pipeline which allows you to store multiple scrapy items in Elastic Search.
scrapyscript 0.0.6 9 Run scrapy spider from a script or a Celery task - no project required
scrapy-dblite 0.2.7 8 Simple library for storing Scrapy Items in sqlite database
scrapy-djangoitem 1.1.1 8 Scrapy extension to write scraped items using Django models
scrapy-dotpersistence 0.3.0 8 Scrapy extension to sync `.scrapy` folder to an S3 bucket
scrapy-elasticsearch-bulk-item-exporter 0.2 8 An extension of Scrapy's JsonLinesItemExporter that exports to elasticsearch bulk format.
scrapy-feedexporter-sftp 1.0.4 8 Scrapy extension Feed Exporter Storage Backend to export items to an SFTP server
scrapy-jsonrpc 0.3.0 8 Scrapy extension to control spiders using JSON-RPC
scrapy-lambda 0.1.1 8 Scrapy pipeline which invokes a lambda with the scraped item
scrapy-mongodb 0.9.1 8 Pipeline to MongoDB for Scrapy. Supports MongoDB replica sets
scrapy-pagestorage 0.2.0 8 Scrapy extension to store info in storage service
scrapy-rethinkdb 0.0.4 8 Scrapy pipeline for rethinkdb.
scrapy-vampire 0.1.0 8 utils for scrapy
scrapyapperyio 0.1.3 8 Scrapy pipeline which allows you to store scrapy items in database.
ScrapyCouchDB 0.2 8 Scrapy pipeline which allows you to store scrapy items in CouchDB database.
scrapyd 1.2.0 8 A service for running Scrapy spiders, with an HTTP API
scrapyd-client 1.1.0 8 A client for scrapyd
ScrapyGraphite 0.2 8 Output scrapy statistics to carbon/graphite.
ScrapyMongoDB 0.4.3 8 Scrapy pipeline which allows you to store scrapy items in MongoDB database.
scrapysolr 0.2.0 8 Scrapy pipeline which allows you to store scrapy items in a Solr server.
django-scrapy 0.1a1 7 APIs for using Scrapy in Django
scrapinghub-entrypoint-scrapy 0.10.1 7 Scrapy entrypoint for Scrapinghub job runner
scrapy-boilerplate 0.2.1 7 Small set of utilities to simplify writing Scrapy spiders.
scrapy-crawl-once 0.1.1 7 Scrapy middleware which allows crawling only new content
scrapy-crawlera 1.2.2 7 Crawlera middleware for Scrapy
scrapy-deltafetch 1.2.1 7 Scrapy middleware to ignore previously crawled pages
scrapy-eagle 0.0.37 7 Run Scrapy Distributed
scrapy-fake-useragent 1.0.2 7 Use a random User-Agent provided by fake-useragent every request
scrapy-feedexporter-azure-blob 0.2 7 Scrapy extension Feed Exporter Storage Backend to export items to an Azure blob container
scrapy-hcf 1.0.0 7 Scrapy spider middleware to use Scrapinghub's Hub Crawl Frontier as a backend for URLs
scrapy-html-storage 0.3.0 7 Scrapy downloader middleware that stores response HTML files to disk.
scrapy-inline-requests 0.3.1 7 A decorator for writing coroutine-like spider callbacks.
scrapy-jsonschema 0.1.0 7 Scrapy schema validation pipeline and Item builder using JSON Schema
scrapy-kafka 0.1.1 7 Kafka-based components for Scrapy
scrapy-magicfields 1.1.0 7 Scrapy middleware to add extra "magic" fields to items
scrapy-mongodb-queue 0.1.0 7 MongoDB-based components for Scrapy
scrapy-mosquitera 0.1.1 7 Restrict crawl and scraping scope using matchers.
scrapy-multifeedexporter 0.1.1 7 Export scraped items of different types to multiple feeds.
scrapy-proxy-rotator 0.1.0 7 Scrapy downloader middleware that rotates proxies.
scrapy-querycleaner 1.0.0 7 Scrapy spider middleware to clean up query parameters in request URLs
scrapy-rabbitmq 0.1.2 7 RabbitMQ Plug-in for Scrapy
scrapy-rabbitmq-link 0.2.0 7 RabbitMQ Plug-in for Scrapy
scrapy-rotating-proxies 0.3.1 7 Rotating proxies for Scrapy
scrapy-rss 0.1.2 7 RSS Tools for Scrapy Framework
scrapy-s3-cache 0.0.1 7 Use S3 as a cache backend in Scrapy projects.
scrapy-save-statistics 0.2 7 Scrapy Save Statistics: Save statistics extension for Scrapy
scrapy-sentry 0.7.0 7 Sentry component for Scrapy
scrapy-spiderdocs 0.1.2 7 Generate spiders md documentation based on spider docstrings.
scrapy-splash 0.7.2 7 JavaScript support for Scrapy using Splash
scrapy-splitvariants 1.1.0 7 Scrapy spider middleware to split an item into multiple items on a multi-valued key
scrapy-sqlitem 0.1.2 7 Scrapy extension to save items to a sql database
scrapy-statsd-middleware 0.0.8 7 Statsd integration middleware for scrapy
scrapy-status-mailer 0.3 7 Scrapy Status Mailer: Status mailer extension for Scrapy
scrapy-twostage 0.0.4 7 Two stage Scrapy spider: download and extract
scrapyd-heroku 0.1.0 7 A wrapper for running Scrapyd in Heroku or in console as normal Scrapyd service
scrapyd-subspider 0.0.2 7 scrapyd-subspider
scrapyd_kit 0.1.6 7 A kit for extending Scrapyd
scrapydo 0.2.2 7 Crochet-based blocking API for Scrapy.
scrapyjs 0.2 7 JavaScript support for Scrapy using Splash
scrapylib 1.7.1 7 Scrapy helper functions and processors
scrapymon 0.1.0 7 Simple management UI for scrapyd
scrapyrt 0.10 7 Put Scrapy spiders behind an HTTP API
scrapyrwiki 0.2 7 A collection of helpers for running Scrapy in ScraperWiki
scrapyz 0.3.3 7 Scrape Easy
avc-scrapy-helper 0.0.3 6 scrapy helper for scrapy.
elasticstats-scrapy 0.1.5 6 A scrapy extension to send crawl stats to elasticsearch index.
scrapy-amazon-robot-middleware3 0.2.1 6 Scrapy middleware module which uses image parsing to submit a captcha response to amazon.
scrapy-broadsoftxchange 1.17 6 Download documents and published software from Broadsoft Xchange
scrapy-corenlp 0.2.0 6 Scrapy spider middleware :: Stanford CoreNLP Named Entity Recognition
scrapy-dynamodb 0.2 6 AWS DynamoDB pipeline for Scrapy
scrapy-elves 0.1.0 6 utils for parse html
scrapy-proxies 0.3 6 Scrapy Proxies: random proxy middleware for Scrapy
scrapy-proxymesh 0.0.3 6 Proxymesh downloader middleware for Scrapy
scrapy-streamitem 0.1.0 6 Scrapy support for working with streamcorpus Stream Items
scrapy-tools 0.0.1 6 Tools for scrapy_tools, consisting of middlewares
ScrapyDot 0.1 6 Export a graph of link between crawled items in dot file format.
scrapymongocache 0.1.0 6 A base pipeline and a decorator to allow you to cache item fields in MongoDB collections.
xonsh-scrapy-tabcomplete 0.3 6 scrapy tabcomplete support for the Xonsh shell
cyberplant-Scrapy 1.2.0.dev2 5 A high-level Web Crawling and Web Scraping framework
frontoxy 1.0.3 5 Distributed URLs frontier for Scrapy with RabbitMQ
noscrapy 0.0.2 5 Python port attempt of web-scraper-chrome-extension.
s01.core 0.5.0 5 Scrapy worker core packages
s01.demo 0.16.2 5 Buildout based scrapy spider demo package for s01.worker
s01.worker 0.5.0 5 Scrapy worker based on buildout with JSON-RPC 2.0 API
scrapy-notifications 0.1.0 5 Send HTTP notifications on spider actions
scrapydd 0.4.21 5 UNKNOWN
scutils 1.2.0 5 Utilities for Scrapy Cluster
stickymeta 0.0.5 5 Handy tools to maintain persistent meta values between requests in Scrapy spiders
custom-manager 0.7 4 Proxy handling for Scrapy
scrapoxy 1.7 4 Use Scrapoxy with Scrapy
scrapy-cdr 0.3.0 4
scrapy-sqs-pipeline 0.1.1 4 Write scraped items to Amazon SQS.
scrapy-ui 0.1 4 Placeholder package
web-walker 3.1.4 4 You can crawl web pages with little configuration. Based on Scrapy.
arachnado 0.2 3 Scrapy-based Web Crawler with an UI
Arachne 0.5.0 3 API for Scrapy spiders
autologin-middleware 0.1.6 3 A Scrapy middleware to use with autologin
django-dynamic-scraper 0.11.6 3 Creating Scrapy scrapers via the Django admin interface
django-scraoy 0.1a1 3 APIs for using Scrapy in Django
habra-favorites 1.4.0 3 Sort your favorites posts from
hepcrawl 0.3.6 3 Scrapy project for feeds into INSPIRE-HEP (
hero-crawl 0.1.4 3 Helpers for Scrapy and Flask on Heroku
inspire-crawler 0.2.11 3 Crawler integration with INSPIRE-HEP.
pa11ycrawler 1.6.2 3 A Scrapy spider for a11y auditing Open edX installations.
portia2code 0.0.12 3 Convert portia spider definitions to python scrapy spiders
proxy-middleware 0.1.1 3 Scrapy http proxy middleware that gets proxy parameters from settings
pyspider 0.3.9 3 A Powerful Spider System in Python
retr 1.2 3 Parallelised proxy-switching engine for scraping. Based on requests or aiohttp (alpha). Includes scrapy middleware.
s01.client 0.5.0 3 JSON-RPC 2.0 s01.worker client
sitesearcher 0.1a2 3 A command line tool that creates fulltext search indexes of your favourite websites on your machine, and allows you to search them locally
weblocust 1.0.3 3 A more Powerful Spider System in Python based on pyspider
abu-quant 0.0.1 2 Quantitative stock trading
abupy 0.0.4 2 A powerful quantitative stock trading library
abuQuant 0.0.1 2 Quantitative stock trading
aduana 0.2.1 2 Bindings for Aduana library
afaq-dl 1.0 2 Download the online book An Anarchist FAQ
crappyspider 0.3 2 Test your site.
crawl-frontier 0.2.0.post0.dev29 2 A flexible frontier for web crawlers
digs 0.1.7 2 Makes text crawling tasks over websites with depth levels easier.
distributed-frontera 0.2.0 2 [deprecated] Distributed version of Frontera, flexible frontier for web crawlers
easyspider 0.1.1 2 UNKNOWN
fara_principals 0.0.7 2 A web scraper designed to collect Foreign Principal information from
frontera 0.7.1 2 A scalable frontier for web crawlers
ilmwetter 0.2 2 Scrapy spider for the weather in Ilmenau
incapsula-cracker 0.1.3 2 A way to bypass incapsula robot checks when using requests or scrapy.
page_clustering 0.0.1 2 Online k-means clustering of web pages
page_finder 0.1.9 2
parsel 1.1.0 2 Parsel is a library to extract data from HTML and XML using XPath and CSS selectors
pycameo 0.1.2a1 2 cameo scrapy project
queuelib 1.4.2 2 Collection of persistent (disk-based) queues
scrapely 0.13.3 2 A pure-python HTML screen-scraping library
scrapple 0.3.0 2 A framework for creating web content extractors
slybot 0.13.0b37 2 Slybot crawler
take 0.2.0 2 A DSL for extracting data from a web page.
w3lib 1.17.0 2 Library of web-related functions
wayback-machine-scraper 1.0.7 2 A command-line utility for scraping Wayback Machine snapshots from
ze 0.0.17.dev1 2 Scraper for large news portals in Brazil.
ze-the-scraper 0.0.17.dev1 2 Scraper for large news portals in Brazil.
aranha 0.1.1 1 simple gevent based web spider and tools
assignment 0.3 1 calculation of minimum price of given book
autologin 0.1.3 1 A utility for finding login links, forms and autologging into websites with a set of valid credentials.
autopager 0.2 1 Detect and classify pagination links on web pages
autoresponse 0.3.1 1 UNKNOWN
checklists 0.4.12 1 A reusable Django model for managing checklists of birds.
checklists_scrapers 0.2.3 1 Web scrapers for downloading checklists of birds from online databases such as eBird.
chinaAQI 0.1.2 1 The data scraped from
crawlmi 0.1.8 1 Highly optimized web scraping framework.
creepy 0.1.6 1 Dead simple web crawler for Python
Crwy 1.0.3 1 A Simple Web Crawling and Web Scraping framework
cssselect 1.0.1 1 cssselect parses CSS3 Selectors and translates them to XPath 1.0
datalad 0.5.1 1 data distribution geared toward scientific datasets
Django-Youtuber 0.1.1 1 Django app for fetching videos from channels or playlists.
DocDownloader 1.3.2 1 Downloads Documentation from ReadTheDocs in multiple formats
docxpy 0.8.4 1 A pure python-based utility to extract text, hyperlinks and images from docx files.
extruct 0.3.0a2 1 Extract embedded metadata from HTML markup
freebora 0.1.0 1
hsdata 0.2.16 1 Play Hearthstone with data! Quickly collect and analyze Hearthstone card and deck data.
hubstorage 0.23.6 1 Client interface for Scrapinghub HubStorage
icrawler 0.3.5 1 A mini framework of image crawlers
imagebot 1.2.1 1 A web bot to crawl websites and scrape images.
loginform 1.2.0 1 Fill HTML login forms automatically
main 0.1 1 calculation of minimum price of given book
MaybeDont 0.1.0 1 A component that tries to avoid downloading duplicate content
mdr 0.0.1 1 python library to detect and extract listing data from HTML page
minreq 0.1.0 1 Check required data in a request.
mp3_zing_downloader 1.3.3 1 Downloader to download music from
music_scraper 1.1.0 1 Gets Songs from the web and allows users to download the same
parsel-cli 0.2.0 1 Parsel Command Line Interface
persist-queue 0.2.1 1 A thread-safe disk based persistent queue in Python.
photopipe 0.1.0b4 1 PhotoPipe is a pipeline for automated reduction, photometry and astrometry of imaging data from RATIR and RIMAS.
pomp 0.2.1 1 Screen scraping and web crawling framework
pqueue 0.1.7 1 A single process, persistent multi-producer, multi-consumer queue.
ProxyYourSpider 1.0.2 1 Proxy your spider and crawl the galaxy.
PyNSXv 0.4.1 1 PyNSXv is a higher level python based library and CLI tool to control NSX for vSphere
pyscrap 0.0.9 1 micro framework for web scraping
pystock-crawler 0.8.2 1 Crawl and parse stock historical data
rbco.recipe.pyeclipse 0.0.5 1 Creates a Pydev project for Eclipse.
rio-client 0.3.0 1 Client for Rio.
scrapekit 0.2.1 1 Light-weight tools for web scraping
shub-image 0.2.5 1 Scrapinghub release tool
social 0.1.dev3 1 Social is a python package for social data analysis and exploitation
soft404 0.2.0 1 A classifier for detecting soft 404 pages
splash 2.3.2 1 A JavaScript renderer with an HTTP API
splinter_model 0.1.6 1 Splinter helper to create scrapers from models
stereo 0.1.2 1 Generate documents from CSV file in batch
structominer 0.2.0 1 Data scraping for a more civilized age
wallapopy 1.0.3 1 A Python client for Wallapop

*: occurrence of search term weighted by field (name, summary, keywords, description, author, maintainer)