
scrapelib 1.0.0

a library for scraping things

scrapelib is a library for making requests to less-than-reliable websites; as of version 0.7 it is implemented as a wrapper around requests.

scrapelib originated as part of the Open States project, which scrapes the websites of all 50 state legislatures, and was therefore designed with features for dealing with sites that have intermittent errors or require rate-limiting.

Advantages of using scrapelib over alternatives like httplib2 or simply using requests as-is:

  • All of the power of the superb requests library.
  • HTTP, HTTPS, and FTP requests via an identical API.
  • Support for simple caching with pluggable cache backends.
  • Request throttling.
  • Configurable retries for non-permanent site failures.

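The throttling and retry features above can be sketched in plain Python. The class below is a simplified illustration: the parameter names requests_per_minute, retry_attempts, and retry_wait_seconds mirror scrapelib's options, but the logic is an assumed standalone sketch, not scrapelib's actual implementation.

```python
import time


class ReliableFetcher:
    """Sketch of scrapelib-style throttling and retries (assumed
    behavior; scrapelib itself wraps requests and handles this
    internally)."""

    def __init__(self, fetch, requests_per_minute=0, retry_attempts=0,
                 retry_wait_seconds=0.0):
        self._fetch = fetch  # any callable taking a URL, e.g. requests.get
        # 0 requests_per_minute disables throttling entirely
        self._interval = 60.0 / requests_per_minute if requests_per_minute else 0.0
        self._retries = retry_attempts
        self._wait = retry_wait_seconds
        self._last_request = 0.0

    def get(self, url):
        for attempt in range(self._retries + 1):
            # throttle: space requests at least `interval` seconds apart
            delay = self._interval - (time.time() - self._last_request)
            if delay > 0:
                time.sleep(delay)
            self._last_request = time.time()
            try:
                return self._fetch(url)
            except IOError:
                if attempt == self._retries:
                    raise  # retries exhausted; surface the failure
                # exponential backoff before the next attempt
                time.sleep(self._wait * (2 ** attempt))
```

With requests_per_minute=10, consecutive get() calls would be spaced six seconds apart; a transient IOError is retried up to retry_attempts times before being re-raised.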
scrapelib is a project of Sunlight Labs, released under a BSD-style license; see LICENSE for details.

Written by James Turk <>

Contributors:
  • Michael Stephens - initial urllib2/httplib2 version
  • Joe Germuska - fix for IPython embedding
  • Alex Chiang - fix to test suite

Requirements:

  • Python 2.7, 3.3, or 3.4
  • requests >= 1.0

Installation:

scrapelib is available on PyPI and can be installed via pip install scrapelib.

PyPI package:

File                                        Type          Py Version  Uploaded on  Size
scrapelib-1.0.0-py2.py3-none-any.whl (md5)  Python Wheel  2.7         2015-03-20   15KB
scrapelib-1.0.0.tar.gz (md5)                Source                    2015-03-20   13KB

Example Usage

import scrapelib
s = scrapelib.Scraper(requests_per_minute=10)

# Grab Google front page
print(s.get('http://google.com').text)

# Will be throttled to 10 HTTP requests per minute
while True:
    print(s.get('http://example.com').text)