
grab 0.4.12

Site Scraping Framework


Latest Version: 0.6.38

Grab is a Python site scraping framework. It provides a powerful interface to two libraries: lxml and pycurl. There are two ways to use Grab:

1) Use Grab to configure network requests and to process fetched documents. In this mode you control the flow of your program manually.
2) Use Grab::Spider to build asynchronous site scrapers. This is the same approach Scrapy takes.

Example of Grab usage:

from grab import Grab

g = Grab()
g.set_input('login', 'lorien')
g.set_input('password', '***')
for elem in g.doc.select('//ul[@id="repo_listing"]/li/a'):
    print('%s: %s' % (elem.text(), elem.attr('href')))
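
To make the XPath expression in the loop above more concrete, here is a minimal sketch that applies the same expression to an inline page fragment. It uses the standard library's xml.etree.ElementTree as a stand-in, not Grab or lxml, so it runs without any network access. The PAGE markup, the link targets, and the list_links helper are all invented for illustration.

```python
# Stand-in illustration with the standard library; this is NOT Grab's API.
import xml.etree.ElementTree as ET

# Hypothetical page fragment, invented for this example.
PAGE = """
<html><body>
  <ul id="repo_listing">
    <li><a href="/lorien/grab">grab</a></li>
    <li><a href="/lorien/other">other</a></li>
  </ul>
  <ul id="unrelated"><li><a href="/nope">nope</a></li></ul>
</body></html>
"""


def list_links(markup):
    """Return (text, href) pairs for links inside the ul with id="repo_listing"."""
    root = ET.fromstring(markup.strip())
    # ElementTree paths are relative to the root element, hence the leading ".".
    links = root.findall('.//ul[@id="repo_listing"]/li/a')
    return [(a.text, a.get('href')) for a in links]


for text, href in list_links(PAGE):
    print('%s: %s' % (text, href))
```

Only the links under the list whose id is "repo_listing" are returned; the link in the second list is skipped because the attribute predicate does not match.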

Example of Grab::Spider usage:

from grab.spider import Spider, Task
import logging

class ExampleSpider(Spider):
    def task_generator(self):
        for lang in ('python', 'ruby', 'perl'):
            url = '' % lang
            yield Task('search', url=url)

    def task_search(self, grab, task):

bot = ExampleSpider()
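
In the example above, a task created as Task('search', ...) is paired with a handler method named task_search. The sketch below models that name-based dispatch in plain Python. It is a toy under stated assumptions, not Grab's implementation: ToyTask, ToySpider, and DemoSpider are hypothetical classes, the URLs are placeholders, nothing is fetched, and tasks are processed one after another rather than asynchronously.

```python
# Toy model of name-based task dispatch; NOT Grab's actual implementation.
from collections import deque


class ToyTask(object):
    """Hypothetical stand-in for a task: a handler name plus a URL."""

    def __init__(self, name, url):
        self.name = name
        self.url = url


class ToySpider(object):
    """Hypothetical base class that routes each task to task_<name>."""

    def task_generator(self):
        return iter(())

    def run(self):
        queue = deque(self.task_generator())
        while queue:
            task = queue.popleft()
            # Task('search', ...) is handled by the method task_search.
            handler = getattr(self, 'task_%s' % task.name)
            # A real spider would download task.url first; this sketch does not.
            handler(task)


class DemoSpider(ToySpider):
    def __init__(self):
        self.seen = []

    def task_generator(self):
        for lang in ('python', 'ruby', 'perl'):
            # Placeholder URL, invented for this example.
            yield ToyTask('search', url='http://example.com/search?q=%s' % lang)

    def task_search(self, task):
        self.seen.append(task.url)


bot = DemoSpider()
bot.run()
print(bot.seen)
```

The point of the design is that adding a new kind of task only requires a new task_<name> method; the run loop itself does not change.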


Pip is the recommended way to install Grab and its dependencies:

$ pip install lxml
$ pip install pycurl
$ pip install grab
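
After running the commands above, one way to confirm that the three packages are importable is a short check like the following. This is a sketch that assumes Python 3 (it relies on importlib.util); the missing_packages helper is a hypothetical name, not part of Grab.

```python
# Sketch: report which of the packages cannot be found on the import path.
import importlib.util


def missing_packages(names=('lxml', 'pycurl', 'grab')):
    """Return the subset of names that Python cannot locate for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print(missing_packages())
```

An empty list means all three packages were found; any name that appears in the list still needs to be installed.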


Russian docs: English docs in progress.

Discussion group (Russian or English):


If you find a bug or want a new feature, please create a new issue on GitHub:

File                      Type    Py Version  Uploaded on  Size
grab-0.4.12.tar.gz (md5)  Source              2013-07-25   138KB