
crawler 0.1.0

Python crawler.

Latest Version: 0.1.2

python crawler.
=====
## Example
=====

from crawler.crawler import Crawler

mycrawler = Crawler()
seeds = ['http://www.example.com/']  # list of seed URLs
mycrawler.add_seeds(seeds)
# list of regular expressions for the URLs the crawler will work on
# (raw string, so the backslash in \. reaches the regex engine unchanged)
url_patterns = [r'^(.+example\.com)(.+)$']

mycrawler.start(url_patterns)  # start crawling
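
To see which URLs the pattern in the example above would accept, it can be tested on its own with Python's standard `re` module. This is only a sketch: the helper `is_allowed` is hypothetical and not part of the crawler package, and it assumes the crawler matches each pattern from the start of the URL, which the package description does not confirm.

```python
import re

# Same pattern as in the example above.
url_patterns = [r'^(.+example\.com)(.+)$']

def is_allowed(url, patterns):
    """Hypothetical helper: True if the URL matches at least one pattern."""
    return any(re.match(p, url) for p in patterns)

candidates = [
    'http://www.example.com/about',   # matches
    'http://www.example.com/',        # matches: the second group is '/'
    'http://www.example.org/about',   # no match: different domain
    'http://www.example.com',         # no match: nothing after 'example.com'
]
for url in candidates:
    print(url, is_allowed(url, url_patterns))
```

Note that the second group `(.+)` requires at least one character after `example.com`, so under this assumption a URL without a trailing slash or path would not match.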

#################
data files
#################
Three Berkeley DB database files will be generated:
queue.db
webpage.db
duplcheck.db
 
File                     Type                  Py Version  Uploaded on  Size  # downloads
crawler-0.1.0.tar.gz     Source                -           2011-01-17   4KB   420
crawler-0.1.0.win32.exe  MS Windows installer  2.7         2011-01-17   65KB  246