spider.py 0.5
Multithreaded crawling, reporting, and mirroring for Web and FTP
This module provides multithreaded crawling, reporting, and mirroring for Web and FTP in one convenient library. The crawling depth, the maximum number of URLs to crawl, and the maximum number of threads are user-configurable. Reports can be generated on external URLs, internal redirects to outside URLs, unparsable HTML, non-HTTP/FTP URLs, and broken links.
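The limits described above (depth, URL count, thread count) can be illustrated with a minimal sketch. This is not spider.py's own API, which this page does not document. The names `classify`, `crawl`, `get_links`, `max_depth`, `max_urls`, and `max_threads` are hypothetical, and the sketch uses only the Python 3 standard library. It assumes the caller supplies a function that returns the links found on a page, so it makes no network requests itself.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urljoin, urlsplit


def classify(base_url, link):
    """Label a link 'internal', 'external', or 'other' (non-HTTP/FTP)."""
    target = urlsplit(urljoin(base_url, link))
    if target.scheme not in ("http", "https", "ftp"):
        return "other"
    same_host = target.netloc == urlsplit(base_url).netloc
    return "internal" if same_host else "external"


def crawl(start_url, get_links, max_depth=2, max_urls=100, max_threads=4):
    """Breadth-first crawl bounded by depth, URL count, and worker threads.

    get_links(url) is a caller-supplied (hypothetical) function that
    returns the list of links found on the page at url.
    Returns the internal URLs discovered and a small report dictionary.
    """
    seen = [start_url]
    report = {"external": [], "other": []}
    frontier = [start_url]
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        for _ in range(max_depth):
            next_frontier = []
            # Fetch all pages at this depth in parallel; map() keeps order.
            for page, links in zip(frontier, pool.map(get_links, frontier)):
                for link in links:
                    kind = classify(page, link)
                    url = urljoin(page, link)
                    if kind != "internal":
                        # Record each external or non-HTTP/FTP URL once.
                        if url not in report[kind]:
                            report[kind].append(url)
                    elif url not in seen and len(seen) < max_urls:
                        seen.append(url)
                        next_frontier.append(url)
            frontier = next_frontier
            if not frontier:
                break
    return seen, report
```

In this sketch, pages discovered at the final depth are listed but not fetched, and the URL cap counts the start URL. A real crawler would also need error handling for broken links and unparsable HTML, which is omitted here.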
- Author: L. C. Rees <xanimal at users sf net>
- Maintainer: L. C. Rees <xanimal at users sf net>
- Home Page: http://psilib.sf.net/
- Download URL: http://prdownloads.sourceforge.net/psilib/spider.py
- Keywords: spider, robot, crawler, ftp crawler, ftp robot, ftp spider, web crawler, web robot, web spider, web-bot, link checker, bad link finder, site management, web reporting
- License: BSD
- Platform: Independent
Categories
- Development Status :: 4 - Beta
- License :: OSI Approved :: BSD License
- Operating System :: OS Independent
- Programming Language :: Python
- Topic :: Internet :: File Transfer Protocol (FTP)
- Topic :: Internet :: WWW/HTTP :: Indexing/Search
- Topic :: Internet :: WWW/HTTP :: Site Management
- Topic :: Internet :: WWW/HTTP :: Site Management :: Link Checking
- Topic :: System :: Archiving :: Mirroring
- Package Index Owner: lrees
- DOAP record: spider.py-0.5.xml
