HtmlList 2.2.2
Extract information from HTML pages that have some kind of repetitive pattern
This package finds repetitive format patterns in an HTML page that contains one or more lists, and extracts the sub-HTML text that creates those patterns. The idea is that a typical HTML data page containing a list of items shows a pattern that repeats visibly to the human eye (the page format). Such a pattern can be recognized automatically, and the data in the list can then be extracted.
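To make the idea concrete, here is a minimal sketch that uses only the Python standard library. It is not HtmlList's API or its algorithm; the names `RepeatFinder`, `extract_repeated`, and `SAMPLE_PAGE` are hypothetical, and the heuristic (group text by the path of enclosing tags, then keep the largest group) is a simplifying assumption chosen for illustration.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical sample page: one list of three items plus unrelated text.
SAMPLE_PAGE = (
    "<html><body><h1>Shop</h1>"
    "<ul><li><b>Apple</b></li><li><b>Pear</b></li><li><b>Plum</b></li></ul>"
    "<p>Footer</p></body></html>"
)


class RepeatFinder(HTMLParser):
    """Group text chunks by the path of tags that encloses them."""

    def __init__(self):
        super().__init__()
        self.path = []                    # open tags, from the root down
        self.groups = defaultdict(list)   # "html/body/ul/li/b" -> [text, ...]

    def handle_starttag(self, tag, attrs):
        self.path.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, so unclosed tags are tolerated.
        if tag in self.path:
            while self.path.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and self.path:
            self.groups["/".join(self.path)].append(text)


def extract_repeated(html):
    """Return the text chunks that share the most frequently repeated tag path."""
    finder = RepeatFinder()
    finder.feed(html)
    finder.close()
    if not finder.groups:
        return []
    return max(finder.groups.values(), key=len)


if __name__ == "__main__":
    print(extract_repeated(SAMPLE_PAGE))  # ['Apple', 'Pear', 'Plum']
```

The tag path stands in for the "page format": items rendered the same way share the same path, so the path that occurs most often is taken to be the list. Real pages are noisier than this, which is why a dedicated package is useful.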
| File | Type | Py Version | Uploaded on | Size | # downloads |
|---|---|---|---|---|---|
| HtmlList-2.2.2-py2.6.egg (md5) | Python Egg | 2.6 | 2010-10-16 | 445KB | 554 |
| HtmlList-2.2.2.tar.gz (md5) | Source | | 2010-10-16 | 351KB | 384 |
| HtmlList-2.2.2.zip (md5) | Source | | 2010-10-16 | 383KB | 350 |
- Author: Erez Bibi
- Home Page: http://pyhtmllist.sourceforge.net/
- Keywords: HTML list information extraction repetitive pattern
- License: GPL
- Package Index Owner: erezbibi
- DOAP record: HtmlList-2.2.2.xml
