pdfminer 20090517
PDF parser and analyzer written entirely in Python.
Latest Version: 20091024
PDFMiner is a suite of programs that help extract and analyze text data from PDF documents. Unlike other PDF-related tools, it can obtain the exact location of text on a page, as well as extra information such as fonts or ruled lines. It includes a PDF converter that can transform PDF files into other text formats (such as HTML). It also has an extensible PDF parser that can be used for purposes other than text analysis.
- Author: Yusuke Shinyama <yusuke at cs nyu edu>
- Home Page: http://www.unixuser.org/~euske/python/pdfminer/index.html
- Download URL: http://www.unixuser.org/~euske/python/pdfminer/pdfminer-dist-20090517.tar.gz
- Keywords: pdf, html, text, extraction, conversion, data mining
- License: MIT/X
Categories
- Development Status :: 4 - Beta
- Environment :: Console
- Intended Audience :: Developers
- Intended Audience :: Science/Research
- License :: OSI Approved :: MIT License
- Natural Language :: English
- Topic :: Scientific/Engineering :: Information Analysis
- Topic :: Software Development :: Libraries :: Python Modules
- Topic :: Text Processing :: Markup
- Topic :: Utilities
- Package Index Owner: euske
- DOAP record: pdfminer-20090517.xml
