pdfminer 20090330
PDF parser and analyzer written entirely in Python.
Latest Version: 20110515
PDFMiner is a suite of programs for extracting and analyzing text data from PDF documents. Unlike other PDF-related tools, it reports the exact location of text on a page, along with other layout information such as font size and font name, which can be useful for analyzing a document. It can also be used as the basis for a full-fledged PDF interpreter.
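As a rough illustration of why text position and font information matter for document analysis, the sketch below works on a hypothetical `TextItem` record. This record, the helper functions `reading_order` and `find_headings`, and the sample data are all assumptions made up for this example. They are not PDFMiner's API, and a real program would fill such records from the library's own output.

```python
from collections import namedtuple

# Hypothetical record for one text fragment; NOT PDFMiner's own API.
# bbox is (x0, y0, x1, y1) in PDF points, origin at the bottom-left of the page.
TextItem = namedtuple("TextItem", ["text", "bbox", "font_name", "font_size"])


def reading_order(items):
    """Sort fragments top-to-bottom, then left-to-right."""
    # PDF y-coordinates grow upward, so a larger y1 means higher on the page.
    return sorted(items, key=lambda it: (-it.bbox[3], it.bbox[0]))


def find_headings(items, ratio=1.2):
    """Return fragments whose font size is at least `ratio` times the median size."""
    if not items:
        return []
    sizes = sorted(it.font_size for it in items)
    median = sizes[len(sizes) // 2]
    return [it for it in items if it.font_size >= median * ratio]


# Made-up sample data for illustration only.
page = [
    TextItem("Body text, first paragraph.", (72, 650, 400, 662), "Times-Roman", 10.0),
    TextItem("Introduction", (72, 700, 200, 718), "Helvetica-Bold", 16.0),
    TextItem("Body text, second paragraph.", (72, 630, 410, 642), "Times-Roman", 10.0),
]

for item in reading_order(page):
    print(item.font_size, item.text)
print([it.text for it in find_headings(page)])
```

With plain text extraction alone, neither the reading order nor the heading could be recovered; both depend on the coordinates and font size attached to each fragment.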
- Author: Yusuke Shinyama
- Home Page: http://www.unixuser.org/~euske/python/pdfminer/index.html
- Keywords: pdf, html, text, extraction, conversion, data mining
- License: MIT/X
Categories
- Development Status :: 4 - Beta
- Environment :: Console
- Intended Audience :: Developers
- Intended Audience :: Science/Research
- License :: OSI Approved :: MIT License
- Natural Language :: English
- Topic :: Scientific/Engineering :: Information Analysis
- Topic :: Software Development :: Libraries :: Python Modules
- Topic :: Text Processing :: Markup
- Topic :: Utilities
- Package Index Owner: euske
- DOAP record: pdfminer-20090330.xml
