
A toolkit for extracting chemical information from the scientific literature.



ChemDataExtractor is a toolkit for extracting chemical information from the scientific literature.

Features

  • HTML, XML and PDF document readers

  • Chemistry-aware natural language processing pipeline

  • Chemical named entity recognition

  • Rule-based parsing grammars for property and spectra extraction

  • Table parser for extracting tabulated data

  • Document processing to resolve data interdependencies
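To give a rough feel for what rule-based property extraction means, here is a minimal sketch that uses only Python's standard `re` module. It is not ChemDataExtractor's grammar or API: the rule, the `extract_melting_points` helper and the sample sentences are all hypothetical, and a real parser would handle far more phrasings, units and chemical name forms.

```python
import re

# Hypothetical toy rule: "<Name> has/shows/exhibits a melting point of <number> °C".
# This is NOT ChemDataExtractor's grammar, only an illustration of the idea.
MELTING_POINT_RULE = re.compile(
    r"(?P<name>[A-Z][A-Za-z0-9\-]+)\s+(?:has|shows|exhibits)\s+a\s+melting\s+point\s+of\s+"
    r"(?P<value>\d+(?:\.\d+)?)\s*°\s*(?P<unit>[CF])"
)


def extract_melting_points(text):
    """Return one record for every span of text that matches the toy rule."""
    records = []
    for match in MELTING_POINT_RULE.finditer(text):
        records.append({
            "name": match.group("name"),
            "melting_point": float(match.group("value")),
            "unit": "°" + match.group("unit"),
        })
    return records


sample = (
    "Benzophenone has a melting point of 48.5 °C. "
    "Naphthalene shows a melting point of 80 °C."
)

for record in extract_melting_points(sample):
    print(record)
```

A single regular expression like this breaks down quickly on real papers, which is why the feature list pairs parsing grammars with chemical named entity recognition and document-level processing.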

Installation

To install ChemDataExtractor, simply run:

pip install chemdataextractor

Or if you are an Anaconda user, run:

conda install -c chemdataextractor chemdataextractor

Alternatively, try one of the other installation options.

Documentation

Full documentation is available at http://chemdataextractor.org/docs

License

ChemDataExtractor is licensed under the MIT license, a permissive, business-friendly license for open source software.
