A Free, Commonsense-Enriched Natural Language Understander for English

Project description

MontyLingua is a free*, commonsense-enriched, end-to-end natural language
understander for English. Feed raw English text into MontyLingua, and the output
will be a semantic interpretation of that text. It is suited to information
retrieval and extraction, request processing, and question answering. From
English sentences, it extracts subject/verb/object tuples; adjectives, noun
phrases, and verb phrases; and people's names, places, events, dates and times,
and other semantic information. MontyLingua makes traditionally difficult
language processing tasks trivial!

Version 2.1 is substantially faster, more accurate, and more reliable than
version 1.3.1. It has been tested on Windows, many flavors of UNIX, and
Mac OS X, and with several flavors of Java, and is in use by several university
research projects and in several commercial settings.

MontyLingua differs from other natural language processing tools because:

* it is a complete end-to-end system: raw text in, semantic interpretation out
* it is not many dated tools and implementations sewn together; it is one
  well-integrated implementation
* it does not require "training" and other fidgeting, and works
  out of the box
* it is enriched with "common sense" knowledge about the everyday world,
  allowing it to avoid many stupid interpretive mistakes, e.g.:
    o "(NX the/DT mosquito/NN bit/NN NX) (NX the/DT boy/NN NX)" ==corrected==>
    o "(NX the/DT mosquito/NN NX) (VX bit/VBD VX) (NX the/DT boy/NN NX)"
* it is lightweight and portable across platforms, written in portable
  Python and also available as a compiled Java library
* it is easy to customize through a user lexicon
MontyLingua performs the following tasks over text:

1. MontyTokenizer - Tokenizes raw English text (sensitive to abbreviations)
   and resolves contractions, e.g. "you're" ==> "you are"
2. MontyTagger - Part-of-speech tagging based on Brill94, enriched with
   common sense
3. MontyChunker - Lightning-fast regular expression chunker
4. MontyExtractor - Extracts phrases and subject/verb/object triplets from
   sentences
5. MontyLemmatiser - Strips inflectional morphology, i.e. changes verbs to
   infinitive form and nouns to singular form
6. MontyNLGenerator - Uses MontyLingua's concise predicate-argument
   representation to generate naturalistic English sentences and text summaries
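
Two of the stages above, contraction resolution and regular expression
chunking, can be illustrated with a toy sketch. This is an independent
illustration and does not use or reproduce MontyLingua's implementation: the
`CONTRACTIONS` table, the function names, and the chunk patterns are all
assumptions made for the example, and the part-of-speech tags are supplied by
hand rather than produced by a tagger.

```python
import re

# Tiny, hypothetical contraction table; a real tokenizer needs many more entries.
CONTRACTIONS = {"you're": "you are", "don't": "do not", "we'll": "we will"}


def resolve_contractions(text):
    """Expand known contractions; unknown words are left untouched."""
    def expand(match):
        word = match.group(0)
        return CONTRACTIONS.get(word.lower(), word)
    return re.sub(r"[A-Za-z]+'[A-Za-z]+", expand, text)


# Assumed patterns over "word/TAG " tokens:
#   NX = optional determiner, any adjectives, one or more nouns
#   VX = one or more verb forms
CHUNK_RE = re.compile(
    r"(?P<NX>(?<!\S)(?:\S+/DT )?(?:\S+/JJ )*(?:\S+/NNS? )+)"
    r"|(?P<VX>(?<!\S)(?:\S+/VB[DGNPZ]? )+)"
)


def chunk(tagged):
    """Wrap noun and verb groups in (NX ... NX) / (VX ... VX) brackets."""
    def wrap(match):
        label = match.lastgroup  # name of the alternative that matched
        return "(%s %s%s) " % (label, match.group(0), label)
    # A trailing space lets every token, including the last, end in " ".
    return CHUNK_RE.sub(wrap, tagged + " ").strip()


print(resolve_contractions("you're late"))
print(chunk("the/DT mosquito/NN bit/VBD the/DT boy/NN"))
print(chunk("the/DT mosquito/NN bit/NN the/DT boy/NN"))
```

The last call shows how the chunker inherits tagging mistakes: with "bit"
tagged as a noun, it is absorbed into the first noun chunk, which is the kind
of error the tagger's commonsense enrichment is described as preventing.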
