textblob-de

German language support for TextBlob.

German language support for TextBlob by Steven Loria.

This Python package is being developed as a TextBlob Language Extension. See Extension Guidelines for details.

Features

All directly accessible textblob_de classes (e.g. Sentence() or Word()) are now initialized with default models for German
Properties or methods that do not yet work for German now raise a NotImplementedError
German sentence boundary detection and tokenization (NLTKPunktTokenizer)
Consistent use of specified tokenizer for all tools (NLTKPunktTokenizer or PatternTokenizer)
Part-of-speech tagging (PatternTagger) with keyword include_punc=True (defaults to False)
Parsing (PatternParser) with keyword lemmata=True (defaults to False)
Noun Phrase Extraction (PatternParserNPExtractor)
Lemmatization (PatternParserLemmatizer)
Polarity detection (PatternAnalyzer) - Still EXPERIMENTAL, does not yet have information on subjectivity
Supports Python 2 and 3
See working features overview for details
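
Since properties and methods that do not yet work for German raise a NotImplementedError, calling code may want to fall back to a default instead of crashing. The following is a minimal sketch of that pattern, not part of textblob-de: `call_or_default` and `_FakeBlob` are hypothetical names used only for illustration, and the sketch makes no claim about which members are actually unsupported.

```python
def call_or_default(func, default=None):
    """Call func() and return its result, or default on NotImplementedError."""
    try:
        return func()
    except NotImplementedError:
        return default


class _FakeBlob(object):
    """Hypothetical stand-in for a blob object; not a textblob-de class."""

    def supported(self):
        return ["Das", "Auto"]

    def unsupported(self):
        # Mimics a feature that is not yet available for German.
        raise NotImplementedError("not yet available for German")


blob = _FakeBlob()
print(call_or_default(blob.supported, default=[]))
print(call_or_default(blob.unsupported, default=[]))
```

For a property rather than a method, the access itself raises, so it would need to be wrapped, e.g. `call_or_default(lambda: blob.some_property)` (again a hypothetical attribute name).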

Installing/Upgrading

$ pip install -U textblob-de
$ python -m textblob.download_corpora

Or the latest development release (apparently this does not always work on Windows; see issues #1744/5 for details):

$ pip install -U git+https://github.com/markuskiller/textblob-de.git@dev
$ python -m textblob.download_corpora

Usage

>>> from textblob_de import TextBlobDE as TextBlob
>>> text = '''Heute ist der 3. Mai 2014 und Dr. Meier feiert seinen 43. Geburtstag.
Ich muss unbedingt daran denken, Mehl, usw. für einen Kuchen einzukaufen. Aber leider
habe ich nur noch EUR 18.50 in meiner Brieftasche.'''
>>> blob = TextBlob(text)
>>> blob.sentences
[Sentence("Heute ist der 3. Mai 2014 und Dr. Meier feiert seinen 43. Geburtstag."),
 Sentence("Ich muss unbedingt daran denken, Mehl, usw. für einen Kuchen einzukaufen."),
 Sentence("Aber leider habe ich nur noch EUR 18.50 in meiner Brieftasche.")]
>>> blob.tokens
WordList(['Heute', 'ist', 'der', '3.', 'Mai', ...])
>>> blob.tags
[('Heute', 'RB'), ('ist', 'VB'), ('der', 'DT'), ('3.', 'LS'), ('Mai', 'NN'),
('2014', 'CD'), ...]
# not perfect, but a start (relies heavily on parser accuracy)
>>> blob.noun_phrases
WordList(['Mai 2014', 'Dr. Meier', 'seinen 43. Geburtstag', 'Kuchen einzukaufen',
'meiner Brieftasche'])

>>> blob = TextBlob("Das Auto ist sehr schön.")
>>> blob.parse()
'Das/DT/B-NP/O Auto/NN/I-NP/O ist/VB/B-VP/O sehr/RB/B-ADJP/O schön/JJ/I-ADJP/O'
>>> from textblob_de import PatternParser
>>> blob = TextBlob("Das Auto ist sehr schön.", parser=PatternParser(lemmata=True))
>>> blob.parse()
'Das/DT/B-NP/O/das Auto/NN/I-NP/O/auto ist/VB/B-VP/O/sein sehr/RB/B-ADJP/O/sehr schön/JJ/I-ADJP/O/schön ././O/O/.'
>>> from textblob_de import PatternTagger
>>> blob = TextBlob("Das Auto ist sehr schön.", pos_tagger=PatternTagger(include_punc=True))
>>> blob.tags
[('Das', 'DT'), ('Auto', 'NN'), ('ist', 'VB'), ('sehr', 'RB'), ('schön', 'JJ'), ('.', '.')]
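
The string returned by `parse()` packs several annotations into each token, separated by slashes. The sketch below splits such a string into named fields. It assumes the field order is word, part-of-speech tag, chunk tag, PNP tag and (only with `lemmata=True`) lemma, as the sample output suggests; `Token` and `split_tagged` are hypothetical helper names, not part of textblob-de, and words that themselves contain a slash are not handled.

```python
# -*- coding: utf-8 -*-
from collections import namedtuple

# Field names are an assumption based on the sample parser output.
Token = namedtuple("Token", ["word", "pos", "chunk", "pnp", "lemma"])


def split_tagged(tagged):
    """Split a slash-delimited parser string into a list of Token tuples."""
    tokens = []
    for item in tagged.split():
        fields = item.split("/")
        if len(fields) == 4:
            # No lemma present (parser was run without lemmata=True).
            fields.append(None)
        if len(fields) != 5:
            raise ValueError("unexpected token format: %r" % item)
        tokens.append(Token(*fields))
    return tokens


plain = "Das/DT/B-NP/O Auto/NN/I-NP/O ist/VB/B-VP/O sehr/RB/B-ADJP/O schön/JJ/I-ADJP/O"
tokens = split_tagged(plain)
print([(t.word, t.pos) for t in tokens])
```

Working with named fields instead of raw strings makes it easier to, for example, collect all tokens whose chunk tag ends in `NP`.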

>>> blob = TextBlob("Das Auto ist sehr schön.")
>>> blob.sentiment
(1.0, 0.0)
>>> blob = TextBlob("Das ist ein hässliches Auto.")
>>> blob.sentiment
(-1.0, 0.0)

>>> blob.words.lemmatize()
WordList(['das', 'sein', 'ein', 'hässlich', 'Auto'])
>>> from textblob_de.lemmatizers import PatternParserLemmatizer
>>> _lemmatizer = PatternParserLemmatizer()
>>> _lemmatizer.lemmatize("Das ist ein hässliches Auto.")
[('das', 'DT'), ('sein', 'VB'), ('ein', 'DT'), ('hässlich', 'JJ'), ('Auto', 'NN')]
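
The `sentiment` values shown above are (polarity, subjectivity) pairs in which the second value is always 0.0, because no subjectivity information is available yet. A small sketch of turning the polarity into a coarse label follows; `polarity_label` is a hypothetical helper and the threshold of 0.1 is an arbitrary assumption, not a value recommended by textblob-de.

```python
def polarity_label(sentiment, threshold=0.1):
    """Map a (polarity, subjectivity) pair to 'positive', 'negative' or 'neutral'.

    Only the polarity (first element) is used; the threshold is an
    illustrative assumption.
    """
    polarity = sentiment[0]
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


# Pairs copied from the usage examples above.
print(polarity_label((1.0, 0.0)))
print(polarity_label((-1.0, 0.0)))
```

In real use the argument would be `blob.sentiment`. Given that polarity detection is still experimental, such labels should be treated as rough hints.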

Requirements

Python >= 2.6 or >= 3.3

TODO

TextBlob Extension: textblob-cmd (command-line wrapper for TextBlob, basically TextBlob for files)
TextBlob Extension: textblob-rftagger (wrapper class for RFTagger)
TextBlob Extension: textblob-stanfordparser (wrapper class for StanfordParser via NLTK)
TextBlob Extension: textblob-berkeleyparser (wrapper class for BerkeleyParser)
TextBlob Extension: textblob-sent-align (sentence alignment for parallel TextBlobs)
TextBlob Extension: textblob-converters (various input and output conversions)
Additional PoS tagging options, e.g. NLTK tagging (NLTKTagger)
Improve noun phrase extraction (e.g. based on RFTagger output)
Improve sentiment analysis (find suitable subjectivity scores)
Improve functionality of Sentence() and Word() objects
Adapt more tests from textblob main package (esp. for TextBlobDE() in test_blob.py)

License

MIT licensed. See the bundled LICENSE file for more details.

Changelog

0.2.3 (26/07/2014)

Lemmatizer: PatternParserLemmatizer() extracts lemmas from Parser output
Improved polarity analysis through look-up of lemmatised word forms

0.2.2 (22/07/2014)

Option: Include punctuation in tags/pos_tags properties (b = TextBlobDE(text, pos_tagger=PatternTagger(include_punc=True)))
Added BlobberDE() class initialized with German models
TextBlobDE(), Sentence(), WordList() and Word() classes are now all initialized with German models
Restored complete API compatibility with textblob.tokenizers module of textblob main package

0.2.1 (20/07/2014)

Noun Phrase Extraction: PatternParserNPExtractor() extracts NPs from Parser output
Refactored the way TextBlobDE() passes on arguments and keyword arguments to individual tools
Backwards-incompatible: Deprecate parser_show_lemmata=True keyword in TextBlob(). Use parser=PatternParser(lemmata=True) instead.

0.2.0 (18/07/2014)

Vastly improved tokenization (NLTKPunktTokenizer and PatternTokenizer with tests)
Consistent use of specified tokenizer for all tools
TextBlobDE with initialized default models for German
Parsing (PatternParser) plus test_parsers.py
EXPERIMENTAL implementation of Polarity detection (PatternAnalyzer)
First attempt at extracting German Polarity clues into de-sentiment.xml
tox tests passing for py26, py27, py33 and py34

0.1.3 (09/07/2014)

First release on PyPI

0.1.0 - 0.1.2 (09/07/2014)

First release on GitHub
A number of experimental releases for testing purposes
Adapted version badges, tests & travis-ci config
Code adapted from sample extension textblob-fr
Language specific linguistic resources copied from pattern-de


Release history

0.4.3 (Jan 3, 2019)
0.4.2 (May 2, 2015)
0.4.1 (Oct 3, 2014)
0.4.0 (Sep 17, 2014)
0.3.1 (Aug 29, 2014)
0.3.0 (Aug 14, 2014)
0.2.9 (Aug 14, 2014)
0.2.8 (Aug 14, 2014)
0.2.7 (Aug 13, 2014)
0.2.6 (Aug 4, 2014)
0.2.5 (Aug 4, 2014)
0.2.4 (Aug 4, 2014)
0.2.3 (Jul 26, 2014), this version
0.2.2 (Jul 22, 2014)
0.2.1 (Jul 20, 2014)
0.2.0 (Jul 18, 2014)
0.1.3 (Jul 9, 2014)

Download files

Download the file for your platform.

Source Distribution

textblob-de-0.2.3.tar.gz (478.7 kB), uploaded Jul 26, 2014

Algorithm	Hash digest
SHA256	`d1c54cc2dcfee460dbb76898c809922bc9176b5299442243517ed60cbad89614`
MD5	`ff88f07553c6218aaf6af14ebd09ab5c`
BLAKE2b-256	`3eb5892a84d2a80bdb8abaae05d7ce580d9384069c3fd4dd69ff8b8010a1db34`

Built Distribution

textblob_de-0.2.3-py2.py3-none-any.whl (484.3 kB), uploaded Jul 26, 2014, Python 2 and Python 3

Algorithm	Hash digest
SHA256	`ee42d9fd71e78cb122ac89a4c2c1d35795c9920b6bf3a61b65d39211f66fa372`
MD5	`3b8534a49c6eb47884fd72c6ff9d1913`
BLAKE2b-256	`cca4ff8d76501b25ba0c5754a681148625f22ecd9de841547191accee13d6a56`
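
A downloaded file can be checked against the SHA256 digests listed above before installing it. The sketch below uses only the standard library; `sha256_of` and `matches` are hypothetical helper names, and the local file path is whatever the download was saved as.

```python
import hashlib

# SHA256 digest of textblob-de-0.2.3.tar.gz as listed above.
EXPECTED_SDIST_SHA256 = (
    "d1c54cc2dcfee460dbb76898c809922bc9176b5299442243517ed60cbad89614"
)


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected=EXPECTED_SDIST_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of(path) == expected.lower()
```

For example, `matches("textblob-de-0.2.3.tar.gz")` should return True for an intact copy of the source distribution saved under that name in the current directory.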
