textdata

Easily get clean data, direct from text or Python source

| |travisci| |version| |downloads| |versions| |impls| |wheel| |coverage| |br-coverage|

.. |travisci| image:: https://travis-ci.org/jonathaneunice/textdata.svg?branch=master
    :alt: Travis CI build status
    :target: https://travis-ci.org/jonathaneunice/textdata

.. |version| image:: http://img.shields.io/pypi/v/textdata.svg?style=flat
    :alt: PyPI Package latest release
    :target: https://pypi.python.org/pypi/textdata

.. |downloads| image:: http://img.shields.io/pypi/dm/textdata.svg?style=flat
    :alt: PyPI Package monthly downloads
    :target: https://pypi.python.org/pypi/textdata

.. |versions| image:: https://img.shields.io/pypi/pyversions/textdata.svg
    :alt: Supported versions
    :target: https://pypi.python.org/pypi/textdata

.. |impls| image:: https://img.shields.io/pypi/implementation/textdata.svg
    :alt: Supported implementations
    :target: https://pypi.python.org/pypi/textdata

.. |wheel| image:: https://img.shields.io/pypi/wheel/textdata.svg
    :alt: Wheel packaging support
    :target: https://pypi.python.org/pypi/textdata

.. |coverage| image:: https://img.shields.io/badge/test_coverage-100%25-6600CC.svg
    :alt: Test line coverage
    :target: https://pypi.python.org/pypi/textdata

.. |br-coverage| image:: https://img.shields.io/badge/branch_coverage-99%25-blue.svg
    :alt: Test branch coverage
    :target: https://pypi.python.org/pypi/textdata

One often needs to state data in program source. Python, however, needs its
lines indented *just so*. Multi-line strings therefore often have extra
spaces and newline characters you didn't really want. Many developers "fix"
this by using Python ``list`` literals, but that's
tedious, verbose, and often less legible.

The ``textdata`` package makes it easy to have clean, nicely-whitespaced
data specified in your program, but to get the data without extra whitespace
cluttering things up. It's permissive of the layouts needed to make Python
code look and work right, without reflecting those requirements in the
resulting data. For example::

    from textdata import lines

    data = lines("""
        There was an old woman who lived in a shoe.
        She had so many children, she didn't know what to do;
        She gave them some broth without any bread;
        Then whipped them all soundly and put them to bed.
    """)

will result in::

    ['There was an old woman who lived in a shoe.',
     "She had so many children, she didn't know what to do;",
     'She gave them some broth without any bread;',
     'Then whipped them all soundly and put them to bed.']

Note that the "extra" newlines and leading spaces have been
taken care of and discarded.
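
To make the cleanup concrete, here is a minimal standard-library sketch of the
same idea. It is not ``textdata``'s implementation; ``clean_lines`` is a
hypothetical helper that assumes the unwanted whitespace is just common leading
indentation plus blank lines at the start and end of the string.

```python
import textwrap


def clean_lines(text):
    """Hypothetical helper: dedent, drop outer blank lines, split into lines."""
    # Remove the indentation shared by all non-blank lines.
    body = textwrap.dedent(text)
    # Drop the newlines that the triple-quote layout adds at both ends.
    body = body.strip("\n")
    # Trailing spaces on individual lines are rarely wanted either.
    return [line.rstrip() for line in body.splitlines()]


rhyme = clean_lines("""
    There was an old woman who lived in a shoe.
    She had so many children, she didn't know what to do;
""")
print(rhyme)
```

Indentation beyond the common margin is kept, so deliberately nested text
survives the cleanup.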

Other times, the data you need is almost, but not quite, a series of
words. A list of names, a list of color names--values that are mostly
single words, but sometimes have embedded spaces. ``textdata`` has you
covered::

    >>> from textdata import words
    >>> words(' Billy Bobby "Mr. Smith" "Mrs. Jones" ')
    ['Billy', 'Bobby', 'Mr. Smith', 'Mrs. Jones']
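
For comparison, quote-aware splitting of this kind can be approximated with the
standard library's ``shlex.split``. This is only a rough analogue, not how
``words`` is implemented; one known difference is that ``shlex`` treats a lone
apostrophe (as in ``didn't``) as an unterminated quote.

```python
import shlex

# Quoted phrases stay together; surrounding whitespace is ignored.
names = shlex.split(' Billy Bobby "Mr. Smith" "Mrs. Jones" ')
print(names)

# Single quotes group words the same way.
colors = shlex.split("red 'light blue' green")
print(colors)
```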

Embedded quotes (either single or double) can be used to construct
"words" (or phrases) containing whitespace (including tabs and newlines).

``words``, like the other ``textdata`` facilities, allows you to
comment individual lines that would otherwise muck up string literals::

    exclude = words("""
        __pycache__ *.pyc *.pyo     # compilation artifacts
        .hg* .git*                  # repository artifacts
        .coverage                   # code tool artifacts
        .DS_Store                   # platform artifacts
    """)

Yields::

    ['__pycache__', '*.pyc', '*.pyo', '.hg*', '.git*',
     '.coverage', '.DS_Store']
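
The comment handling can be pictured with a small plain-Python sketch.
``words_without_comments`` is a hypothetical helper, not part of ``textdata``,
and it assumes that ``#`` always starts a comment, so it would mishandle a
``#`` inside a quoted phrase.

```python
def words_without_comments(text, marker="#"):
    """Hypothetical helper: split text into words, ignoring trailing comments."""
    result = []
    for line in text.splitlines():
        # Keep only the part of the line before the comment marker.
        code = line.split(marker, 1)[0]
        # str.split() with no arguments discards all surrounding whitespace.
        result.extend(code.split())
    return result


exclude = words_without_comments("""
    __pycache__ *.pyc *.pyo     # compilation artifacts
    .hg* .git*                  # repository artifacts
    .coverage                   # code tool artifacts
    .DS_Store                   # platform artifacts
""")
print(exclude)
```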

Finally, you might want to collect "paragraphs"--contiguous runs of text lines
that are delineated by blank lines. Markdown and RST document formats,
for example, use this convention. ``textdata`` makes it easy::

    >>> from textdata import paras
    >>> rhyme = """
        Hey diddle diddle,

        The cat and the fiddle,
        The cow jumped over the moon.
        The little dog laughed,
        To see such sport,

        And the dish ran away with the spoon.
    """
    >>> paras(rhyme)
    [['Hey diddle diddle,'],
     ['The cat and the fiddle,',
      'The cow jumped over the moon.',
      'The little dog laughed,',
      'To see such sport,'],
     ['And the dish ran away with the spoon.']]

Or, if you'd like each paragraph joined into a single string::

    >>> paras(rhyme, join="\n")
    ['Hey diddle diddle,',
     'The cat and the fiddle,\nThe cow jumped over the moon.\nThe little dog laughed,\nTo see such sport,',
     'And the dish ran away with the spoon.']
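
The paragraph convention itself is easy to sketch with ``itertools.groupby``.
The function below is a hypothetical illustration of grouping lines by blank
separators, with a ``join`` parameter chosen to mirror the examples above; it
is not ``textdata``'s own code.

```python
from itertools import groupby


def group_paragraphs(text, join=None):
    """Hypothetical helper: group non-blank lines separated by blank lines."""
    # Strip each line so indentation and whitespace-only lines do not matter.
    lines = [line.strip() for line in text.splitlines()]
    # groupby clusters consecutive lines by truthiness; keep the non-blank runs.
    groups = [list(run) for nonblank, run in groupby(lines, key=bool) if nonblank]
    if join is None:
        return groups
    return [join.join(group) for group in groups]


rhyme = """
    Hey diddle diddle,

    The cat and the fiddle,
    The cow jumped over the moon.

    And the dish ran away with the spoon.
"""
print(group_paragraphs(rhyme))
print(group_paragraphs(rhyme, join=" "))
```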

``textdata`` is all about conveniently grabbing the data you want
from text files and program source, and doing it in a highly
functional, well-tested way.
Take it for a spin today!

See `the full documentation
at Read the Docs <http://textdata.readthedocs.org/en/latest/>`_.
