toolshed

Tools for data

Toolshed: Less Boiler-Plate
===========================

This is a collection of well-tested, simple modules and functions
that I use frequently.

Files
-----

If you have a "proper" CSV file with quoting and such, use Python's `csv`_
module.

If all you have is a file with a header and you want to get a dictionary
for each row::

    >>> from toolshed import reader, header, nopen
    >>> for d in reader('src/toolshed/tests/data/file_data.txt'):
    ...     print d['a'], d['b'], d['c']
    1 2 3
    11 12 13
    21 22 23

This works the same for gzipped, bzipped, and .xls files, for stdin (via "-"),
and for files over http/ftp::

    >>> for drow in (d for d in reader('src/toolshed/tests/data/file_data.txt.gz') if int(d['a']) > 10):
    ...     print drow['a'], drow['b'], drow['c']
    11 12 13
    21 22 23
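
To make the behaviour above concrete, here is a minimal standard-library sketch of a header-keyed row reader. It is not toolshed's implementation: the name `read_dicts` and the tab default for `sep` are assumptions made for illustration, and it is written for Python 3, unlike the Python 2 doctests above.

```python
import gzip
import io


def read_dicts(path, sep="\t"):
    """Yield one dict per data row, keyed by the names in the first line.

    Hypothetical helper for illustration; not part of toolshed.
    """
    # Pick an opener from the extension; "rt" gives text in both cases.
    opener = gzip.open if path.endswith(".gz") else io.open
    with opener(path, "rt") as handle:
        # The first line supplies the column names.
        names = handle.readline().rstrip("\r\n").split(sep)
        for line in handle:
            values = line.rstrip("\r\n").split(sep)
            yield dict(zip(names, values))
```

Because the function is a generator, rows can be filtered lazily, for example `(d for d in read_dicts(path) if int(d['a']) > 10)`, without loading the whole file.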

One can specify the header for a file that lacks one using the `header=` kwarg.
If `header` is "ordered" then an ordered dictionary will be used so that
drow.keys() and drow.values() return the values in the order they appeared in the file.

If `header` is a callable (a function or class) then that callable will be
called for each row with a single argument: the list of columns. In the
future, it may be called as callable(\*row) instead of callable(row).
**comments welcome**.
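
The three `header` modes described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not toolshed's code: `parse_rows`, its `sep` default, and the choice to pass every line (including the first) to a callable header are all hypothetical.

```python
from collections import OrderedDict, namedtuple


def parse_rows(lines, header=None, sep="\t"):
    """Yield parsed rows; `header` selects how each row is represented.

    Hypothetical helper for illustration; not part of toolshed.
    """
    rows = (line.rstrip("\r\n").split(sep) for line in lines)
    if callable(header):
        # A callable receives the list of columns for every row.
        for columns in rows:
            yield header(columns)
        return
    if isinstance(header, (list, tuple)):
        # Header supplied by the caller for a file that lacks one.
        names, make = list(header), dict
    else:
        # Header taken from the first line (empty input yields nothing).
        names = next(rows, [])
        make = OrderedDict if header == "ordered" else dict
    for columns in rows:
        yield make(zip(names, columns))


# Usage: build a namedtuple per row via a callable header.
Point = namedtuple("Point", "a b c")
lines = ["1\t2\t3", "11\t12\t13"]
points = list(parse_rows(lines, header=lambda cols: Point(*cols)))
```

The lambda unpacks the column list itself, so the sketch keeps working whether a reader calls `callable(row)` or, after adapting the lambda, `callable(*row)`.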

Sometimes you just want the header::

    >>> header('src/toolshed/tests/data/file_data.txt')
    ['a', 'b', 'c']

`toolshed.nopen` can open a file over http, https, or ftp, a gzipped file, a
bzipped file, or a subprocess, all with the same syntax::

    >>> nopen('src/toolshed/tests/data/file_data.txt.gz')
    <gzip open file ... >
    >>> nopen('|ls')
    <open file '<fdopen>'...>
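
The idea of one opener for many source types can be sketched as a dispatch on the shape of the string. The function below is a rough Python 3 approximation using only the standard library; the name `open_any` and the exact rules (prefix and suffix checks, binary mode) are assumptions, and toolshed's own `nopen` may behave differently.

```python
import bz2
import gzip
import subprocess
import sys
from urllib.request import urlopen


def open_any(source):
    """Return a readable binary file-like object for `source`.

    Hypothetical helper for illustration; not part of toolshed.
    """
    if source == "-":
        # "-" conventionally means standard input.
        return sys.stdin.buffer
    if source.startswith("|"):
        # Everything after the pipe is run as a shell command.
        proc = subprocess.Popen(source[1:], shell=True, stdout=subprocess.PIPE)
        return proc.stdout
    if source.startswith(("http://", "https://", "ftp://")):
        return urlopen(source)
    if source.endswith(".gz"):
        return gzip.open(source, "rb")
    if source.endswith(".bz2"):
        return bz2.open(source, "rb")
    return open(source, "rb")
```

This sketch does not check the exit status of a subprocess and does not decompress remote files, both of which a production version would need to handle.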

Shedskinner
-----------

Shedskin is a program that takes Python scripts, infers the types based
on example input, and generates fast C++ code that compiles to a Python
extension module. Shedskinner is a decorator that automates this for a single
function. Usage looks like::

    from toolshed import shedskinner

    @shedskinner((2, 12), long=True, fast_random=True)
    def adder(a, b):
        return a + b

Here, we have decorated the adder function to make it a compiled, fast
version that accepts and returns integers. The (2, 12) are example arguments
to the function so that shedskin can infer types.
The keyword arguments are sent to the compiler; see
https://gist.github.com/1036972 for more examples.
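
For readers unfamiliar with decorators that take arguments, the shape of such a decorator can be sketched in pure Python. Nothing is compiled here and this is not how shedskinner works internally: `with_examples` is a hypothetical stand-in that only records the example arguments and options alongside the function.

```python
import functools


def with_examples(example_args, **compiler_options):
    """Decorator factory that keeps example arguments next to a function.

    Hypothetical stand-in for illustration; it performs no compilation.
    """
    def decorate(func):
        # Call the function once on the examples to learn concrete types.
        example_result = func(*example_args)

        @functools.wraps(func)
        def wrapper(*args):
            return func(*args)

        # Record what a compiling decorator would need to know.
        wrapper.example_types = tuple(type(a) for a in example_args)
        wrapper.result_type = type(example_result)
        wrapper.compiler_options = compiler_options
        return wrapper
    return decorate


@with_examples((2, 12), long=True, fast_random=True)
def adder(a, b):
    return a + b
```

The outer call `with_examples(...)` returns the real decorator, which is why the decorator line takes arguments in parentheses and carries no trailing colon.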

Links
-----

.. _`csv`: http://docs.python.org/library/csv.html

News
====
0.2.9
-----
* support for bash process substitution, e.g.: reader("|cmd <(some args)")

0.2.8
-----
* reader supports streaming remote .gz files.

0.2.7
-----
* reader supports .xls files.

0.2.6
-----
* don't print an extra newline when reading empty stderr from a process.

0.2.5
-----
* allow splitting on none or on regexp.

0.2.4
-----
* if header is a callable, it's called for each row (instead of returning
  dict).

0.2.3
-----
* reader can accept the generator returned from reader()

0.2.2
-----
* if an integer is sent to nopen, then nopen(sys.argv[arg]) is returned.

0.2.1
-----
* fix handling when there's an exception in the loop that calls a process

0.2.0
-----
* better error message from Popen when using nopen("| something")

0.1.9
-----
* if the header argument to `reader` is "ordered" then an ordered
  dictionary is used.

0.1.8
-----
* Add is_newer_b(apath, bpaths) to check that all b files are newer
  than apath.

0.1.3
-----
* July 26 2011
* Allow ftp/http(s) paths as arguments to reader

0.1.1
-----
* use itertools.izip for speed improvement

0.1
---

*Release date: 15-Mar-2010*

* Initial project structure.

Release history
===============

====================  ============
Version               Released
====================  ============
0.4.6                 Jun 18, 2016
0.4.5                 Feb 22, 2016
0.4.4                 Dec 3, 2014
0.4.2                 Sep 10, 2014
0.4.1                 Sep 3, 2014
0.4.0                 Aug 26, 2014
0.3.9                 Jun 16, 2014
0.3.8                 Jun 10, 2014
0.3.7                 Apr 29, 2014
0.3.6                 Mar 11, 2014
0.3.5                 Jan 30, 2014
0.3.4                 Jan 27, 2014
0.3.3                 Jan 27, 2014
0.3.2                 Nov 1, 2013
0.3.1                 Aug 14, 2013
0.3.0                 Aug 14, 2013
0.2.9 (this version)  Dec 18, 2012
0.2.8                 Dec 6, 2012
0.2.7                 Nov 22, 2012
0.2.6                 Nov 20, 2012
0.2.5                 Oct 24, 2012
0.2.4                 Jun 1, 2012
0.2.2                 May 16, 2012
0.2.1                 May 15, 2012
0.2.0                 May 13, 2012
0.1.9                 May 8, 2012
0.1.8                 Apr 16, 2012
0.1.6                 Mar 7, 2012
0.1.5                 Jan 10, 2012
0.1.4                 Dec 5, 2011
0.1.3                 Jul 26, 2011
0.1.2                 Jul 17, 2011
0.1.1                 Jul 8, 2011
0.1                   Jun 23, 2011
====================  ============

Download files
==============

Source distribution: toolshed-0.2.9.tar.gz (7.3 kB), uploaded Dec 18, 2012.

Hashes for toolshed-0.2.9.tar.gz:

===========  ====================================================================
Algorithm    Hash digest
===========  ====================================================================
SHA256       ``2066607d7fb9c86124fcede6ad707340de7f3cbbf80d1fa2db13cf7d4eea844d``
MD5          ``a665488a48805f3455037d9c00c07e5b``
BLAKE2b-256  ``92f1d921cee68bb61cff8eddae0c6cb070a8eb22a92410c1cde4a581b9ede84b``
===========  ====================================================================