datashell

Because namespaces are a honking great idea, but loading a gazillion packages to take a quick peek at some data is not.

Development Status
- 4 - Beta
Intended Audience
- Developers
License
- OSI Approved :: ISC License (ISCL)
Operating System
- OS Independent
Programming Language
Topic
- Scientific/Engineering :: Information Analysis

Project description

If you have a statistics background, you’re probably familiar with R. And if you’ve been using R for a while, the thing you start to appreciate most is that once you have that terminal open, you can get down to business pretty much immediately. Dumping functionality as diverse as numerical optimization, linear regression and the cumulative distribution function for a Poisson distribution all into the global namespace is probably not a good idea, but boy is it useful for quick data exploration. That’s what Data Shell does for Python.

Install with

pip3 install datashell
datashell-install

Open up an IPython-based data shell for Python 3 by typing datashell into your terminal. For inline plotting, use datashell-qt instead.

Pro tip: alias these shells to something shorter. For example, put alias dash=datashell and alias dashi=datashell-qt into your ~/.bashrc or wherever your shell customizations live.

Convenience functions

Data shell currently loads convenience functions from math, random, numpy, scipy.stats, statsmodels, sympy and pandas.

All functions are lazy-loaded, so startup time is not much different from that of a regular IPython terminal.
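To illustrate the general idea of lazy loading, here is a minimal sketch using only the standard library. It is not Data shell's actual implementation; the LazyFunction class and its attribute names are hypothetical and exist only for this example. The wrapper stores a module name and a function name, and performs the import the first time the function is called.

```python
import importlib


class LazyFunction:
    """Hypothetical wrapper that imports its target module on first call."""

    def __init__(self, module_name, func_name):
        self.module_name = module_name
        self.func_name = func_name
        self._func = None  # resolved on first call

    def __call__(self, *args, **kwargs):
        if self._func is None:
            # The import cost is paid here, not at shell startup.
            module = importlib.import_module(self.module_name)
            self._func = getattr(module, self.func_name)
        return self._func(*args, **kwargs)


# Nothing is imported yet when the wrapper is created.
mean = LazyFunction("statistics", "mean")

# The statistics module is imported by this first call.
print(mean([1, 2, 3, 4]))  # 2.5
```

Creating many such wrappers is cheap, which is one way a shell could expose a large number of names without slowing down startup.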

Data shell does a star import of various packages into the global namespace, but also keeps them available under their respective namespaces, so you can access functionality both ways.

To give just one example, once you’re in your IPython data shell, a linear regression on a dataset in your working directory is simply:

ols('y ~ x', data=tables.test).fit().summary()

Behind the scenes, this will load statsmodels.formula.api.ols to perform a linear regression, and tables.test will load test.csv from the working directory.

Datashell can also be used in (non-interactive) scripts:

from datashell import *
diff(2*x**2)

(Though at some point you’ll probably want to clean things up and do proper imports.)
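For comparison, here is a sketch of what the two-line script above might look like with explicit imports. It assumes sympy is installed and that the predefined x behaves like an ordinary SymPy symbol, which the description suggests but does not state outright.

```python
from sympy import Symbol, diff

# Assumption: datashell's predefined x is equivalent to a plain SymPy symbol.
x = Symbol("x")

# Differentiate 2*x**2 with respect to x.
derivative = diff(2 * x**2, x)
print(derivative)  # 4*x
```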

Data autoloader

Data shell also includes a Pandas autoloader for CSV files: the file ./subdir/myfile.csv is available as a Pandas DataFrame under tables.subdir.myfile.
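One way such an autoloader could work is to map attribute access onto the file system. The sketch below is an assumption about the technique, not Data shell's own code: the Tables class, the read_rows helper and the loader parameter are all hypothetical names. To keep the example dependency-free it reads CSV files with the standard csv module and returns a list of dicts rather than a DataFrame.

```python
import csv
import tempfile
from pathlib import Path


def read_rows(path):
    """Stand-in loader: return the CSV rows as a list of dicts."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


class Tables:
    """Hypothetical autoloader: attributes map to directories and CSV files."""

    def __init__(self, root=".", loader=read_rows):
        self._root = Path(root)
        self._loader = loader

    def __getattr__(self, name):
        # Only reached for names that are not regular attributes.
        if name.startswith("_"):
            raise AttributeError(name)
        csv_file = self._root / (name + ".csv")
        directory = self._root / name
        if csv_file.is_file():
            return self._loader(csv_file)
        if directory.is_dir():
            # Descend one level so that tables.subdir.myfile works.
            return Tables(directory, self._loader)
        raise AttributeError(f"no directory or CSV file named {name!r} in {self._root}")


# Demonstration on a temporary directory containing subdir/myfile.csv.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "subdir").mkdir()
    (Path(tmp) / "subdir" / "myfile.csv").write_text("x,y\n1,2\n3,4\n")
    tables = Tables(tmp)
    rows = tables.subdir.myfile
    print(rows)
```

Passing loader=pandas.read_csv instead of the stand-in would return DataFrames, which is closer to the behaviour described above.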

Useful shortcuts

from math: ceil, floor, log, factorial, sin and pretty much anything you’d find on a good calculator
from random: shuffle, choice, sample and friends
from sympy: expand, factor, simplify to simplify mathematical expressions, diff to differentiate, integrate to integrate (many one-letter variables are also predefined: a-e, o-s and u-z)
from scipy.optimize: minimize
from scipy.stats: describe, itemfreq, relfreq, kurtosis, mode, moment, skew, pearsonr, spearmanr and others
from scipy.stats.contingency: expected_freq, margins
from scipy.stats.distributions: cdf, pdf, ppf, sf, rvs and various other functions on statistical distributions from normal to gamma
from statsmodels.api: datasets and families (for use with generalized linear models)
from statsmodels.formula.api: ols and gls

Release history

- 0.4.4 (this version): Feb 11, 2016
- 0.4.3: Feb 11, 2016
- 0.4.2: Feb 11, 2016
- 0.4.1: Feb 11, 2016
- 0.4.0: Feb 9, 2016
- 0.3.1: Sep 29, 2015
- 0.3.0: Sep 23, 2015
- 0.2.0: Sep 16, 2015
- 0.1.0: Sep 16, 2015

Download files

Source Distribution

datashell-0.4.4.tar.gz (5.1 kB)

Uploaded Feb 11, 2016 Source

Hashes for datashell-0.4.4.tar.gz

Algorithm	Hash digest
SHA256	`7c112b45f1c51c741a59e0331303e5f9ed3374a7c9636a29fef3d43d8f7f0af4`
MD5	`526fde9c60fbcc5c1dada853fe047e07`
BLAKE2b-256	`94e16e83d530ce2d490aadf06e5ab5235fa2f9e948b1b7a0fb8412e6a24016ae`