scikits.statsmodels

Statistical computations and models for use with SciPy

Development Status
- 4 - Beta
Environment
- Console
Intended Audience
- Developers
- Science/Research
License
- OSI Approved :: BSD License
Operating System
- OS Independent
Programming Language
Topic
- Scientific/Engineering

Project description

What it is

Statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics and estimation of statistical models.

Main Features

regression: Generalized least squares (including weighted least squares and least squares with autoregressive errors), ordinary least squares.
glm: Generalized linear models with support for all of the one-parameter exponential family distributions.
discrete choice models: Poisson, probit, logit, multinomial logit
rlm: Robust linear models with support for several M-estimators.
tsa: Time series analysis models, including ARMA, AR, VAR
nonparametric: (Univariate) kernel density estimators
datasets: Datasets to be distributed and used for examples and in testing.
PyDTA: Tools for reading Stata .dta files into numpy arrays.
stats: a wide range of statistical tests
sandbox: There is also a sandbox which contains code for generalized additive models (untested), mixed effects models, and the Cox proportional hazards model (the latter two are untested and still depend on the nipy formula framework), as well as code for generating descriptive statistics and printing table output to ASCII, LaTeX, and HTML. There is also experimental code for systems of equations regression, time series models, panel data estimators, and information theoretic measures. None of this code is considered “production ready”.
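
To make the regression entry above concrete, here is a minimal sketch of ordinary least squares for a single regressor plus an intercept. It is written in plain Python so that it runs without statsmodels installed. The function name ols_fit is hypothetical and is not part of the package, which fits such models through its own classes.

```python
def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Hypothetical helper for illustration only; not statsmodels code.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x and cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated exactly as y = 1 + 2x
intercept, slope = ols_fit(x, y)
print(intercept, slope)
```

Because the data lie exactly on a line, the fit recovers the intercept 1 and slope 2 that generated them.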

Where to get it

Development takes place on GitHub. This is where to go to get the most up-to-date code in the master branch. Experimental code is hosted there in branches and in developer forks, and is merged to master often. We try to make sure that the master branch is always stable.

https://www.github.com/statsmodels/statsmodels

Source download of stable tags will be on SourceForge.

https://sourceforge.net/projects/statsmodels/

PyPI: http://pypi.python.org/pypi/scikits.statsmodels/

Installation from sources

In the top directory, just do:

python setup.py install

See INSTALL.txt for requirements, or see

http://statsmodels.sourceforge.net/

for more information.

License

Simplified BSD

Documentation

The official documentation is hosted on SourceForge.

http://statsmodels.sourceforge.net/

The Sphinx docs are currently undergoing a lot of work. They are not yet comprehensive, but they should get you started.

Our blog will continue to be updated as we make progress on the code.

http://scipystats.blogspot.com

Windows Help

The source distribution for Windows includes an HTML Help file (statsmodels.chm). This can be opened from the Python interpreter:

>>> import scikits.statsmodels.api as sm
>>> sm.open_help()

Discussion and Development

All chatter will take place on the scipy-dev or scipy-user mailing list. We are very interested in receiving feedback about usability, suggestions for improvements, and bug reports via the mailing list or the bug tracker at

https://github.com/statsmodels/statsmodels/issues

There is also a Google group at

http://groups.google.com/group/pystatsmodels

to discuss development and design issues that are deemed to be too specialized for the scipy-dev/user list.

Python 3

scikits.statsmodels has been ported to and tested with Python 3.2. A Python 3 version of the code can be obtained by running 2to3.py over the entire statsmodels source. The numerical core of statsmodels worked almost without changes; however, there can be problems with data input and plotting. The Stata file reader and writer in iolib.foreign have not been ported yet, and there are still some problems with the matplotlib version for Python 3 that was used in testing. Running the test suite with Python 3.2 shows some errors related to foreign and matplotlib.

Release History

0.3.1

Removed academic-only WFS dataset.
Fix easy_install issue on Windows.

0.3.0

Changes that break backwards compatibility

Added api.py for importing. So the new convention for importing is:

import scikits.statsmodels.api as sm

Importing from modules directly now avoids unnecessary imports and increases the import speed if a library or user only needs specific functions.

sandbox/output.py -> iolib/table.py
lib/io.py -> iolib/foreign.py (Now contains Stata .dta format reader)
family -> families
families.links.inverse -> families.links.inverse_power
Datasets’ Load class is now load function.
regression.py -> regression/linear_model.py
discretemod.py -> discrete/discrete_model.py
rlm.py -> robust/robust_linear_model.py
glm.py -> genmod/generalized_linear_model.py
model.py -> base/model.py
t() method -> tvalues attribute (t() still exists but raises a warning)
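
The last rename above keeps the old t() method working while steering users to the new tvalues attribute. A minimal sketch of that pattern follows. The class TStatResults, its constructor arguments, and the choice of DeprecationWarning are assumptions made for illustration; the source only says that t() still exists and raises a warning, and this is not the package's own code.

```python
import warnings


class TStatResults:
    """Hypothetical stand-in for a results class; not statsmodels code."""

    def __init__(self, params, bse):
        self.params = list(params)  # estimated coefficients
        self.bse = list(bse)        # their standard errors

    @property
    def tvalues(self):
        # t statistic: each coefficient divided by its standard error.
        return [p / s for p, s in zip(self.params, self.bse)]

    def t(self):
        # Old spelling: still works, but warns and delegates to tvalues.
        # The warning category is an assumption.
        warnings.warn(
            "t() is deprecated; use the tvalues attribute instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.tvalues


res = TStatResults(params=[2.0, 3.0], bse=[0.5, 1.5])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = res.t()
new_style = res.tvalues
```

Both spellings return the same values; only the old one emits a warning, which gives callers time to migrate.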

Main changes and additions

Numerous bugfixes.
Time Series Analysis model (tsa)
- Vector Autoregression Models VAR (tsa.VAR)
- Autoregressive Models AR (tsa.AR)
- Autoregressive Moving Average Models ARMA (tsa.ARMA); optionally uses Cython for Kalman filtering (run setup.py install with the option --with-cython)
- Baxter-King band-pass filter (tsa.filters.bkfilter)
- Hodrick-Prescott filter (tsa.filters.hpfilter)
- Christiano-Fitzgerald filter (tsa.filters.cffilter)
Improved maximum likelihood framework uses all available scipy.optimize solvers
Refactor of the datasets sub-package.
Added more datasets for examples.
Removed RPy dependency for running the test suite.
Refactored the test suite.
Refactored codebase/directory structure.
Support for offset and exposure in GLM.
Removed data_weights argument to GLM.fit for Binomial models.
New statistical tests, especially diagnostic and specification tests
Multiple test correction
General Method of Moment framework in sandbox
Improved documentation
and other additions
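
As an illustration of what the multiple test correction entry above refers to, the sketch below adjusts a list of p-values with the Bonferroni and Holm step-down procedures. These are two standard corrections chosen here as examples; the source does not say which methods the release implements. The function names are hypothetical and are not the package's API.

```python
def bonferroni(pvalues):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def holm(pvalues):
    """Holm step-down adjustment, returned in the original order."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is scaled by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


pvals = [0.01, 0.04, 0.03]
bonf = bonferroni(pvals)
holm_adj = holm(pvals)
```

Holm is never more conservative than Bonferroni: every Holm-adjusted value is less than or equal to the corresponding Bonferroni value.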

0.2.0

Main changes

renames for more consistency
- RLM.fitted_values -> RLM.fittedvalues
- GLMResults.resid_dev -> GLMResults.resid_deviance

GLMResults, RegressionResults: lazy calculations, convert attributes to properties with _cache

fix tests to run without rpy

expanded examples in examples directory

add PyDTA to lib.io – functions for reading Stata .dta binary files and converting them to numpy arrays

made tools.categorical much more robust

add_constant now takes a prepend argument

fix GLS to work with only a one column design
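
The lazy calculation item in the list above (attributes converted to properties backed by _cache) can be sketched as follows. This is a minimal illustration of the general pattern, assuming a per-instance dictionary named _cache; the descriptor cached_attribute and the class LazyResults are hypothetical and are not the package's actual implementation.

```python
class cached_attribute:
    """Descriptor that computes a value once and stores it in instance._cache."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__
        self.__doc__ = func.__doc__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # Create the per-instance cache on first use.
        cache = instance.__dict__.setdefault("_cache", {})
        if self.name not in cache:
            cache[self.name] = self.func(instance)
        return cache[self.name]


class LazyResults:
    """Hypothetical results class; residuals are computed only when asked for."""

    def __init__(self, endog, fittedvalues):
        self.endog = list(endog)
        self.fittedvalues = list(fittedvalues)
        self.calls = 0  # counts how often the residuals are actually computed

    @cached_attribute
    def resid(self):
        self.calls += 1
        return [y - f for y, f in zip(self.endog, self.fittedvalues)]


res = LazyResults([1.0, 2.0, 4.0], [1.5, 2.0, 3.0])
first = res.resid   # computed here
second = res.resid  # served from res._cache
```

Nothing is computed when the results object is created, and repeated access to resid reuses the stored list instead of recomputing it.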

New

add four new datasets
- A dataset from the American National Election Studies (1996)
- Grunfeld (1950) investment data
- Spector and Mazzeo (1980) program effectiveness data
- A US macroeconomic dataset

add four new Maximum Likelihood Estimators for models with discrete dependent variables, with examples
- Logit
- Probit
- MNLogit (multinomial logit)
- Poisson

Sandbox

add qqplot in sandbox.graphics

add sandbox.tsa (time series analysis) and sandbox.regression (anova)

add principal component analysis in sandbox.tools

add Seemingly Unrelated Regression (SUR) and Two-Stage Least Squares for systems of equations in sandbox.sysreg.Sem2SLS

add restricted least squares (RLS)

0.1.0b1

initial release

Release history

0.3.1: Aug 24, 2011
0.3.0: Jul 19, 2011
0.3.0rc1 (pre-release): May 21, 2011
0.2.0: Feb 15, 2010
0.1.0b1 (pre-release): Aug 31, 2009

Download files

Source Distributions

scikits.statsmodels-0.3.1.zip (3.6 MB), uploaded Aug 24, 2011
scikits.statsmodels-0.3.1.tar.gz (3.4 MB), uploaded Aug 24, 2011

Hashes for scikits.statsmodels-0.3.1.zip

Algorithm	Hash digest
SHA256	`c538697296c983a01d5838b3e14f4f1b156d95a11878ace70e971eefad49c777`
MD5	`6e0b7aa29b62acc657b69a0bf43c4cae`
BLAKE2b-256	`ca2e8f552966d768e422386f5a2d01962554031400f6cba94c6c07651edf8bb5`

Hashes for scikits.statsmodels-0.3.1.tar.gz

Algorithm	Hash digest
SHA256	`e31531bd603bd37a32bda50b034f7f7c941eb947a95f34890e0c084a0a5ff977`
MD5	`1f55b53d161544b95ca2709c9731c00c`
BLAKE2b-256	`36707ca38202429e7ab93b183d5b482b26a5e4b628ee8dd495017d6890cec0b3`
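
The published digests can be used to check a downloaded archive. The sketch below computes a file's SHA256 with the standard library and compares it with a published value. The helper names sha256_of and matches_published are hypothetical, and the self-check at the end uses a temporary file rather than a real download.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare a file's SHA256 with a published hex digest."""
    return sha256_of(path) == published_hex.strip().lower()


# Self-check on a temporary file, since no archive is downloaded here.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"statsmodels")
    tmp_path = tmp.name
try:
    ok = matches_published(tmp_path, hashlib.sha256(b"statsmodels").hexdigest())
    bad = matches_published(tmp_path, "0" * 64)
finally:
    os.remove(tmp_path)
```

For a real download, the call would look like matches_published("scikits.statsmodels-0.3.1.tar.gz", "e31531bd603bd37a32bda50b034f7f7c941eb947a95f34890e0c084a0a5ff977"), with the path pointing at wherever the archive was saved.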