Specification Curve is a Python package that performs specification curve analysis.

Specification Curve

Quickstart

Running

from specification_curve import specification_curve as sc
from specification_curve import example as scdata
df = scdata.load_example_data1()
y_endog = 'y2'
x_exog = ['x1', 'x2']
controls = ['c1', 'c2', 'group1', 'group2']
df_r = sc.spec_curve(df, y_endog, x_exog, controls,
                     cat_expand=['group2'])

produces

[Figure: example specification curve chart, docs/images/example.png]

Grey squares (black lines when there are many specifications) show whether a variable is included in a given specification. Blue markers and error bars show whether the coefficient is statistically significant at the 0.05 level.
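As a rough illustration of what "significant at the 0.05 level" means for a single coefficient, the sketch below uses a two-sided test under a normal approximation. How the package itself computes significance is not stated here, so this is an assumption made for illustration; the function names and the example numbers are hypothetical.

```python
import math

def two_sided_p_value(coef, std_err):
    """Two-sided p-value for H0: coefficient == 0, normal approximation."""
    z = abs(coef / std_err)
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))

def is_significant(coef, std_err, alpha=0.05):
    """True when the p-value falls below the chosen significance level."""
    return two_sided_p_value(coef, std_err) < alpha

# Hypothetical estimates, purely for illustration.
print(is_significant(0.8, 0.2))  # z = 4.0 -> True
print(is_significant(0.1, 0.2))  # z = 0.5 -> False
```

A regression library would normally report p-values from a t-distribution, which differ slightly from this approximation in small samples.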

Here’s another example:

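from specification_curve import specification_curve as sc
import numpy as np
import pandas as pd

# Simulate 300 observations with a fixed seed for reproducibility
n_samples = 300
np.random.seed(1332)
x_1 = np.random.random(size=n_samples)
x_2 = np.random.random(size=n_samples)
x_3 = np.random.random(size=n_samples)
x_4 = np.random.randint(2, size=n_samples)
y = (0.8*x_1 + 0.1*x_2 + 0.5*x_3 + x_4*0.6
     + 2*np.random.randn(n_samples))
# Build the frame with one row per variable, then transpose to columns
df = pd.DataFrame([x_1, x_2, x_3, x_4, y],
                  ['x_1', 'x_2', 'x_3', 'x_4', 'y']).T
# Set x_4 as a categorical variable
df['x_4'] = df['x_4'].astype('category')
# Variable roles (assumed here): y is the outcome, x_1 the regressor,
# and the remaining columns are controls
y_endog = 'y'
x_exog = 'x_1'
controls = ['x_2', 'x_3', 'x_4']
df_r = sc.spec_curve(df, y_endog, x_exog, controls,
                     cat_expand=['x_4'])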
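The cat_expand argument is described in this README as expanding a categorical variable into mutually exclusive groups. A minimal pure-Python sketch of that idea follows, assuming it amounts to one 0/1 indicator per category. The helper name and the column naming scheme are hypothetical and are not taken from the package.

```python
def expand_categorical(values, name):
    """Return {indicator_name: [0 or 1, ...]} with one indicator per category."""
    categories = sorted(set(values))
    return {
        f"{name}_{cat}": [1 if v == cat else 0 for v in values]
        for cat in categories
    }

# A short hypothetical column shaped like x_4 above (values 0 or 1).
x_4 = [0, 1, 1, 0, 1]
dummies = expand_categorical(x_4, "x_4")
print(dummies)
# {'x_4_0': [1, 0, 0, 1, 0], 'x_4_1': [0, 1, 1, 0, 1]}

# Mutually exclusive: exactly one indicator is 1 in every row.
print(all(sum(row) == 1 for row in zip(*dummies.values())))  # True
```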

Features

These examples use the first set of example data:

df = scdata.load_example_data1()

  • Expand fixed effects into mutually exclusive groups using cat_expand

y_endog = 'y1'
x_exog = 'x1'
controls = ['c1', 'c2', 'group1', 'group2']
df_r = sc.spec_curve(df, y_endog, x_exog, controls,
                     cat_expand=['group1', 'group2'])

  • Mutually exclude two variables using exclu_grps

y_endog = 'y1'
x_exog = 'x1'
controls = ['c1', 'c2', 'group1', 'group2']
df_r = sc.spec_curve(df, y_endog, x_exog, controls,
                     exclu_grps=[['c1', 'c2']])

  • Use multiple independent or dependent variables

x_exog = ['x1', 'x2']
y_endog = 'y1'
controls = ['c1', 'c2', 'group1', 'group2']
df_r = sc.spec_curve(df, y_endog, x_exog, controls)
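To give a feel for how these options shape the set of specifications, here is a hedged sketch that enumerates candidate specifications with the standard library. It assumes that every subset of the controls (including the empty set) is one specification, and that an exclusion group forbids two of its members appearing together. The package may enumerate differently, and enumerate_specs is a hypothetical helper, not part of its API.

```python
from itertools import combinations, product

def enumerate_specs(y_endog, x_exog, controls, exclu_grps=()):
    """Return a list of (y, x, control_subset) tuples, one per specification."""
    # Accept a single name or a list of names, as in the examples above.
    ys = [y_endog] if isinstance(y_endog, str) else list(y_endog)
    xs = [x_exog] if isinstance(x_exog, str) else list(x_exog)

    # Every subset of the controls, including the empty one (an assumption).
    subsets = [
        combo
        for size in range(len(controls) + 1)
        for combo in combinations(controls, size)
    ]

    # Keep only subsets with at most one member from each exclusion group.
    def allowed(combo):
        return all(len(set(combo) & set(grp)) <= 1 for grp in exclu_grps)

    subsets = [combo for combo in subsets if allowed(combo)]
    return list(product(ys, xs, subsets))

controls = ['c1', 'c2', 'group1', 'group2']
print(len(enumerate_specs('y1', 'x1', controls)))                             # 16
print(len(enumerate_specs('y1', 'x1', controls, exclu_grps=[['c1', 'c2']])))  # 12
print(len(enumerate_specs('y1', ['x1', 'x2'], controls)))                     # 32
```

Under these assumptions, excluding c1 and c2 from appearing together removes the 4 subsets that contain both, and adding a second independent variable doubles the count.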

Similar Packages

In R, there are specr (which inspired many design choices in this package) and spec_chart. Some of the example data in this package is the same as in specr.

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.1.0 (2020-07-27)

  • First release on PyPI.

0.1.1 (2020-08-01)

  • Multiple independent, dependent, and control variables implemented as lists. Mutually exclusive control variables implemented. Expansions of categorical variables into mutually exclusive fixed effects implemented.
