Specification Curve is a Python package that performs specification curve analysis.
Specification Curve
Free software: MIT license
Documentation: https://specification-curve.readthedocs.io.
Quickstart
Running
from specification_curve import specification_curve as sc
from specification_curve import example as scdata
df = scdata.load_example_data1()
y_endog = 'y2'
x_exog = ['x1', 'x2']
controls = ['c1', 'c2', 'group1', 'group2']
df_r = sc.spec_curve(df, y_endog, x_exog, controls,
                     cat_expand=['group2'])
produces a specification curve plot.
Grey squares (black lines when there are many specifications) show whether a variable is included in a specification. Blue markers and error bars show whether the coefficient is significant at the 0.05 level.
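To make the 0.05 threshold concrete, the sketch below checks a single coefficient using a two-sided test under a normal approximation. It is an illustration only: `two_sided_p_value` and `is_significant` are hypothetical helpers, not part of this package, and the package may compute its p-values differently (for example, from a t-distribution).

```python
import math


def two_sided_p_value(coef, std_err):
    """Two-sided p-value for H0: coef == 0, assuming a normal approximation."""
    z = abs(coef / std_err)
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal CDF Phi
    return math.erfc(z / math.sqrt(2))


def is_significant(coef, std_err, alpha=0.05):
    """Return True when the p-value falls below the chosen threshold."""
    return two_sided_p_value(coef, std_err) < alpha


print(is_significant(0.5, 0.1))  # True: the estimate is 5 standard errors from zero
print(is_significant(0.1, 0.1))  # False: the estimate is 1 standard error from zero
```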
Here’s another example:
from specification_curve import specification_curve as sc
import numpy as np
import pandas as pd
n_samples = 300
np.random.seed(1332)
x_1 = np.random.random(size=n_samples)
x_2 = np.random.random(size=n_samples)
x_3 = np.random.random(size=n_samples)
x_4 = np.random.randint(2, size=n_samples)
y = (0.8*x_1 + 0.1*x_2 + 0.5*x_3 + x_4*0.6
     + 2*np.random.randn(n_samples))
df = pd.DataFrame([x_1, x_2, x_3, x_4, y],
                  ['x_1', 'x_2', 'x_3', 'x_4', 'y']).T
# Set x_4 as a categorical variable
df['x_4'] = df['x_4'].astype('category')
# Assumed variable roles for the simulated columns (not stated in the
# original example): y is the outcome, x_1 the regressor of interest
y_endog = 'y'
x_exog = 'x_1'
controls = ['x_2', 'x_3', 'x_4']
df_r = sc.spec_curve(df, y_endog, x_exog, controls,
                     cat_expand=['x_4'])
Features
These examples use the first set of example data:
df = scdata.load_example_data1()
Expand fixed effects into mutually exclusive groups using cat_expand
y_endog = 'y1'
x_exog = 'x1'
controls = ['c1', 'c2', 'group1', 'group2']
df_r = sc.spec_curve(df, y_endog, x_exog, controls,
                     cat_expand=['group1', 'group2'])
Mutually exclude two variables using exclu_grps
y_endog = 'y1'
x_exog = 'x1'
controls = ['c1', 'c2', 'group1', 'group2']
df_r = sc.spec_curve(df, y_endog, x_exog, controls,
                     exclu_grps=[['c1', 'c2']])
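As a rough picture of what mutual exclusion means for the set of specifications, the sketch below enumerates every subset of the controls and drops those containing more than one member of an exclusion group. It uses only the standard library; `control_subsets` is a hypothetical helper written for illustration and is not the package's internal implementation.

```python
from itertools import combinations


def control_subsets(controls, exclu_grps=()):
    """Yield each subset of controls with at most one member per exclusion group."""
    for size in range(len(controls) + 1):
        for subset in combinations(controls, size):
            chosen = set(subset)
            if all(len(chosen & set(grp)) <= 1 for grp in exclu_grps):
                yield subset


controls = ['c1', 'c2', 'group1', 'group2']
all_subsets = list(control_subsets(controls))
allowed = list(control_subsets(controls, exclu_grps=[['c1', 'c2']]))
print(len(all_subsets), len(allowed))  # 16 12
```

Under this reading, the four subsets that contain both c1 and c2 are removed, leaving 12 of the 16 possible control sets.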
Use multiple independent or dependent variables
x_exog = ['x1', 'x2']
y_endog = 'y1'
controls = ['c1', 'c2', 'group1', 'group2']
df_r = sc.spec_curve(df, y_endog, x_exog, controls)
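The sketch below shows how the number of specifications can grow when lists are passed. It assumes each specification pairs one dependent variable, one independent variable, and one subset of the controls (including the empty subset). `enumerate_specs` is a hypothetical helper for illustration; the package's own enumeration may differ.

```python
from itertools import combinations, product


def enumerate_specs(y_endog, x_exog, controls):
    """List (y, x, control_subset) triples under the assumption stated above."""
    # Accept either a single column name or a list of names
    ys = [y_endog] if isinstance(y_endog, str) else list(y_endog)
    xs = [x_exog] if isinstance(x_exog, str) else list(x_exog)
    subsets = [c for k in range(len(controls) + 1)
               for c in combinations(controls, k)]
    return list(product(ys, xs, subsets))


specs = enumerate_specs('y1', ['x1', 'x2'], ['c1', 'c2', 'group1', 'group2'])
print(len(specs))  # 32
print(specs[0])    # ('y1', 'x1', ())
```

With one dependent variable, two independent variables, and four controls, this gives 1 × 2 × 16 = 32 candidate specifications.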
Similar Packages
In R, there is specr (which inspired many design choices in this package) and spec_chart. Some of the example data in this package are the same as in specr.
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
History
0.1.0 (2020-07-27)
First release on PyPI.
0.1.1 (2020-08-01)
Multiple independent, dependent, and control variables implemented as lists. Mutually exclusive control variables implemented. Expansions of categorical variables into mutually exclusive fixed effects implemented.