
Performs analysis of the fit of a model.


[![Build Status](https://travis-ci.org/wsmorgan/analyzefit.svg?branch=master)](https://travis-ci.org/wsmorgan/analyzefit)[![Coverage Status](https://coveralls.io/repos/github/wsmorgan/analyzefit/badge.svg?branch=master)](https://coveralls.io/github/wsmorgan/analyzefit?branch=master)

# analyzefit

analyzefit is a Python package that performs standard analysis on the
fit of a regression model. The `validate` method of the `analysis`
class creates a residuals vs fitted plot, a quantile plot, a spread
location plot, and a leverage plot for the model provided, and prints
the accuracy scores for any metrics the user supplies. For example:

![alt_text](../master/support/images/validation.png)
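
The four plots are standard regression diagnostics. As a rough illustration of the quantities behind them, the sketch below computes fitted values, residuals, standardized residuals, and leverage with plain NumPy for an ordinary least squares fit. It is not analyzefit's implementation; the function name `diagnostic_quantities` and the synthetic data are made up for this example.

```python
import numpy as np

def diagnostic_quantities(X, y):
    """Quantities behind the four diagnostic plots for an OLS fit (illustrative only)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    # Design matrix with an intercept column.
    A = np.column_stack([np.ones(len(X)), X])
    # Least-squares coefficients, fitted values, and residuals.
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ beta
    residuals = y - fitted
    # Leverage is the diagonal of the hat matrix H = A (A^T A)^-1 A^T.
    # With a reduced QR factorization A = QR, H = Q Q^T.
    Q, _ = np.linalg.qr(A)
    leverage = np.sum(Q ** 2, axis=1)
    # Standardized residuals use the residual standard error.
    n, p = A.shape
    sigma = np.sqrt(residuals @ residuals / (n - p))
    standardized = residuals / (sigma * np.sqrt(1.0 - leverage))
    return {
        "fitted": fitted,              # x-axis: residuals vs fitted, spread location
        "residuals": residuals,        # y-axis: residuals vs fitted
        "standardized": standardized,  # sorted for the quantile plot; y-axis of the leverage plot
        "sqrt_abs_standardized": np.sqrt(np.abs(standardized)),  # y-axis: spread location
        "leverage": leverage,          # x-axis: leverage plot
    }

# Synthetic data, purely for demonstration.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(scale=0.5, size=50)
q = diagnostic_quantities(X, y)
```

Two quick sanity checks on such a fit: the residuals sum to zero when an intercept is included, and the leverages sum to the number of fitted coefficients (three here).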

If a more detailed plot is desired, the plots can also be generated
individually using the methods `res_vs_fit`, `quantile`, `spread_loc`,
and `leverage` respectively. By default, plots created individually
are rendered in an interactive environment using the bokeh plotting
package. For example:

![alt text](../master/support/images/interactive.pdf)

This allows the user to determine which points the model is failing to
predict.
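
Outside the interactive plot, the same question can be answered numerically by ranking points by absolute residual. The following is a minimal sketch assuming only NumPy; `worst_predictions` is a hypothetical helper and not part of analyzefit.

```python
import numpy as np

def worst_predictions(y_true, y_pred, k=3):
    """Return the indices of the k points with the largest absolute residuals, worst first."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    abs_resid = np.abs(y_true - y_pred)
    # Sort in descending order of absolute residual and keep the first k.
    return np.argsort(-abs_resid)[:k]

# Made-up values for demonstration.
y_true = np.array([3.0, 5.0, 7.0, 9.0, 11.0])
y_pred = np.array([3.1, 4.0, 7.0, 12.5, 10.8])
idx = worst_predictions(y_true, y_pred, k=2)
```

The returned indices can then be used to look up the corresponding rows of the feature matrix.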

Full API documentation is available on [GitHub Pages](https://wsmorgan.github.io/analysefit/).

## Installing the code

To install analyzefit, either use pip:

```
pip install analyzefit
```

or clone this repository and install manually:

```
python setup.py install
```

## Validating a Model

To use analyzefit, pass the feature matrix, target values, and the
model to the `analysis` class, then call the `validate` method (or any
of the individual plotting methods). For example:

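```
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
# Assumed import path; the original example does not show where `analyze` comes from.
from analyzefit import analyze

# Load the housing data (whitespace-separated, no header row).
df = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data', header=None, sep=r"\s+")
df.columns = ["CRIM","ZN","INDUS","CHAS","NOX","RM","AGE","DIS","RAD","TAX","PTRATIO","B","LSTAT","MEDV"]
X = df.iloc[:, :-1].values
y = df[["MEDV"]].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
slr = LinearRegression()
slr.fit(X_train, y_train)

# Build the analysis object from the training data and the fitted model.
an = analyze.analysis(X_train, y_train, slr)
# Create the four diagnostic plots for the training data.
an.validate()

# Validate on the test set and report the chosen metrics.
an.validate(X=X_test, y=y_test, metrics=[mean_squared_error, r2_score])

# Generate each plot individually.
an.res_vs_fit()

an.quantile()

an.spread_loc()

an.leverage()
```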
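
The `metrics` argument in the example above is given scikit-learn metric functions, which share the call signature `metric(y_true, y_pred)`. The sketch below shows one way a list of such callables can be applied to a set of predictions. It is an assumption about the general pattern, not analyzefit's code; `mse`, `r2`, and `score_all` are hypothetical NumPy stand-ins so that the example runs without scikit-learn.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error (stand-in for sklearn.metrics.mean_squared_error)."""
    return float(np.mean((y_true - y_pred) ** 2))

def r2(y_true, y_pred):
    """Coefficient of determination (stand-in for sklearn.metrics.r2_score)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def score_all(y_true, y_pred, metrics):
    """Apply each metric callable to the predictions and collect the scores by name."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return {metric.__name__: metric(y_true, y_pred) for metric in metrics}

# Made-up values for demonstration.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.0, 2.0, 3.0, 5.0])
scores = score_all(y_true, y_pred, [mse, r2])
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Because the metrics are passed as plain callables, any function following the same `(y_true, y_pred)` convention can be added to the list.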

## Python Packages Used

- numpy

- matplotlib

- bokeh

- sklearn


