Tools to extend sklearn
sktools
sktools provides tools that extend sklearn, including several feature-engineering transformers.
Installation
To install sktools, run this command in your terminal:
$ pip install sktools
Documentation
The documentation is available at https://sktools.readthedocs.io
Usage
from sktools import IsEmptyExtractor
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
...
mod = Pipeline([
    ("impute-features", IsEmptyExtractor()),
    ("model", LogisticRegression())
])
...
Features
Here’s a list of features that sktools currently offers:
sktools.encoders.NestedTargetEncoder performs target encoding suited for variables with nesting.
sktools.encoders.QuantileEncoder performs target aggregation using a quantile instead of the mean.
sktools.preprocessing.CyclicFeaturizer converts numeric features to cyclical features via sine and cosine transformations.
sktools.impute.IsEmptyExtractor creates binary variables indicating whether values are missing.
sktools.matrix_denser.MatrixDenser converts sparse matrices to dense ones.
sktools.quantilegroups.GroupedQuantileTransformer creates quantiles of a feature by group.
sktools.quantilegroups.PercentileGroupFeaturizer creates features regarding how an instance compares with a quantile of its group.
sktools.quantilegroups.MeanGroupFeaturizer creates features regarding how an instance compares with the mean of its group.
sktools.selectors.TypeSelector gets variables matching a type.
sktools.selectors.ItemsSelector lets you choose variables manually.
sktools.ensemble.MedianForestRegressor applies the median instead of the mean when aggregating tree predictions.
sktools.linear_model.QuantileRegression is an sklearn-style wrapper for quantile regression.
sktools.model_selection.BootstrapFold is a bootstrap cross-validator.
sktools.GradientBoostingFeatureGenerator automates feature generation through gradient boosting.
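To illustrate the idea behind the sine and cosine transformation listed for CyclicFeaturizer, here is a minimal sketch in plain NumPy. It is not the sktools implementation and does not use the sktools API; the function name cyclic_features and its period parameter are assumptions made for this example.

```python
import numpy as np


def cyclic_features(values, period):
    """Map a periodic numeric feature onto the unit circle.

    Hypothetical helper for illustration only; not part of sktools.
    Returns an array with two columns: sine and cosine of the angle.
    """
    values = np.asarray(values, dtype=float)
    # Scale each value to an angle in radians, one full turn per period.
    angle = 2.0 * np.pi * values / period
    return np.column_stack([np.sin(angle), np.cos(angle)])


# Hours of the day, where 23 and 0 are neighbours on the clock.
hours = np.array([0, 6, 12, 18, 23])
encoded = cyclic_features(hours, period=24)

# Euclidean distance between encoded hours reflects closeness on the clock.
dist_23_to_0 = np.linalg.norm(encoded[4] - encoded[0])
dist_12_to_0 = np.linalg.norm(encoded[2] - encoded[0])
```

With this encoding, hour 23 ends up closer to hour 0 than hour 12 does, which a raw numeric hour column would not capture.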
License
MIT license
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
History
0.1.4 (2021-03-20)
Gradient boosting feature regressor
0.1.3 (2020-07-13)
Bootstrap cross-validation
Cyclic featurizer
0.1.2 (2020-06-24)
L1 linear model and random forest
Quantile encoder refactor
0.1.1 (2020-06-10)
Refactor code, add group featurizers
0.1.0 (2020-04-19)
First release on PyPI.