A library to parse PMML models into Scikit-learn estimators.
sklearn-pmml-model
A library to effortlessly import models trained on different platforms and in different programming languages into scikit-learn in Python. First, export your model to PMML (a widely supported format). Next, load the exported PMML file with this library and use the resulting class like any other scikit-learn estimator.
Installation
The easiest way is to use pip:
$ pip install sklearn-pmml-model
Status
The library currently supports the following models:
Model | Classification | Regression | Categorical features |
---|---|---|---|
Decision Trees | ✅ | ✅ | ✅1 |
Random Forests | ✅ | ✅ | ✅1 |
Gradient Boosting | ✅ | ✅ | ✅1 |
Linear Regression | ✅ | ✅ | ✅3 |
Ridge | ✅2 | ✅ | ✅3 |
Lasso | ✅2 | ✅ | ✅3 |
ElasticNet | ✅2 | ✅ | ✅3 |
Gaussian Naive Bayes | ✅ | | ✅3 |
Support Vector Machines | ✅ | ✅ | ✅3 |
Nearest Neighbors | ✅ | ✅ | |
Neural Networks | ✅ | ✅ | |
1 Categorical feature support using slightly modified internals, based on scikit-learn#12866.
2 These models differ only in training characteristics; the resulting model is of the same form. Classification is supported using PMMLLogisticRegression for regression models and PMMLRidgeClassifier for general regression models.
3 By one-hot encoding categorical features automatically.
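To make footnote 3 concrete, here is a minimal sketch of what one-hot encoding does to a single categorical feature. It only illustrates the idea and is not the library's internal code; the `one_hot` helper and the sample values are hypothetical.

```python
def one_hot(values, categories=None):
    """Expand categorical values into 0/1 indicator rows, one column per category."""
    # Fix the column order up front so every row is encoded consistently.
    if categories is None:
        categories = sorted(set(values))
    index = {category: position for position, category in enumerate(categories)}

    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one indicator is set per row
        rows.append(row)
    return categories, rows


categories, encoded = one_hot(["red", "green", "red", "blue"])
print(categories)  # ['blue', 'green', 'red']
print(encoded)     # [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

Each category becomes its own numeric column, which is what lets models that expect purely numeric input work with categorical features.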
Example
A minimal working example (using this PMML file) is shown below:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np
from sklearn_pmml_model.ensemble import PMMLForestClassifier

# Prepare the data: name the columns after the iris features and use string class labels
iris = load_iris()
X = pd.DataFrame(iris.data)
X.columns = np.array(iris.feature_names)
y = pd.Series(np.array(iris.target_names)[iris.target])
y.name = "Class"
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.33, random_state=123)

# Load the trained model from the PMML file
clf = PMMLForestClassifier(pmml="models/randomForest.pmml")

# Predict class labels and compute the mean accuracy on the held-out split
clf.predict(Xte)
clf.score(Xte, yte)
More examples can be found in the following packages: tree, ensemble, linear_model, naive_bayes, svm, neighbors and neural_network.
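The example names the DataFrame columns after the iris features and the target "Class", which presumably mirrors the field names stored in the PMML file. When working with an unfamiliar PMML file, it can help to check which fields it declares before preparing the data. The sketch below does this with the Python standard library only; the inline PMML fragment is hand-written for illustration, and the helpers `local_name` and `list_data_fields` are hypothetical names, not part of sklearn-pmml-model.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written PMML fragment, used only for illustration.
PMML_TEXT = """
<PMML xmlns="http://www.dmg.org/PMML-4_3" version="4.3">
  <DataDictionary numberOfFields="3">
    <DataField name="sepal length (cm)" optype="continuous" dataType="double"/>
    <DataField name="sepal width (cm)" optype="continuous" dataType="double"/>
    <DataField name="Class" optype="categorical" dataType="string">
      <Value value="setosa"/>
      <Value value="versicolor"/>
    </DataField>
  </DataDictionary>
</PMML>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def list_data_fields(pmml_text):
    """Return (name, optype) pairs for every DataField element in the document."""
    root = ET.fromstring(pmml_text.strip())
    return [
        (element.get("name"), element.get("optype"))
        for element in root.iter()
        if local_name(element.tag) == "DataField"
    ]


fields = list_data_fields(PMML_TEXT)
for name, optype in fields:
    print(f"{name}: {optype}")
```

Matching on the local tag name rather than a fixed namespace keeps the sketch independent of the PMML version declared in the file.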
Benchmark
Depending on the data set and model, sklearn-pmml-model is between 5 and 1,000 times faster than competing libraries, because it builds on the optimized, industry-tested implementations in scikit-learn. Source code for this benchmark can be found in the corresponding Jupyter notebook.
Running times (load + predict, in seconds)
Data set | Library | Linear model | Naive Bayes | Decision tree | Random Forest | Gradient boosting |
---|---|---|---|---|---|---|
Wine | PyPMML | 0.773291 | 0.77384 | 0.777425 | 0.895204 | 0.902355 |
Wine | sklearn-pmml-model | 0.005813 | 0.006357 | 0.002693 | 0.108882 | 0.121823 |
Breast cancer | PyPMML | 3.849855 | 3.878448 | 3.83623 | 4.16358 | 4.13766 |
Breast cancer | sklearn-pmml-model | 0.015723 | 0.011278 | 0.002807 | 0.146234 | 0.044016 |
Improvement
Data set | Linear model | Naive Bayes | Decision tree | Random Forest | Gradient boosting |
---|---|---|---|---|---|
Wine | 133× | 122× | 289× | 8× | 7× |
Breast cancer | 245× | 344× | 1,367× | 28× | 94× |
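The improvement factors are the PyPMML running time divided by the sklearn-pmml-model running time, rounded to the nearest integer. As a quick check, the following sketch recomputes them from the running times reported above; the names `RUNNING_TIMES` and `improvement_factors` are introduced here for illustration only.

```python
# Running times (load + predict, in seconds), copied from the benchmark figures.
MODELS = ["Linear model", "Naive Bayes", "Decision tree", "Random Forest", "Gradient boosting"]
RUNNING_TIMES = {
    "Wine": {
        "PyPMML": [0.773291, 0.77384, 0.777425, 0.895204, 0.902355],
        "sklearn-pmml-model": [0.005813, 0.006357, 0.002693, 0.108882, 0.121823],
    },
    "Breast cancer": {
        "PyPMML": [3.849855, 3.878448, 3.83623, 4.16358, 4.13766],
        "sklearn-pmml-model": [0.015723, 0.011278, 0.002807, 0.146234, 0.044016],
    },
}


def improvement_factors(times):
    """Divide each PyPMML time by the matching sklearn-pmml-model time and round."""
    return [
        round(slow / fast)
        for slow, fast in zip(times["PyPMML"], times["sklearn-pmml-model"])
    ]


for dataset, times in RUNNING_TIMES.items():
    factors = improvement_factors(times)
    summary = ", ".join(f"{model}: {factor:,}x" for model, factor in zip(MODELS, factors))
    print(f"{dataset} | {summary}")
```

The smallest factor in this benchmark is 7 (gradient boosting on Wine) and the largest is 1,367 (decision tree on Breast cancer).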
Development
Prerequisites
Tests can be run using pytest. First, grab a local copy of the source:
$ git clone http://github.com/iamDecode/sklearn-pmml-model
$ cd sklearn-pmml-model
Then create a virtual environment and activate it:
$ python3 -m venv venv
$ source venv/bin/activate
and install the dependencies:
$ pip install -r requirements.txt
The final step is to build the Cython extensions:
$ python setup.py build_ext --inplace
Testing
You can execute tests with py.test by running:
$ python setup.py pytest
Contributing
Feel free to make a contribution. Please read CONTRIBUTING.md for more details.
License
This project is licensed under the BSD 2-Clause License - see the LICENSE file for details.