High performance Python GLMs with all the features!
glum
Generalized linear models (GLMs) are a core statistical tool that includes many common methods, such as least-squares regression, Poisson regression, and logistic regression, as special cases. At QuantCo, we have used GLMs in e-commerce pricing, insurance claims prediction, and more. We have developed glum, a fast Python-first GLM library. The development was based on a fork of scikit-learn, so it has a scikit-learn-like API. We are thankful for the starting point provided by Christian Lorentzen in that PR!
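To make the "special cases" concrete: a GLM combines a linear predictor with an inverse link function that maps it to the mean of the response. The sketch below is plain illustrative Python, not glum's API. The names `linear_predictor`, `INVERSE_LINKS`, and `predict_mean` are hypothetical, and pairing each family with its canonical link follows the usual textbook convention.

```python
import math


def linear_predictor(x, coef, intercept=0.0):
    """Return eta = intercept + sum_j x_j * coef_j."""
    return intercept + sum(xj * cj for xj, cj in zip(x, coef))


# Canonical inverse links (hypothetical lookup table, for illustration only).
INVERSE_LINKS = {
    # identity link: least-squares regression
    "normal": lambda eta: eta,
    # log link: Poisson regression
    "poisson": lambda eta: math.exp(eta),
    # logit link: logistic regression
    "binomial": lambda eta: 1.0 / (1.0 + math.exp(-eta)),
}


def predict_mean(family, x, coef, intercept=0.0):
    """Map the linear predictor to the expected response for a family."""
    return INVERSE_LINKS[family](linear_predictor(x, coef, intercept))


x = [1.0, 2.0]
coef = [0.5, -0.25]  # the linear predictor is 0.5 - 0.5 = 0.0
means = {family: predict_mean(family, x, coef) for family in INVERSE_LINKS}
```

The same linear predictor yields a different predicted mean under each family, which is the only thing that changes between the three models in this sketch.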
The goal of glum is to be at least as feature-complete as existing GLM libraries like glmnet or h2o. It supports:
- Built-in cross validation for optimal regularization, efficiently exploiting a “regularization path”
- L1 regularization, which produces sparse and easily interpretable solutions
- L2 regularization, including variable matrix-valued (Tikhonov) penalties, which are useful in modeling correlated effects
- Elastic net regularization
- Normal, Poisson, logistic, gamma, and Tweedie distributions, plus varied and customizable link functions
- Box constraints, linear inequality constraints, sample weights, offsets
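The difference between the L1 and L2 penalties in the list above can be sketched in a few lines of plain Python. This is not glum's solver. It assumes the common elastic net parameterization `alpha * (l1_ratio * ||w||_1 + 0.5 * (1 - l1_ratio) * ||w||_2^2)` and the simplest possible setting, a one-dimensional least-squares problem per coefficient, where the penalized solutions have closed forms. The function names are hypothetical.

```python
def elastic_net_penalty(coef, alpha, l1_ratio):
    """Elastic net penalty under the assumed parameterization."""
    l1 = sum(abs(c) for c in coef)
    l2 = sum(c * c for c in coef)
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)


def soft_threshold(z, threshold):
    """Minimizer of 0.5 * (w - z)**2 + threshold * abs(w) over w."""
    if z > threshold:
        return z - threshold
    if z < -threshold:
        return z + threshold
    return 0.0


def ridge_shrink(z, strength):
    """Minimizer of 0.5 * (w - z)**2 + 0.5 * strength * w**2 over w."""
    return z / (1.0 + strength)


# Unpenalized coefficient estimates for four made-up predictors.
unpenalized = [1.2, -0.3, 0.05, -2.0]

# L1 sets small coefficients exactly to zero (a sparse solution).
l1_solution = [soft_threshold(z, 0.5) for z in unpenalized]

# L2 shrinks every coefficient but leaves nonzero ones nonzero.
l2_solution = [ridge_shrink(z, 0.5) for z in unpenalized]
```

In this toy setting the L1 solution keeps only the first and last predictors, while the L2 solution keeps all four at reduced magnitude. With `l1_ratio=1.0` the assumed penalty is pure L1, and with `l1_ratio=0.0` it is pure L2.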
This repo also includes tools for benchmarking GLM implementations in the glum_benchmarks module. For details on the benchmarking, see here. Although the performance of glum relative to glmnet and h2o depends on the specific problem, we find that when N >> K (there are many more observations than predictors), it is consistently much faster for a wide range of problems.
For more information on glum, including tutorials and API reference, please see the documentation.
Why did we choose the name glum? We wanted a name that had the letters GLM and wasn't easily confused with any existing implementation. And we thought glum sounded like a funny name (and not glum at all!). If you need a more professional-sounding name, feel free to pronounce it as G-L-um. Or maybe it stands for "Generalized linear... ummm... modeling?"
A classic example predicting housing prices
>>> from sklearn.datasets import fetch_openml
>>> from glum import GeneralizedLinearRegressor
>>>
>>> # This dataset contains house sale prices for King County, which includes
>>> # Seattle. It includes homes sold between May 2014 and May 2015.
>>> house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
>>>
>>> # Use only select features
>>> X = house_data.data[
...     [
...         "bedrooms",
...         "bathrooms",
...         "sqft_living",
...         "floors",
...         "waterfront",
...         "view",
...         "condition",
...         "grade",
...         "yr_built",
...         "yr_renovated",
...     ]
... ].copy()
>>>
>>>
>>> # Model whether a house had an above or below median price via a Binomial
>>> # distribution. We'll be doing L1-regularized logistic regression.
>>> # y is 1 for houses priced below the median and 0 otherwise.
>>> price = house_data.target
>>> y = (price < price.median()).values.astype(int)
>>> model = GeneralizedLinearRegressor(
...     family='binomial',
...     l1_ratio=1.0,
...     alpha=0.001
... )
>>>
>>> _ = model.fit(X=X, y=y)
>>>
>>> # get_formatted_diagnostics returns details about the steps taken by the iterative solver
>>> diags = model.get_formatted_diagnostics(full_report=True)
>>> diags[['objective_fct']]
        objective_fct
n_iter
0            0.693091
1            0.489500
2            0.449585
3            0.443681
4            0.443498
5            0.443497
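One way to read the diagnostics above: the first value, 0.693091, is close to ln 2 ≈ 0.693147, which is the mean log loss of a model that predicts a probability of 0.5 for every observation. The sketch below checks that baseline in plain Python. It assumes that objective_fct behaves like a mean log loss plus the penalty term, which this README does not state. `mean_log_loss` is a hypothetical helper and the labels are made up.

```python
import math


def mean_log_loss(y, p):
    """Average negative Bernoulli log-likelihood of labels y under probabilities p."""
    total = 0.0
    for yi, pi in zip(y, p):
        total -= yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return total / len(y)


# Made-up balanced labels, as in a split at the median.
y_toy = [0, 1, 0, 1, 1, 0]

# Predicting 0.5 everywhere gives ln 2 regardless of the labels.
baseline = mean_log_loss(y_toy, [0.5] * len(y_toy))

# Confident, correct predictions give a smaller loss.
improved = mean_log_loss(y_toy, [0.1, 0.9, 0.2, 0.8, 0.7, 0.3])
```

Under that assumption, the drop from about 0.693 to about 0.443 over five iterations is the improvement of the fitted model over an uninformative constant prediction.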
Installation
Please install the package through conda-forge:
conda install glum -c conda-forge