High performance Python GLMs with all the features!
glum
Generalized linear models (GLMs) are a core statistical tool that includes many common methods, such as least-squares regression, Poisson regression, and logistic regression, as special cases. At QuantCo, we have used GLMs in e-commerce pricing, insurance claims prediction, and more. We have developed glum, a fast Python-first GLM library. The development was based on a fork of scikit-learn, so it has a scikit-learn-like API. We are thankful for the starting point provided by Christian Lorentzen in that PR!
The goal of glum is to be at least as feature-complete as existing GLM libraries like glmnet or h2o. It supports:
- Built-in cross validation for optimal regularization, efficiently exploiting a “regularization path”
- L1 regularization, which produces sparse and easily interpretable solutions
- L2 regularization, including variable matrix-valued (Tikhonov) penalties, which are useful in modeling correlated effects
- Elastic net regularization
- Normal, Poisson, logistic, gamma, and Tweedie distributions, plus varied and customizable link functions
- Box constraints, linear inequality constraints, sample weights, offsets
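The regularization options above can be illustrated with a small NumPy sketch. It assumes the scikit-learn-style parameterization in which `alpha` scales the whole penalty and `l1_ratio` mixes the L1 and L2 terms; the helper names `elastic_net_penalty` and `soft_threshold` are hypothetical and not part of glum. The second helper is the standard proximal step for an L1 penalty, which shows why L1 regularization yields exact zeros.

```python
import numpy as np


def elastic_net_penalty(coef, alpha, l1_ratio):
    """Return alpha * (l1_ratio * ||w||_1 + 0.5 * (1 - l1_ratio) * ||w||_2^2).

    Assumed convention (scikit-learn style); illustrative helper, not glum's API.
    """
    coef = np.asarray(coef, dtype=float)
    l1_term = np.abs(coef).sum()
    l2_term = 0.5 * np.dot(coef, coef)
    return float(alpha * (l1_ratio * l1_term + (1.0 - l1_ratio) * l2_term))


def soft_threshold(coef, threshold):
    """Proximal operator of the L1 norm.

    Every coefficient is shrunk toward zero by `threshold`; coefficients
    whose magnitude is below `threshold` become exactly zero.
    """
    coef = np.asarray(coef, dtype=float)
    return np.sign(coef) * np.maximum(np.abs(coef) - threshold, 0.0)


w = np.array([2.0, -0.5, 0.05])

lasso = elastic_net_penalty(w, alpha=0.1, l1_ratio=1.0)  # pure L1 penalty
ridge = elastic_net_penalty(w, alpha=0.1, l1_ratio=0.0)  # pure L2 penalty
mixed = elastic_net_penalty(w, alpha=0.1, l1_ratio=0.5)  # elastic net

# The smallest coefficient is set exactly to zero: a sparse solution.
shrunk = soft_threshold(w, threshold=0.1)
print(lasso, ridge, mixed, shrunk)
```

With `l1_ratio=1.0` the penalty is a pure lasso term, which matches the setting used in the housing example in this README.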
This repo also includes tools for benchmarking GLM implementations in the glum_benchmarks module. For details on the benchmarking, see here. Although the performance of glum relative to glmnet and h2o depends on the specific problem, we find that when N >> K (there are many more observations than predictors), it is consistently much faster for a wide range of problems.
For more information on glum, including tutorials and API reference, please see the documentation.
Why did we choose the name glum? We wanted a name that had the letters GLM and wasn't easily confused with any existing implementation. And we thought glum sounded like a funny name (and not glum at all!). If you need a more professional sounding name, feel free to pronounce it as G-L-um. Or maybe it stands for "Generalized linear... ummm... modeling?"
A classic example predicting housing prices
>>> from sklearn.datasets import fetch_openml
>>> from glum import GeneralizedLinearRegressor
>>>
>>> # This dataset contains house sale prices for King County, which includes
>>> # Seattle. It includes homes sold between May 2014 and May 2015.
>>> house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
>>>
>>> # Use only select features
>>> X = house_data.data[
...     [
...         "bedrooms",
...         "bathrooms",
...         "sqft_living",
...         "floors",
...         "waterfront",
...         "view",
...         "condition",
...         "grade",
...         "yr_built",
...         "yr_renovated",
...     ]
... ].copy()
>>>
>>>
>>> # Model whether a house had an above or below median price via a Binomial
>>> # distribution. We'll be doing L1-regularized logistic regression.
>>> price = house_data.target
>>> y = (price < price.median()).values.astype(int)
>>> model = GeneralizedLinearRegressor(
...     family='binomial',
...     l1_ratio=1.0,
...     alpha=0.001
... )
>>>
>>> _ = model.fit(X=X, y=y)
>>>
>>> # get_formatted_diagnostics returns details about the steps taken by the iterative solver
>>> diags = model.get_formatted_diagnostics(full_report=True)
>>> diags[['objective_fct']]
        objective_fct
n_iter
0            0.693091
1            0.489500
2            0.449585
3            0.443681
4            0.443498
5            0.443497
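The objective at iteration 0 is very close to ln 2 ≈ 0.693147. One plausible reading, offered here as an assumption and not something this page confirms, is that the solver starts from an intercept-only model and that the objective is dominated by the mean binomial negative log-likelihood. For a target split almost evenly around the median, that quantity is almost exactly ln 2. The sketch below checks this with NumPy on synthetic balanced labels; `mean_binomial_nll` is a hypothetical helper, not a glum function.

```python
import numpy as np


def mean_binomial_nll(y, mu):
    """Mean negative log-likelihood of 0/1 labels y under predicted probabilities mu."""
    y = np.asarray(y, dtype=float)
    # Clip to avoid log(0) for degenerate predictions.
    mu = np.clip(np.asarray(mu, dtype=float), 1e-12, 1.0 - 1e-12)
    return float(-np.mean(y * np.log(mu) + (1.0 - y) * np.log(1.0 - mu)))


# Synthetic, perfectly balanced labels (an assumption standing in for
# "price below the median", which is only approximately balanced).
y = np.array([0, 1] * 500)

# Intercept-only model: every observation gets the sample mean as its probability.
null_mu = np.full(y.shape, y.mean())
null_objective = mean_binomial_nll(y, null_mu)

print(null_objective, np.log(2.0))
```

Under this reading, the drop from about 0.693 to about 0.443 over five iterations measures how much the selected features improve on the intercept-only fit, net of the small L1 penalty.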
Installation
Please install the package through conda-forge:
conda install glum -c conda-forge