High performance Python GLMs with all the features!
glum
Generalized linear models (GLMs) are a core statistical tool that includes many common methods, such as least-squares regression, Poisson regression and logistic regression, as special cases. At QuantCo, we have used GLMs in e-commerce pricing, insurance claims prediction and more. We have developed glum, a fast Python-first GLM library. The development was based on a fork of scikit-learn, so it has a scikit-learn-like API. We are thankful for the starting point provided by Christian Lorentzen in that PR!
The goal of glum is to be at least as feature-complete as existing GLM libraries like glmnet or h2o. It supports:
- Built-in cross validation for optimal regularization, efficiently exploiting a “regularization path”
- L1 regularization, which produces sparse and easily interpretable solutions
- L2 regularization, including variable matrix-valued (Tikhonov) penalties, which are useful in modeling correlated effects
- Elastic net regularization
- Normal, Poisson, logistic, gamma, and Tweedie distributions, plus varied and customizable link functions
- Box constraints, linear inequality constraints, sample weights, offsets
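The regularization options listed above can be read as a single penalty term added to the model's loss. The sketch below uses the common elastic net parameterization, in which `alpha` scales the overall penalty and `l1_ratio` mixes the L1 and L2 parts. The function name `elastic_net_penalty` and the coefficient values are hypothetical; this illustrates the idea and is not glum's internal code.

```python
def elastic_net_penalty(coefs, alpha, l1_ratio):
    """Return alpha * (l1_ratio * ||w||_1 + 0.5 * (1 - l1_ratio) * ||w||_2^2)."""
    l1_term = sum(abs(w) for w in coefs)   # sum of absolute values
    l2_term = sum(w * w for w in coefs)    # sum of squares
    return alpha * (l1_ratio * l1_term + 0.5 * (1.0 - l1_ratio) * l2_term)


# Illustrative coefficients (made up for this sketch).
coefs = [0.5, -2.0, 0.0]

lasso = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=1.0)  # pure L1
ridge = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=0.0)  # pure L2
mixed = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=0.5)  # elastic net
print(lasso, ridge, mixed)
```

Under this convention, `l1_ratio=1.0` gives a pure L1 (lasso) penalty and `l1_ratio=0.0` a pure L2 (ridge) penalty; values in between mix the two.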
This repo also includes tools for benchmarking GLM implementations in the glum_benchmarks module. For details on the benchmarking, see here. Although the performance of glum relative to glmnet and h2o depends on the specific problem, we find that when N >> K (there are many more observations than predictors), glum is consistently much faster for a wide range of problems.
For more information on glum, including tutorials and API reference, please see the documentation.
Why did we choose the name glum? We wanted a name that had the letters GLM and wasn't easily confused with any existing implementation. And we thought glum sounded like a funny name (and not glum at all!). If you need a more professional-sounding name, feel free to pronounce it as G-L-um. Or maybe it stands for "Generalized linear... ummm... modeling?"
A classic example predicting housing prices
>>> from sklearn.datasets import fetch_openml
>>> from glum import GeneralizedLinearRegressor
>>>
>>> # This dataset contains house sale prices for King County, which includes
>>> # Seattle. It includes homes sold between May 2014 and May 2015.
>>> house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
>>>
>>> # Use only select features
>>> X = house_data.data[
...     [
...         "bedrooms",
...         "bathrooms",
...         "sqft_living",
...         "floors",
...         "waterfront",
...         "view",
...         "condition",
...         "grade",
...         "yr_built",
...         "yr_renovated",
...     ]
... ].copy()
>>>
>>>
>>> # Model whether a house had an above or below median price via a Binomial
>>> # distribution. We'll be doing L1-regularized logistic regression.
>>> price = house_data.target
>>> y = (price < price.median()).values.astype(int)
>>> model = GeneralizedLinearRegressor(
...     family='binomial',
...     l1_ratio=1.0,
...     alpha=0.001
... )
>>>
>>> _ = model.fit(X=X, y=y)
>>>
>>> # get_formatted_diagnostics returns details about the steps taken by the iterative solver
>>> diags = model.get_formatted_diagnostics(full_report=True)
>>> diags[['objective_fct']]
        objective_fct
n_iter
0            0.693091
1            0.489500
2            0.449585
3            0.443681
4            0.443498
5            0.443497
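The first objective value in the table is close to log(2) ≈ 0.6931. Assuming the reported objective is the average binomial negative log-likelihood plus the penalty term (an assumption about the diagnostics, not something stated above), this is roughly what a model predicting a probability of 0.5 for every house would score. That is plausible as a starting point because the target splits the data at the median. A minimal sketch, with a hypothetical helper `mean_log_loss` and made-up labels:

```python
import math


def mean_log_loss(y_true, p_pred):
    """Average binomial negative log-likelihood for 0/1 labels."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


# Made-up labels, half zeros and half ones (like a split at the median).
y = [0, 1, 0, 1]

# Predicting 0.5 everywhere scores exactly log(2), whatever the labels are.
baseline = mean_log_loss(y, [0.5, 0.5, 0.5, 0.5])

# Predictions that lean toward the correct label score lower (better).
improved = mean_log_loss(y, [0.1, 0.9, 0.2, 0.8])

print(baseline, improved)
```

Each later row in the table is lower than the one before it, as expected from a solver that decreases the objective at every iteration.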
Installation
Please install the package through conda-forge:
conda install glum -c conda-forge