High performance Python GLMs with all the features!
glum
Generalized linear models (GLMs) are a core statistical tool that includes many common methods, such as least-squares regression, Poisson regression and logistic regression, as special cases. At QuantCo, we have used GLMs in e-commerce pricing, insurance claims prediction and more. We have developed glum, a fast Python-first GLM library. Development started from a fork of scikit-learn, so glum has a scikit-learn-like API. We are thankful for the starting point provided by Christian Lorentzen in that PR!
The goal of glum is to be at least as feature-complete as existing GLM libraries like glmnet or h2o. It supports:
- Built-in cross validation for optimal regularization, efficiently exploiting a “regularization path”
- L1 regularization, which produces sparse and easily interpretable solutions
- L2 regularization, including variable matrix-valued (Tikhonov) penalties, which are useful in modeling correlated effects
- Elastic net regularization
- Normal, Poisson, logistic, gamma, and Tweedie distributions, plus varied and customizable link functions
- Box constraints, linear inequality constraints, sample weights, offsets
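To make the regularization terms in the list above concrete, here is a minimal sketch in plain Python of an elastic net penalty and of the soft-thresholding step that explains why L1 penalties give sparse solutions. It assumes the parameterization familiar from scikit-learn, `alpha * (l1_ratio * ||w||_1 + 0.5 * (1 - l1_ratio) * ||w||_2^2)`. The function names `elastic_net_penalty` and `soft_threshold` are hypothetical and are not part of glum's API.

```python
def elastic_net_penalty(coefs, alpha, l1_ratio):
    """Illustrative elastic net penalty (assumed parameterization, not glum's code)."""
    l1 = sum(abs(w) for w in coefs)   # L1 norm of the coefficients
    l2 = sum(w * w for w in coefs)    # squared L2 norm of the coefficients
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


coefs = [0.8, -0.05, 0.0, 1.5]

# l1_ratio=1.0 is a pure L1 penalty, l1_ratio=0.0 a pure L2 penalty,
# and values in between mix the two (elastic net).
pure_l1 = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=1.0)
pure_l2 = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=0.0)
mixed = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=0.5)

# Soft-thresholding sets the small coefficient -0.05 to exactly zero.
shrunk = [soft_threshold(w, 0.1) for w in coefs]
```

With `l1_ratio=0.5` the penalty is the average of the pure L1 and pure L2 penalties, which is one way to read the "elastic net" entry in the list.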
This repo also includes tools for benchmarking GLM implementations in the glum_benchmarks module. For details on the benchmarking, see here. Although the performance of glum relative to glmnet and h2o depends on the specific problem, we find that when N >> K (there are many more observations than predictors), glum is consistently much faster across a wide range of problems.
For more information on glum, including tutorials and API reference, please see the documentation.
Why did we choose the name glum? We wanted a name that had the letters GLM and wasn't easily confused with any existing implementation. And we thought glum sounded like a funny name (and not glum at all!). If you need a more professional-sounding name, feel free to pronounce it as G-L-um. Or maybe it stands for "Generalized linear... ummm... modeling?"
A classic example predicting housing prices
>>> from sklearn.datasets import fetch_openml
>>> from glum import GeneralizedLinearRegressor
>>>
>>> # This dataset contains house sale prices for King County, which includes
>>> # Seattle. It includes homes sold between May 2014 and May 2015.
>>> house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
>>>
>>> # Use only select features
>>> X = house_data.data[
...     [
...         "bedrooms",
...         "bathrooms",
...         "sqft_living",
...         "floors",
...         "waterfront",
...         "view",
...         "condition",
...         "grade",
...         "yr_built",
...         "yr_renovated",
...     ]
... ].copy()
>>>
>>>
>>> # Model whether a house had an above or below median price via a Binomial
>>> # distribution. We'll be doing L1-regularized logistic regression.
>>> price = house_data.target
>>> y = (price < price.median()).values.astype(int)
>>> model = GeneralizedLinearRegressor(
...     family='binomial',
...     l1_ratio=1.0,
...     alpha=0.001
... )
>>>
>>> _ = model.fit(X=X, y=y)
>>>
>>> # get_formatted_diagnostics returns details about the steps taken by the iterative solver
>>> diags = model.get_formatted_diagnostics(full_report=True)
>>> diags[['objective_fct']]
        objective_fct
n_iter
0            0.693091
1            0.489500
2            0.449585
3            0.443681
4            0.443498
5            0.443497
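As a rough sanity check on the first row of the diagnostics output above, the sketch below computes the average binomial log loss of a model that predicts the same probability for every observation. It assumes that the reported objective starts out close to the average log loss of such an uninformative model, which this README does not state. `constant_log_loss` is a hypothetical helper, not a glum function, and the toy target stands in for the real data.

```python
import math


def constant_log_loss(y, p):
    """Average binomial log loss when every observation is assigned probability p."""
    total = sum(yi * math.log(p) + (1 - yi) * math.log(1 - p) for yi in y)
    return -total / len(y)


# A balanced toy target: half ones and half zeros, similar to a split at the median.
y = [1, 0] * 50

# Log loss of always predicting 0.5 on a balanced target.
baseline = constant_log_loss(y, 0.5)
```

For a balanced target this baseline equals log(2), roughly 0.6931, which is in the same range as the objective at iteration 0. The later rows then show the objective falling as the solver iterates.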
Installation
Please install the package through conda-forge:
conda install glum -c conda-forge