A collection of algorithms for fitting the baseline of experimental data.

pybaselines is a Python library that collects algorithms for fitting the baseline of experimental data.

Introduction

pybaselines provides many different algorithms for fitting baselines to data from experimental techniques such as Raman, FTIR, NMR, XRD, PIXE, etc. The aim of the project is to provide a semi-unified API to allow quickly testing and comparing multiple baseline algorithms to find the best one for a set of data.

pybaselines has 25+ baseline algorithms, grouped by technique as follows (note: when a method is labelled as ‘improved’, that is the method’s name, not editorialization):

  1. Polynomial (pybaselines.polynomial)

    1. poly (Regular Polynomial)

    2. modpoly (Modified Polynomial)

    3. imodpoly (Improved Modified Polynomial)

    4. penalized_poly (Penalized Polynomial)

    5. loess (Locally Estimated Scatterplot Smoothing)

  2. Whittaker-smoothing-based techniques (pybaselines.whittaker)

    1. asls (Asymmetric Least Squares)

    2. iasls (Improved Asymmetric Least Squares)

    3. airpls (Adaptive Iteratively Reweighted Penalized Least Squares)

    4. arpls (Asymmetrically Reweighted Penalized Least Squares)

    5. drpls (Doubly Reweighted Penalized Least Squares)

    6. iarpls (Improved Asymmetrically Reweighted Penalized Least Squares)

    7. aspls (Adaptive Smoothness Penalized Least Squares)

    8. psalsa (Peaked Signal’s Asymmetric Least Squares Algorithm)

  3. Morphological (pybaselines.morphological)

    1. mpls (Morphological Penalized Least Squares)

    2. mor (Morphological)

    3. imor (Improved Morphological)

    4. mormol (Morphological and Mollified Baseline)

    5. amormol (Averaging Morphological and Mollified Baseline)

    6. rolling_ball (Rolling Ball Baseline)

  4. Window-based (pybaselines.window)

    1. noise_median (Noise Median method)

    2. snip (Statistics-sensitive Non-linear Iterative Peak-clipping)

    3. swima (Small-Window Moving Average)

  5. Optimizers (pybaselines.optimizers)

    1. collab_pls (Collaborative Penalized Least Squares)

    2. optimize_extended_range

    3. adaptive_minmax (Adaptive MinMax)

  6. Manual methods (pybaselines.manual)

    1. linear_interp (Linear interpolation between points)
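
To give a sense of what an algorithm in the polynomial group does, the following is a minimal sketch of the idea commonly described for a modified polynomial fit: fit a polynomial, clip the data down onto that fit, and repeat until the fit stops changing. It is an illustration written with numpy only, not pybaselines's implementation; the function name modpoly_sketch, its parameters, and the toy data are assumptions made for this example.

```python
import numpy as np


def modpoly_sketch(y, x, poly_order=2, tol=1e-3, max_iter=250):
    """Illustrative modified-polynomial baseline; not pybaselines's code."""
    x = np.asarray(x, dtype=float)
    y_work = np.asarray(y, dtype=float).copy()
    # Map x onto [-1, 1] so the polynomial fit stays well conditioned.
    x_scaled = 2 * (x - x.min()) / (x.max() - x.min()) - 1

    def fit(values):
        return np.polyval(np.polyfit(x_scaled, values, poly_order), x_scaled)

    baseline = fit(y_work)
    for _ in range(max_iter):
        # Clip every point lying above the current fit down onto it, so that
        # peaks pull less on the next fit.
        y_work = np.minimum(y_work, baseline)
        new_baseline = fit(y_work)
        change = np.linalg.norm(new_baseline - baseline)
        baseline = new_baseline
        # Stop once the fit changes very little relative to its size.
        if change < tol * np.linalg.norm(baseline):
            break
    return baseline


def rmse(a, b):
    """Root-mean-square difference between two arrays."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Noise-free toy data: a linear baseline plus two Gaussian peaks.
x = np.linspace(0, 100, 500)
true_baseline = 5 + 0.02 * x
peaks = (
    10 * np.exp(-0.5 * ((x - 30) / 2) ** 2)
    + 8 * np.exp(-0.5 * ((x - 70) / 3) ** 2)
)
y = true_baseline + peaks

plain_fit = np.polyval(np.polyfit(x, y, 1), x)
modpoly_fit = modpoly_sketch(y, x, poly_order=1)

print('plain polynomial RMSE:', rmse(plain_fit, true_baseline))
print('modified polynomial RMSE:', rmse(modpoly_fit, true_baseline))
```

On this toy data the clipped fit should land closer to the true baseline than the plain fit, because the peaks no longer drag the polynomial upward. For real data, prefer the tested functions in pybaselines.polynomial.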

Installation

Dependencies

pybaselines requires Python version 3.6 or later. All of the required libraries should be installed automatically when installing pybaselines using either of the two installation methods below.

Stable Release

pybaselines can be installed with pip by running the following command in your terminal:

pip install --upgrade pybaselines

This is the preferred method to install pybaselines, as it will always install the most recent stable release.

Development Version

The sources for pybaselines can be downloaded from the GitHub repo.

The public repository can be cloned using:

git clone https://github.com/derb12/pybaselines.git

Once the repository is downloaded, it can be installed with:

cd pybaselines
python setup.py install

Quick Start

To use the various functions in pybaselines, pass in the measured data and any required parameters. Every baseline function in pybaselines returns two items: a numpy array of the calculated baseline and a dictionary of parameters that can be helpful for reusing the function.

For more details on each baseline algorithm, refer to the algorithms section of pybaselines’s documentation.

A simple example is shown below.

import pybaselines
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(100, 4200, 1000)
# a measured signal containing several Gaussian peaks
signal = (
    pybaselines.utils.gaussian(x, 2, 700, 50)
    + pybaselines.utils.gaussian(x, 3, 1200, 150)
    + pybaselines.utils.gaussian(x, 5, 1600, 100)
    + pybaselines.utils.gaussian(x, 4, 2500, 50)
    + pybaselines.utils.gaussian(x, 7, 3300, 100)
)
# baseline is a polynomial plus a broad gaussian
true_baseline = (
    10 + 0.001 * x
    + pybaselines.utils.gaussian(x, 6, 2000, 2000)
)
noise = np.random.default_rng(1).normal(0, 0.2, x.size)

y = signal + true_baseline + noise

bkg_1 = pybaselines.polynomial.modpoly(y, x, poly_order=3)[0]
bkg_2 = pybaselines.whittaker.asls(y, lam=1e7, p=0.01)[0]
bkg_3 = pybaselines.morphological.imor(y, half_window=25)[0]
bkg_4 = pybaselines.window.snip(
    y, max_half_window=40, decreasing=True, smooth_half_window=1
)[0]

plt.plot(x, y, label='raw data', lw=1.5)
plt.plot(x, true_baseline, lw=3, label='true baseline')
plt.plot(x, bkg_1, '--', label='modpoly')
plt.plot(x, bkg_2, '--', label='asls')
plt.plot(x, bkg_3, '--', label='imor')
plt.plot(x, bkg_4, '--', label='snip')

plt.legend()
plt.show()

The above code plots the raw data and the true baseline together with the modpoly, asls, imor, and snip baselines.

[Figure: various baselines]
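
Because the example data is synthetic, the true baseline is known, so the fits can also be compared numerically rather than only by eye. The following is a minimal sketch of one way to do that with a root-mean-square error; the helper rank_by_rmse is hypothetical and not part of pybaselines, and the candidate arrays are simple stand-ins so that the snippet runs with numpy alone.

```python
import numpy as np


def rank_by_rmse(true_baseline, candidates):
    """Return (name, RMSE) pairs sorted from best to worst fit."""
    true_baseline = np.asarray(true_baseline, dtype=float)
    scores = {
        name: float(np.sqrt(np.mean((np.asarray(fit) - true_baseline) ** 2)))
        for name, fit in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


x = np.linspace(100, 4200, 1000)
true_baseline = 10 + 0.001 * x

# Stand-in "fits" (assumed for illustration): the true baseline shifted by a
# constant, so the expected ranking is easy to check.
candidates = {
    'offset by 0.5': true_baseline + 0.5,
    'offset by 0.1': true_baseline + 0.1,
}

ranking = rank_by_rmse(true_baseline, candidates)
for name, score in ranking:
    print(f'{name}: RMSE = {score:.3f}')
```

With the example above, the candidates could instead be {'modpoly': bkg_1, 'asls': bkg_2, 'imor': bkg_3, 'snip': bkg_4}. This kind of scoring only works when the true baseline is known; for measured data, visual inspection remains the usual check.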

Contributing

Contributions are welcomed and greatly appreciated. For information on submitting bug reports, pull requests, or general feedback, please refer to the contributing guide.

Changelog

Refer to the changelog for information on pybaselines’s changes.

License

pybaselines is open source and freely available under the BSD 3-clause license. For more information, refer to the license.
