
Temporian is a library to pre-process temporal signals before their use as input features with off-the-shelf tabular machine learning libraries (e.g., TensorFlow Decision Forests).

Project description

Temporian is a Python package for feature engineering of temporal data, focusing on preventing common modeling errors and providing a simple and powerful API, a first-class iterative development experience, and efficient and well-tested implementations of common and not-so-common temporal data preprocessing functions.

Why Temporian?

Temporian helps you focus on high-level modeling.

Temporal data processing is commonly done with generic data processing tools. However, this approach is often tedious, error-prone and inefficient, and requires engineers to learn and re-implement existing methods. Additionally, the complexity of these tools can lead engineers to create less effective pipelines in order to reduce complexity. This can increase the cost of developing and maintaining performant ML pipelines.

To see the benefit of Temporian over general data processing libraries, compare the original Feature engineering section of our Khipu 2023 Forecasting Tutorial, which uses pandas to preprocess the M5 sales dataset, to the updated version using Temporian.

Installation

Temporian is available on PyPI. To install it, run:

pip install temporian

Minimal end-to-end example

import temporian as tp

# Load data.
evset = tp.read_event_set("path/to/temporal_data.csv", timestamp_column="time")
node = evset.node()

# Apply operators to create a processing graph.
sma = tp.simple_moving_average(node, window_length=tp.days(7))

# Run the graph on the input data.
result = sma.evaluate(evset)
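
To make clear what a 7-day simple moving average over timestamped events computes, here is a minimal sketch in plain Python. It is not Temporian's implementation. The function name `trailing_moving_average` is hypothetical, and the sketch assumes that timestamps are sorted and that the window for an event at time t covers the interval (t - window, t], so only current and past events contribute.

```python
from datetime import datetime, timedelta


def trailing_moving_average(timestamps, values, window):
    """For each event, average the values of events in (t - window, t].

    Assumes `timestamps` is sorted in increasing order.
    """
    results = []
    start = 0  # Index of the oldest event still inside the window.
    running_sum = 0.0
    for i, (t, v) in enumerate(zip(timestamps, values)):
        running_sum += v
        # Drop events that are no longer inside (t - window, t].
        while timestamps[start] <= t - window:
            running_sum -= values[start]
            start += 1
        results.append(running_sum / (i - start + 1))
    return results


day = timedelta(days=1)
t0 = datetime(2023, 1, 1)
# Irregularly sampled events: Jan 1, Jan 2, Jan 4 and Jan 10.
timestamps = [t0, t0 + 1 * day, t0 + 3 * day, t0 + 9 * day]
values = [10.0, 20.0, 30.0, 40.0]

sma = trailing_moving_average(timestamps, values, window=7 * day)
print(sma)  # [10.0, 15.0, 20.0, 35.0]
```

The last output averages only the Jan 4 and Jan 10 events, because the two earlier events are more than 7 days old by then. The window is defined in time units rather than as a fixed number of rows, which matters when events are unevenly spaced.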

Key features

These are what set Temporian apart.

  • Simple and powerful API: Temporian exposes high-level operations that keep complex processing programs short and easy to read.
  • Prevents modeling errors: Temporian programs are guaranteed not to have future leakage unless the user calls the leak function, ensuring that models are not trained on future data.
  • Iterative development: Temporian can be used to develop preprocessing pipelines in Colab or local notebooks, allowing users to visualize results each step of the way to identify and correct errors early on.
  • Efficient and well-tested implementations: Temporian contains efficient and well-tested implementations of a variety of temporal data processing functions. For instance, our implementation of window operators is 2000x faster than the same function implemented with NumPy.
  • Wide range of preprocessing functions: Temporian contains a wide range of preprocessing functions, including moving window operations, lagging, calendar features, arithmetic operations, index manipulation and propagation, resampling, and more. For a full list of the available operators, see the operators documentation.
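
The future-leakage point in the list above can be illustrated with a small sketch in plain Python. It does not use Temporian, and the helper names `shift_events` and `value_known_at` are hypothetical. The idea is that moving values forward in time (a lag) is safe, while moving them backward makes data from the future visible at an earlier time.

```python
def shift_events(timestamps, values, delta):
    """Move each value to timestamp t + delta.

    delta > 0 is a lag: a past value becomes available later.
    delta < 0 moves values back in time, exposing future data.
    """
    return [t + delta for t in timestamps], list(values)


def value_known_at(timestamps, values, query_time):
    """Return the most recent value with timestamp <= query_time, or None."""
    known = None
    for t, v in zip(timestamps, values):
        if t <= query_time:
            known = v
    return known


# Daily sales observed on days 1 to 4 (made-up numbers for illustration).
timestamps = [1, 2, 3, 4]
sales = [100, 110, 95, 130]

lag_t, lag_v = shift_events(timestamps, sales, delta=1)
leak_t, leak_v = shift_events(timestamps, sales, delta=-1)

# Feature value a model would see on day 3.
lagged_at_3 = value_known_at(lag_t, lag_v, query_time=3)    # Day 2 sales.
leaked_at_3 = value_known_at(leak_t, leak_v, query_time=3)  # Day 4 sales.

print(lagged_at_3, leaked_at_3)  # 110 130
```

On day 3, the lagged feature only reveals day 2 sales, while the backward-shifted feature reveals day 4 sales, which would not be available when the model is used for prediction.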

Getting started

Check out 3 minutes to Temporian for a quick introduction to how Temporian works.

Documentation

The official documentation is available at temporian.readthedocs.io.

Contributing

Contributions to Temporian are welcome! Check out the contributing guide to get started.

Credits

This project is a collaboration between Google and Tryolabs.


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

temporian-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (403.0 kB)

Uploaded CPython 3.11 manylinux: glibc 2.17+ x86-64

temporian-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (402.7 kB)

Uploaded CPython 3.10 manylinux: glibc 2.17+ x86-64

temporian-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (403.8 kB)

Uploaded CPython 3.9 manylinux: glibc 2.17+ x86-64
