Skip to main content

A universal Python client for InfluxDB

Project description

InfluxBC

[InfluxBC as in Influx Buzz Compensator]

A universal Python client for InfluxDB.

This project aims to provide a higher level API and compatibility layer to InfluxData's official Python V1, V2 (and eventually V3) client libraries.

The package provides two (eventually three, see above) client classes (InfluxDBV1Client and InfluxDBV2Client) with a common API to the respective InfluxDB version. With these clients one should be easily writing Python code to interact with one InfluxDB V1 instance over InfluxQL and than update and switch to V2 without needing to implement a whole different client API. See Rationale below for more details.

Contents

[[TOC]]

Features

  • Schema management
    • List databases/buckets, measurements, fields, tag keys and values
    • Delete data
  • Data access
    • Read data as pandas.DataFrames
    • Retrieve first or last points for a specific field
    • Poll for incoming data on a specific field
  • Data ingestion
    • Write pandas.DataFrames

Prerequisites

  • Python 3.8 or later
  • A running InfluxDB V1 or V2 instance

Installation

Install the influxbc library from the PyPI in your virtual environment:

python3 -m pip install influxbc

Usage

Import and Initialize the Client

from influxbc import InfluxDBV1Client, InfluxDBV2Client

# InfluxDB V1 uses basic authorization
USER = "your_influxdb_user"
PASSWORD = "your_influxdb_password"
DATABASE = "your_database"
v1_client = InfluxDBV1Client(
  host="localhost",
  port=8086,
  username=USER,
  password=PASSWORD,
  database=DATABASE,
)

# InfluxDB V2 uses token authorization
TOKEN = "your_influxdb_token"
BUCKET = "your_bucket"
v2_client = InfluxDBV2Client(
  url="http://localhost:8086",
  token=TOKEN,
  bucket_name=BUCKET
)

After the initialization one can use the API using the same methods for v1 and v2 client (we will name them client for simplicity).

Manage Schema

List databases/buckets

client.get_databases()
# ['your_database']

List measurements

client.get_measurements()
# ['cpu', 'mem', 'disk']

List fields keys

client.get_fields(measurement="cpu")
# ['usage_guest', 'usage_nice', 'usage_user', 'usage_system', ]

Omitting measurement will give all field keys in the database/bucket.

List tag keys

client.get_tag_keys(measurement="cpu")
# ['location', 'host']

Omitting measurement will give all tag keys in the database/bucket.

List tag values

client.get_tag_values(measurement="cpu", tag_key="host")
# ['node1.us.example.com', 'node2.us.example.com']

Omitting measurement will give all tag values for the key in the database/bucket.

Delete points from a database

One can delete data from a database/bucket.

client.delete(measurement="my_measurement", tags={"sensor_id": "123"})

Omitting measurement will delete data from all measurements in the database/bucket. Omitting tags will delete all data from the given measurement.

However, to prevent accidental wiping of databases/buckets, attempting so will raise a DeleteError:

client.delete()
# DeleteError(...)

Access Data

Read data into a pandas DataFrame

client.read_data_frame(
    measurement="my_measurement",
    start="2023-02-22T00:00:00Z",
    stop="2023-02-23T00:00:00Z",
)
#                            bar  baz  foo
# 2022-12-21 00:00:00+00:00    3    7    1
# 2022-12-22 00:00:00+00:00    4    8    2

Get the first point in a series

client.read_first_point(
    measurement="my_measurement",
    field="foo",
    start="2023-02-22T00:00:00Z",
)
# _time
# 2022-12-21 00:00:00+00:00    1
# Name: foo, dtype: int64

Get the last point in a series

last_point = client.read_last_point(
    measurement="my_measurement",
    field="foo",
    start="2023-02-22T00:00:00Z",
)
# _time
# 2022-12-22 00:00:00+00:00    2
# Name: foo, dtype: int64

Poll for data that you expect inside a field

You can poll for data in a field while specifying the polling interval sleep time and a timeout to prevent infinite blocking.

first_point = client.poll_field(
    measurement="my_measurement",
    field="foo",
    start="2023-02-22T00:00:00Z",
    return_first=True,
    sleep_time_ms=60_000,
    timeout=datetime.timedelta(seconds=10),
)

Or you can set a validity period to accept earlier timestamps when you cannot exactly predict the date-time the data will come.

first_point = client.poll_field(
    measurement="my_measurement",
    field="temperature",
    start="2023-02-22T00:00:00Z",
    return_first=True,
    validity_period=datetime.timedelta(seconds=1),
)

Write Data

import pandas as pd
data = pd.DataFrame({
    "time": pd.to_datetime(["2023-02-23T15:00:00Z"]),
    "value": [10.5],
})
client.write_data_frame(data, measurement="my_measurement", tags={"sensor_id": "123"})

Logging

InfluxBC registers a logger with the ID influxbc with the Python Standard Library logging facility.

As a default, the logger is set to the WARNING logging level. You can set the level in your code like so:

import logging

from influxbc.logging import logger as influxbc_logger

influxbc_logger.setLevel(logging.DEBUG)

Alternatively, you can set the IBC_LOG_LEVEL environment variable to control InfluxBC's logging level.

Getting the Messages

While the official V1 client library does not seem to provide any logging interface at all, messages from InfluxBC's logger is all you can expect.

However, being a lot more complex, the official V2 client library provides several logger instances which can be used to see what's going on inside the client. To see the HTTP requests sent to the InfluxDB V2 instance for example, you can get the logger with ID influxdb_client.client.http and set it to the DEBUG level.

Concepts

  • Missing data as a result of missing resources (e.g. a missing measurement, a failing filter criterion, etc.) will not raise an error but return empty data instead. This goes along best practices for HTTP APIs, which is the interface form of InfluxDB instances.
  • A client instance is always bound and connected to a database/bucket which has to be given upon client initialization (this is considered common sense in database clients).

Rationale

The core structural concept of InfluxDB has been more or less the same acros all major InfluxDB versions. However unfortunately, implementations such as query language, client libraries as well as concept nomenclature have not.

Following is an overview of consistencies and inconsistencies among InfluxDB major version. Please note that we're completely ignoring potential benefits which come from using or upgrading to a higher major version of InfluxDB, such as say performance aspects.

V1 V2 V3
Database Database Bucket Database
Measurement Measurement Measurement Measurement
Field Field Field Field
Tag Tag Tag Tag
Series Series Series Series
Point Point Point Point
Organization - Organization Organization
Query Language(s) InfluxQL (Flux via /api/v2/query) Flux (InfluxQL via /query) InfluxQL, SQL
Write Protocol Line Protocol Line Protocol Line Protocol, (tba)
Authentication Basic, JWT Token, Basic (UI) Token
Retention Retention Policy Bucket ("DBRP mappings"^dbrp) (tba)
Inclusive Range Arbitrary (InfluxQL) Flux' range is half-open Arbitrary (InfluxQL, SQL)
Continuous Processing Continuous Queries Tasks (tba)
Frontend Chronograf^chronograf Built-in (tba)
Python Client influxdb-python^v1 (archived) influxdb-client-python^v2 influxdb3-python^v3
State of development[^sod] Maintenance Active Prototype development

[^sod]: March 2024 (https://github.com/influxdata/influxdb)

Development

To set up a development environment you need a virtual environment and Poetry, for example:

POETRY_VIRTUALENVS_IN_PROJECT=1 poetry install

Testing

Tests for influxbc are written with pytest. You can run the test suite with

pytest tests

Formatting

We use Black as our code formatter.

black --preview .

Contributing

Before requesting your contribution to be merged, please make sure that all tests check out and that your code is properly formatted.

pre-commit

Please install and use pre-commit in your development environment to simple issues up-front before committing your contribution.

Going through the "Quick start" guide will install pre-commit git hooks that run automatically when you commit.

License

This project is licensed under the terms of the MIT license.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

influxbc-0.1.3.tar.gz (18.1 kB view hashes)

Uploaded Source

Built Distribution

influxbc-0.1.3-py3-none-any.whl (19.6 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page