A universal Python client for InfluxDB

InfluxBC

[InfluxBC as in Influx Buzz Compensator]

A universal Python client for InfluxDB.

This project aims to provide a higher-level API and compatibility layer on top of InfluxData's official Python V1, V2 (and eventually V3) client libraries.

The package provides two client classes, InfluxDBV1Client and InfluxDBV2Client (a third is planned, see above), which expose a common API for their respective InfluxDB versions. With these clients you can write Python code against an InfluxDB V1 instance using InfluxQL and later upgrade to V2 without having to adopt an entirely different client API. See Rationale below for more details.

Features

Schema management
- List databases/buckets, measurements, fields, tag keys and values
- Delete data
Data access
- Read data as pandas.DataFrames
- Retrieve first or last points for a specific field
- Poll for incoming data on a specific field
Data ingestion
- Write pandas.DataFrames

Prerequisites

Python 3.8 or later
A running InfluxDB V1 or V2 instance

Installation

Install the influxbc library from PyPI in your virtual environment:

python3 -m pip install influxbc

Usage

Import and Initialize the Client

from influxbc import InfluxDBV1Client, InfluxDBV2Client

# InfluxDB V1 uses basic authorization
USER = "your_influxdb_user"
PASSWORD = "your_influxdb_password"
DATABASE = "your_database"
v1_client = InfluxDBV1Client(
  host="localhost",
  port=8086,
  username=USER,
  password=PASSWORD,
  database=DATABASE,
)

# InfluxDB V2 uses token authorization
TOKEN = "your_influxdb_token"
BUCKET = "your_bucket"
v2_client = InfluxDBV2Client(
  url="http://localhost:8086",
  token=TOKEN,
  bucket_name=BUCKET
)

After initialization, the V1 and V2 clients expose the same methods. The examples below refer to either one simply as client.
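Because both clients share the same method names, helper code can be written once and handed either client. The following is a minimal sketch of that idea, not part of InfluxBC itself: SchemaClient, describe_schema and FakeClient are hypothetical names, the list return types are assumed from the example outputs in this README, and the fake client only exists so the snippet runs without an InfluxDB instance.

```python
from typing import Dict, List, Optional, Protocol


class SchemaClient(Protocol):
    """Hypothetical structural type covering two of the shared methods."""

    def get_measurements(self) -> List[str]: ...

    def get_fields(self, measurement: Optional[str] = None) -> List[str]: ...


def describe_schema(client: SchemaClient) -> Dict[str, List[str]]:
    """Map every measurement to its field keys, whatever the client version."""
    return {m: client.get_fields(measurement=m) for m in client.get_measurements()}


class FakeClient:
    """In-memory stand-in (assumption) so the sketch runs without a server."""

    def __init__(self, schema: Dict[str, List[str]]) -> None:
        self._schema = schema

    def get_measurements(self) -> List[str]:
        return list(self._schema)

    def get_fields(self, measurement: Optional[str] = None) -> List[str]:
        if measurement is None:
            # Mirror the documented behavior: no measurement means all fields.
            return [f for fields in self._schema.values() for f in fields]
        return list(self._schema.get(measurement, []))


schema = describe_schema(FakeClient({"cpu": ["usage_user"], "mem": ["used"]}))
```

In real code, describe_schema(v1_client) and describe_schema(v2_client) would be called in the same way.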

Manage Schema

List databases/buckets

client.get_databases()
# ['your_database']

List measurements

client.get_measurements()
# ['cpu', 'mem', 'disk']

List field keys

client.get_fields(measurement="cpu")
# ['usage_guest', 'usage_nice', 'usage_user', 'usage_system']

Omitting measurement will give all field keys in the database/bucket.

List tag keys

client.get_tag_keys(measurement="cpu")
# ['location', 'host']

Omitting measurement will give all tag keys in the database/bucket.

List tag values

client.get_tag_values(measurement="cpu", tag_key="host")
# ['node1.us.example.com', 'node2.us.example.com']

Omitting measurement will give all tag values for the key in the database/bucket.

Delete points from a database

One can delete data from a database/bucket.

client.delete(measurement="my_measurement", tags={"sensor_id": "123"})

Omitting measurement will delete data from all measurements in the database/bucket. Omitting tags will delete all data from the given measurement.

However, to prevent accidentally wiping a whole database/bucket, omitting both measurement and tags raises a DeleteError:

client.delete()
# DeleteError(...)
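The guard described above can be pictured as a simple argument check. This is a sketch under assumptions, not InfluxBC's actual implementation: the DeleteError class here is a local stand-in and check_delete_arguments is a hypothetical helper.

```python
from typing import Dict, Optional


class DeleteError(Exception):
    """Local stand-in for the DeleteError named above (assumption)."""


def check_delete_arguments(
    measurement: Optional[str] = None,
    tags: Optional[Dict[str, str]] = None,
) -> None:
    """Refuse a delete that names neither a measurement nor any tags."""
    if measurement is None and not tags:
        raise DeleteError(
            "Refusing to delete everything: pass a measurement, tags, or both."
        )


check_delete_arguments(measurement="my_measurement")  # accepted
check_delete_arguments(tags={"sensor_id": "123"})  # accepted

try:
    check_delete_arguments()  # neither argument given
except DeleteError as error:
    message = str(error)
```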

Access Data

Read data into a pandas DataFrame

client.read_data_frame(
    measurement="my_measurement",
    start="2022-12-21T00:00:00Z",
    stop="2022-12-23T00:00:00Z",
)
#                            bar  baz  foo
# 2022-12-21 00:00:00+00:00    3    7    1
# 2022-12-22 00:00:00+00:00    4    8    2

Get the first point in a series

client.read_first_point(
    measurement="my_measurement",
    field="foo",
    start="2022-12-21T00:00:00Z",
)
# _time
# 2022-12-21 00:00:00+00:00    1
# Name: foo, dtype: int64

Get the last point in a series

last_point = client.read_last_point(
    measurement="my_measurement",
    field="foo",
    start="2022-12-21T00:00:00Z",
)
# _time
# 2022-12-22 00:00:00+00:00    2
# Name: foo, dtype: int64

Poll for data that you expect inside a field

You can poll for data in a field, specifying the sleep time between polls and a timeout to prevent blocking forever.

import datetime

# sleep_time_ms is the pause between polls; timeout bounds the total wait.
first_point = client.poll_field(
    measurement="my_measurement",
    field="foo",
    start="2023-02-22T00:00:00Z",
    return_first=True,
    sleep_time_ms=60_000,
    timeout=datetime.timedelta(seconds=10),
)

Alternatively, you can set a validity period to accept earlier timestamps when you cannot predict exactly when the data will arrive.

first_point = client.poll_field(
    measurement="my_measurement",
    field="temperature",
    start="2023-02-22T00:00:00Z",
    return_first=True,
    validity_period=datetime.timedelta(seconds=1),
)
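The general pattern behind polling with a sleep interval and a timeout can be sketched as follows. This illustrates the technique only and makes no claim about how poll_field works internally: poll and fetch are hypothetical names, and returning None on timeout is an assumption of this sketch.

```python
import time
from datetime import timedelta
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll(
    fetch: Callable[[], Optional[T]],
    sleep_time_ms: int = 1_000,
    timeout: Optional[timedelta] = None,
) -> Optional[T]:
    """Call fetch() until it returns a value or the timeout elapses."""
    deadline = None if timeout is None else time.monotonic() + timeout.total_seconds()
    while True:
        result = fetch()
        if result is not None:
            return result
        pause = sleep_time_ms / 1000
        if deadline is not None:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return None  # timed out without data (assumed behavior)
            # Never sleep past the deadline.
            pause = min(pause, remaining)
        time.sleep(pause)


# Simulated data source that only yields a value on the third attempt.
attempts = iter([None, None, 42])
value = poll(lambda: next(attempts), sleep_time_ms=10, timeout=timedelta(seconds=1))

# A source that never yields data runs into the timeout instead.
timed_out = poll(lambda: None, sleep_time_ms=10, timeout=timedelta(milliseconds=50))
```

Capping each sleep at the remaining time keeps the total wait close to the timeout even when the sleep interval is longer than the timeout.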

Write Data

import pandas as pd
data = pd.DataFrame({
    "time": pd.to_datetime(["2023-02-23T15:00:00Z"]),
    "value": [10.5],
})
client.write_data_frame(data, measurement="my_measurement", tags={"sensor_id": "123"})

Logging

InfluxBC registers a logger named influxbc with the logging module of the Python standard library.

By default, the logger is set to the WARNING logging level. You can set the level in your code like so:

import logging

from influxbc.logging import logger as influxbc_logger

influxbc_logger.setLevel(logging.DEBUG)

Alternatively, you can set the IBC_LOG_LEVEL environment variable to control InfluxBC's logging level.

Getting the Messages

The official V1 client library does not seem to provide any logging interface at all, so messages from InfluxBC's logger are all you can expect.

However, being a lot more complex, the official V2 client library provides several logger instances which can be used to see what's going on inside the client. To see the HTTP requests sent to the InfluxDB V2 instance for example, you can get the logger with ID influxdb_client.client.http and set it to the DEBUG level.
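As a minimal sketch of the step just described, the snippet below uses only the standard library logging module and the logger name given above. The log format is an arbitrary choice, and records will only show up once the V2 client library is installed and actually sending requests.

```python
import logging

# Send log records to stderr with a timestamped format (format is arbitrary).
logging.basicConfig(format="%(asctime)s %(name)s %(levelname)s %(message)s")

# Logger name taken from the paragraph above.
http_logger = logging.getLogger("influxdb_client.client.http")
http_logger.setLevel(logging.DEBUG)
```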

Concepts

- Missing data caused by missing resources (e.g. a missing measurement, a filter criterion that matches nothing, etc.) does not raise an error; empty data is returned instead. This follows common practice for HTTP APIs, which is how InfluxDB instances are accessed.
- A client instance is always bound and connected to one database/bucket, which has to be given upon client initialization (this is considered common sense in database clients).

Rationale

The core structural concepts of InfluxDB have stayed more or less the same across all major InfluxDB versions. Unfortunately, the implementation details, such as query languages, client libraries, and concept nomenclature, have not.

The following is an overview of consistencies and inconsistencies among InfluxDB major versions. Please note that we're completely ignoring potential benefits of using or upgrading to a higher major version of InfluxDB, such as performance.

| Concept | V1 | V2 | V3 |
|---|---|---|---|
| Database | Database | Bucket | Database |
| Measurement | Measurement | Measurement | Measurement |
| Field | Field | Field | Field |
| Tag | Tag | Tag | Tag |
| Series | Series | Series | Series |
| Point | Point | Point | Point |
| Organization | - | Organization | Organization |
| Query Language(s) | InfluxQL (Flux via `/api/v2/query`) | Flux (InfluxQL via `/query`) | InfluxQL, SQL |
| Write Protocol | Line Protocol | Line Protocol | Line Protocol, (tba) |
| Authentication | Basic, JWT | Token, Basic (UI) | Token |
| Retention | Retention Policy | Bucket ("DBRP mappings"[^dbrp]) | (tba) |
| Inclusive Range | Arbitrary (InfluxQL) | Flux' `range` is half-open | Arbitrary (InfluxQL, SQL) |
| Continuous Processing | Continuous Queries | Tasks | (tba) |
| Frontend | Chronograf[^chronograf] | Built-in | (tba) |
| Python Client | `influxdb-python`[^v1] (archived) | `influxdb-client-python`[^v2] | `influxdb3-python`[^v3] |
| State of development[^sod] | Maintenance | Active | Prototype development |

[^sod]: March 2024 (https://github.com/influxdata/influxdb)
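The "Inclusive Range" row deserves a short illustration. A half-open range, as used by Flux' `range`, includes the start timestamp but excludes the stop timestamp. The sketch below shows that rule with plain Python datetimes; in_half_open_range is a hypothetical helper and not part of InfluxBC.

```python
from datetime import datetime, timezone


def in_half_open_range(ts: datetime, start: datetime, stop: datetime) -> bool:
    """Return True if ts lies in [start, stop): start included, stop excluded."""
    return start <= ts < stop


start = datetime(2023, 2, 22, tzinfo=timezone.utc)
stop = datetime(2023, 2, 23, tzinfo=timezone.utc)

start_is_included = in_half_open_range(start, start, stop)
stop_is_included = in_half_open_range(stop, start, stop)
```

Here start_is_included is True and stop_is_included is False, so a point stamped exactly at the stop time is not part of the result.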

Development

To set up a development environment you need a virtual environment and Poetry, for example:

POETRY_VIRTUALENVS_IN_PROJECT=1 poetry install

Testing

Tests for influxbc are written with pytest. You can run the test suite with:

pytest tests

Formatting

We use Black as our code formatter.

black --preview .

Contributing

Before requesting that your contribution be merged, please make sure that all tests pass and that your code is properly formatted.

pre-commit

Please install and use pre-commit in your development environment to catch simple issues up front before committing your contribution.

Going through pre-commit's "Quick start" guide will install git hooks that run automatically when you commit.

License

This project is licensed under the terms of the MIT license.

Release history

0.1.3 (this version): Apr 26, 2024
0.1.2: Mar 8, 2024
