kafka_influxdb

A Kafka consumer for InfluxDB

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

A Kafka consumer for InfluxDB written in Python.
All messages sent to Kafka on a certain topic will be relayed to Influxdb.
Supports InfluxDB 0.9.x. For InfluxDB 0.8.x support check out the 0.3.0 tag.

Quick start

To see the tool in action, you can start a complete CollectD -> Kafka -> kafka_influxdb -> Influxdb setup with the following command:

docker-compose up

This will immediately start reading messages from Kafka and write them into InfluxDB. Open the InfluxDB Admin Interface at http://<docker_host_ip>:8083 and type SHOW MEASUREMENTS to see the output. (<docker_host_ip> is probably localhost on Linux. On Mac you can find out with boot2docker ip or docker-machine ip).

Run on your local machine

If you want to run a local instance, you can do so with the following commands:

pip install kafka_influxdb
# Create a config.yaml like the one in this repository
kafka_influxdb -c config.yaml

Benchmark

You can use the built-in benchmark tool for performance measurements:

python kafka-influxdb.py -b

By default this will write 1.000.000 sample messages into the benchmark Kafka topic. After that it will consume the messages again to measure the throughput. Sample output using the above Docker setup inside a virtual machine:

Flushing output buffer. 10811.29 messages/s
Flushing output buffer. 11375.65 messages/s
Flushing output buffer. 11930.45 messages/s
Flushing output buffer. 11970.28 messages/s
Flushing output buffer. 11016.74 messages/s
Flushing output buffer. 11852.67 messages/s
Flushing output buffer. 11833.07 messages/s
Flushing output buffer. 11599.32 messages/s
Flushing output buffer. 11800.12 messages/s
Flushing output buffer. 12026.89 messages/s
Flushing output buffer. 12287.26 messages/s
Flushing output buffer. 11538.44 messages/s

Supported formats

You can write a custom encoder to support any input and output format (even fancy things like Protobuf).

Look at the examples in the encoder folder to get started. For now we support the following formats:

Input formats

Collectd Graphite, e.g. “mydatacenter.myhost.load.load.shortterm 0.45 1436357630”

Output formats

InfluxDB 0.9 line protocol output (e.g. load_load_shortterm,datacenter=mydatacenter,host=myhost value="0.45" 1436357630)
InfluxDB 0.8 JSON output (deprecated)

Configuration

Take a look at the config.yaml to find out how to create a config file.

You can overwrite the settings from the commandline with the following flags:

usage: kafka_influxdb.py [-h] [--kafka_host KAFKA_HOST]
                         [--kafka_port KAFKA_PORT] [--kafka_topic KAFKA_TOPIC]
                         [--kafka_group KAFKA_GROUP]
                         [--influxdb_host INFLUXDB_HOST]
                         [--influxdb_port INFLUXDB_PORT]
                         [--influxdb_user INFLUXDB_USER]
                         [--influxdb_password INFLUXDB_PASSWORD]
                         [--influxdb_dbname INFLUXDB_DBNAME]
                         [--influxdb_retention_policy INFLUXDB_RETENTION_POLICY]
                         [--influxdb_time_precision INFLUXDB_TIME_PRECISION]
                         [--encoder ENCODER] [--buffer_size BUFFER_SIZE]
                         [-c CONFIGFILE] [-s] [-b] [-v]

optional arguments:
  -h, --help            show this help message and exit
  --kafka_host KAFKA_HOST
                        Hostname or IP of Kafka message broker (default:
                        localhost)
  --kafka_port KAFKA_PORT
                        Port of Kafka message broker (default: 9092)
  --kafka_topic KAFKA_TOPIC
                        Topic for metrics (default: my_topic)
  --kafka_group KAFKA_GROUP
                        Kafka consumer group (default: my_group)
  --influxdb_host INFLUXDB_HOST
                        InfluxDB hostname or IP (default: localhost)
  --influxdb_port INFLUXDB_PORT
                        InfluxDB API port (default: 8086)
  --influxdb_user INFLUXDB_USER
                        InfluxDB username (default: root)
  --influxdb_password INFLUXDB_PASSWORD
                        InfluxDB password (default: root)
  --influxdb_dbname INFLUXDB_DBNAME
                        InfluXDB database to write metrics into (default:
                        metrics)
  --influxdb_retention_policy INFLUXDB_RETENTION_POLICY
                        Retention policy for incoming metrics (default:
                        default)
  --influxdb_time_precision INFLUXDB_TIME_PRECISION
                        Precision of incoming metrics. Can be one of 's', 'm',
                        'ms', 'u' (default: s)
  --encoder ENCODER     Input encoder which converts an incoming message to
                        dictionary (default: collectd_graphite_encoder)
  --buffer_size BUFFER_SIZE
                        Maximum number of messages that will be collected
                        before flushing to the backend (default: 1000)
  -c CONFIGFILE, --configfile CONFIGFILE
                        Configfile path (default: None)
  -s, --statistics      Show performance statistics (default: True)
  -b, --benchmark       Run benchmark (default: False)
  -v, --verbose         Set verbosity level. Increase verbosity by adding a v:
                        -v -vv -vvv (default: 0)

TODO

flush buffer if not full but some time has gone by (safety net for low frequency input)
Support reading from multiple partitions and topics (using Python multiprocessing)
Enable settings using environment variables for Docker image

Project details

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Release history Release notifications | RSS feed

0.9.3

Jul 7, 2017

0.9.2

Jul 7, 2017

0.9.1

May 17, 2017

0.9.0

May 17, 2017

0.8.4

Apr 27, 2017

0.8.3

Apr 27, 2017

0.8.1

Apr 27, 2017

0.8.0

Nov 30, 2016

0.7.3

Oct 14, 2016

0.7.2

Oct 30, 2015

0.7.1

Oct 20, 2015

0.7.0

Oct 20, 2015

0.6.0

Oct 7, 2015

This version

0.5.0

Jul 14, 2015

0.4.1

Jul 14, 2015

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kafka_influxdb-0.5.0.tar.gz (10.9 kB view hashes)

Uploaded Jul 14, 2015 Source

Hashes for kafka_influxdb-0.5.0.tar.gz

Hashes for kafka_influxdb-0.5.0.tar.gz
Algorithm	Hash digest
SHA256	`2631f76842b2ac09206653db50acfb8e1b880e913263d64b309b3c3d424ec7e3`
MD5	`e3ef93f2a94720ecf6dbf1bc6c04dd32`
BLAKE2b-256	`67daabda78e564eb26adc6b43832c6cea1ec71d554d12e0b0d6bcd1266ee8cd8`