udata-analysis-service

This service analyses udata datalake files to enrich their metadata, starting with CSV files. It uses csv-detective to detect the type and format of each CSV column by checking both the header and the contents.

Installation

Install udata-analysis-service:

pip install udata-analysis-service

Rename .env.sample to .env and fill in the values for your environment:

REDIS_URL=redis://localhost:6381/0
REDIS_HOST=localhost
REDIS_PORT=6381
KAFKA_HOST=localhost
KAFKA_PORT=9092
KAFKA_API_VERSION=2.5.0
MINIO_URL=https://object.local.dev/
MINIO_USER=sample_user
MINIO_PWD=sample_pwd
ROWS_TO_ANALYSE_PER_FILE=500
CSV_DETECTIVE_REPORT_BUCKET=benchmark-de
CSV_DETECTIVE_REPORT_FOLDER=report
TABLESCHEMA_BUCKET=benchmark-de
TABLESCHEMA_FOLDER=schemas
UDATA_INSTANCE_NAME=udata
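As a sanity check before starting the service, the values above can be parsed and validated with a few lines of Python. This is only a sketch under stated assumptions: `parse_dotenv`, `Settings` and `load_settings` are hypothetical names, not part of udata-analysis-service, and the way the service itself reads its configuration is not documented here. The parser tolerates optional spaces around `=`.

```python
from dataclasses import dataclass


def parse_dotenv(text):
    """Parse KEY=value lines, tolerating spaces around '=' and skipping comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


@dataclass(frozen=True)
class Settings:
    """Hypothetical typed view over a subset of the variables listed above."""
    redis_url: str
    kafka_bootstrap: str
    minio_url: str
    rows_to_analyse: int
    report_path: str
    tableschema_path: str


def load_settings(env):
    """Build Settings from a mapping; raises KeyError if a variable is missing."""
    kafka_host = env["KAFKA_HOST"]
    kafka_port = int(env["KAFKA_PORT"])  # fails early on a non-numeric port
    return Settings(
        redis_url=env["REDIS_URL"],
        kafka_bootstrap=f"{kafka_host}:{kafka_port}",
        minio_url=env["MINIO_URL"],
        rows_to_analyse=int(env["ROWS_TO_ANALYSE_PER_FILE"]),
        report_path=env["CSV_DETECTIVE_REPORT_BUCKET"] + "/" + env["CSV_DETECTIVE_REPORT_FOLDER"],
        tableschema_path=env["TABLESCHEMA_BUCKET"] + "/" + env["TABLESCHEMA_FOLDER"],
    )


SAMPLE = """
REDIS_URL = redis://localhost:6381/0
KAFKA_HOST = localhost
KAFKA_PORT = 9092
MINIO_URL = https://object.local.dev/
ROWS_TO_ANALYSE_PER_FILE=500
CSV_DETECTIVE_REPORT_BUCKET = benchmark-de
CSV_DETECTIVE_REPORT_FOLDER = report
TABLESCHEMA_BUCKET = benchmark-de
TABLESCHEMA_FOLDER = schemas
"""

settings = load_settings(parse_dotenv(SAMPLE))
print(settings)
```

To check a real file, pass its contents instead of `SAMPLE`, for example `load_settings(parse_dotenv(open(".env").read()))`. A missing variable then surfaces as a `KeyError` naming the variable, rather than as a failure later at runtime.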

Usage

Start the Kafka consumer:

udata-analysis-service consume

Start the Celery worker:

udata-analysis-service work
