

dlt-personio-source

Parent tables

employees
absences
absence_types
attendances

Some of these tables have sub-tables.

To join a parent table to its sub-table, join on parent.dlt_id = child.parent_dlt_id, as in the example below.
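For example, a join between a parent table and one of its sub-tables could look like this (a sketch only: absences__comments is a hypothetical child table name, and personio_raw matches the schema_name used in the examples below):

# Illustrative only: the child table name is a placeholder; only the
# dlt_id / parent_dlt_id join pattern comes from the description above.
example_join_sql = """
SELECT p.*, c.*
FROM personio_raw.absences AS p
JOIN personio_raw.absences__comments AS c
    ON p.dlt_id = c.parent_dlt_id
"""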

Usage

Install the library:

pip install dlt-personio-source

If the library cannot be found, ensure you have the required Python version as specified in the pyproject.toml file.

Run the source as shown below to load a sample data set.

Add your credentials and remove the dummy_data flag to load your own data.

First, import the loading method and add your credentials:

from dlt_personio_source import load_personio_tables

# target credentials
# example for BigQuery
creds = {
  "type": "service_account",
  "project_id": "zinc-mantra-353207",
  "private_key_id": "example",
  "private_key": "",
  "client_email": "example@zinc-mantra-353207.iam.gserviceaccount.com",
  "client_id": "100909481823688180493"}
  
# or example for Redshift:
# creds = ["redshift", "database_name", "schema_name", "user_name", "host", "password"]

# Personio credentials
# get credentials at this URL - replace "test-1" with your org name
# https://test-1.personio.de/configuration/api/credentials/management
client_id = ''
client_secret = ''

Then, you can use the code below to do a serial load:

# remove some tables from this list if you only want some endpoints
tables = ['employees', 'absences', 'absence_types', 'attendances']
load_personio_tables(client_id=client_id,
                     client_secret=client_secret,
                     target_credentials=creds,
                     tables=tables,
                     schema_name='personio_raw',
                     dummy_data=True)

Or, for a parallel load, create Airflow tasks for each table, like so:

tables = ['employees', 'absences', 'absence_types', 'attendances']
for table in tables:
    load_personio_tables(client_id='',
                         client_secret='',
                         target_credentials=creds,
                         tables=[table],
                         schema_name='personio_raw',
                         dummy_data=True)
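If you schedule this in Airflow, a minimal sketch of one task per table could look like the following, assuming Airflow 2.4+ with the TaskFlow API (the DAG id, schedule, and credential handling are placeholders, not part of this package):

from airflow.decorators import dag, task
import pendulum

from dlt_personio_source import load_personio_tables

# target credentials as shown above (placeholder here)
creds = {}

@dag(schedule="@daily", start_date=pendulum.datetime(2024, 1, 1), catchup=False)
def personio_load():
    for table_name in ['employees', 'absences', 'absence_types', 'attendances']:

        @task(task_id=f"load_{table_name}")
        def load_one(table=table_name):
            # each task loads a single endpoint, so Airflow can run them in parallel
            load_personio_tables(client_id='',
                                 client_secret='',
                                 target_credentials=creds,
                                 tables=[table],
                                 schema_name='personio_raw',
                                 dummy_data=True)

        load_one()

personio_load()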

If you want to build your own pipeline or consume the source differently:

from dlt_personio_source import PersonioSource, PersonioSourceDummy

prod = PersonioSource(client_id='',
                      client_secret='')

dummy = PersonioSourceDummy()

sample_data = dummy.tasks()

for task in sample_data:
    print(task['table_name'])
    for row in task['data']:
        print(row)
