

dlt-personio-source

Parent tables

employees
absences
absence_types
attendances

Some of these tables have sub-tables.

To join a parent table to its sub-table, join on parent.dlt_id = child.parent_dlt_id, as in the example below.
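For example, a join between a parent table and one of its sub-tables could look like this (a sketch only: absences__comments is a hypothetical child table name, and personio_raw matches the schema_name used in the examples below):

# Illustrative only: the child table name is a placeholder; only the
# dlt_id / parent_dlt_id join pattern comes from the description above.
example_join_sql = """
SELECT p.*, c.*
FROM personio_raw.absences AS p
JOIN personio_raw.absences__comments AS c
    ON p.dlt_id = c.parent_dlt_id
"""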

Usage

Install the library:

pip install dlt-personio-source

If the library cannot be found, ensure you have the required Python version as specified in the pyproject.toml file.

Run the source as shown below to load a sample data set.

Add your credentials and remove the dummy_data flag to load your own data.

First, import the loading method and add your credentials:

from dlt_personio_source import load_personio_tables

# target credentials
# example for BigQuery
creds = {
  "type": "service_account",
  "project_id": "zinc-mantra-353207",
  "private_key_id": "example",
  "private_key": "",
  "client_email": "example@zinc-mantra-353207.iam.gserviceaccount.com",
  "client_id": "100909481823688180493"}
  
# or example for Redshift:
# creds = ["redshift", "database_name", "schema_name", "user_name", "host", "password"]

# Personio credentials
# get credentials at this URL - replace "test-1" with your org name
# https://test-1.personio.de/configuration/api/credentials/management
client_id = ''
client_secret = ''

Then, you can use the code below to do a serial load:

# remove some tables from this list if you only want some endpoints
tables = ['employees', 'absences', 'absence_types', 'attendances']
load_personio_tables(client_id=client_id,
                     client_secret=client_secret,
                     target_credentials=creds,
                     tables=tables,
                     schema_name='personio_raw',
                     dummy_data=True)

Or, for a parallel load, create Airflow tasks for each table, like so:

tables = ['employees', 'absences', 'absence_types', 'attendances']
for table in tables:
    load_personio_tables(client_id='',
                         client_secret='',
                         target_credentials=creds,
                         tables=[table],
                         schema_name='personio_raw',
                         dummy_data=True)
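If you schedule this in Airflow, a minimal sketch of one task per table could look like the following, assuming Airflow 2.4+ with the TaskFlow API (the DAG id, schedule, and credential handling are placeholders, not part of this package):

from airflow.decorators import dag, task
import pendulum

from dlt_personio_source import load_personio_tables

# target credentials as shown above (placeholder here)
creds = {}

@dag(schedule="@daily", start_date=pendulum.datetime(2024, 1, 1), catchup=False)
def personio_load():
    for table_name in ['employees', 'absences', 'absence_types', 'attendances']:

        @task(task_id=f"load_{table_name}")
        def load_one(table=table_name):
            # each task loads a single endpoint, so Airflow can run them in parallel
            load_personio_tables(client_id='',
                                 client_secret='',
                                 target_credentials=creds,
                                 tables=[table],
                                 schema_name='personio_raw',
                                 dummy_data=True)

        load_one()

personio_load()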

If you want to build your own pipeline or consume the source differently:

from dlt_personio_source import PersonioSource, PersonioSourceDummy

prod = PersonioSource(client_id='',
                      client_secret='')

dummy = PersonioSourceDummy()

sample_data = dummy.tasks()

for task in sample_data:
    print(task['table_name'])
    for row in task['data']:
        print(row)
