
spark_dummy_tools

spark_dummy_tools is a Python library for generating dummy tables with Spark.

Installation

The package is published on PyPI, so it can be installed by running:

pip install spark-dummy-tools --user

Usage

The wrapper functions below generate dummy tables:

from spark_dummy_tools import generated_dummy_table_artifactory
from spark_dummy_tools import generated_dummy_table_datum
import spark_dataframe_tools

# The examples below assume `spark` is an existing SparkSession.


Generated Dummy Table Datum
============================================================
path = "fields_pe_datum2.csv"
table_name = "t_kctk_collateralization_atrb"
storage_zone = "master"
sample_parquet = 10
columns_integer_default = {}
columns_date_default = {"gf_cutoff_date": "2026-01-01"}
columns_string_default = {}
columns_decimal_default = {"other_concepts_amount": "500.00"}

generated_dummy_table_datum(spark=spark,
                            path=path,
                            table_name=table_name,
                            storage_zone=storage_zone,
                            sample_parquet=sample_parquet,
                            partition_colum=["gf_cutoff_date"],
                            columns_integer_default=columns_integer_default,
                            columns_date_default=columns_date_default,
                            columns_string_default=columns_string_default,
                            columns_decimal_default=columns_decimal_default
                           )
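
The default values above are passed as strings, so it can help to check them before calling the generator. The following is a minimal sketch and not part of spark_dummy_tools: `check_defaults` is a hypothetical helper, and it assumes that dates use the `YYYY-MM-DD` form shown above and that decimals are plain numeric strings.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation


def check_defaults(columns_date_default, columns_decimal_default):
    """Return a list of error messages for malformed default values.

    Hypothetical helper, not part of spark_dummy_tools. Assumes dates are
    written as YYYY-MM-DD and decimals as plain numeric strings.
    """
    errors = []
    for column, value in columns_date_default.items():
        try:
            # strptime raises ValueError when the string is not a valid date.
            datetime.strptime(value, "%Y-%m-%d")
        except ValueError:
            errors.append(f"{column}: invalid date {value!r}")
    for column, value in columns_decimal_default.items():
        try:
            # Decimal raises InvalidOperation for non-numeric strings.
            Decimal(value)
        except InvalidOperation:
            errors.append(f"{column}: invalid decimal {value!r}")
    return errors


errors = check_defaults(
    {"gf_cutoff_date": "2026-01-01"},
    {"other_concepts_amount": "500.00"},
)
print(errors)
```

An empty list means every default parsed cleanly; otherwise each entry names the offending column and value.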
                       



Generated Dummy Table Artifactory
============================================================
path = "lclsupplierspurchases.output.schema"
sample_parquet = 10
columns_integer_default = {}
columns_date_default = {"gf_cutoff_date": "2026-01-01"}
columns_string_default = {}
columns_decimal_default = {"other_concepts_amount": "500.00"}


generated_dummy_table_artifactory(spark=spark,
                                  path=path,
                                  sample_parquet=sample_parquet,
                                  columns_integer_default=columns_integer_default,
                                  columns_date_default=columns_date_default,
                                  columns_string_default=columns_string_default,
                                  columns_decimal_default=columns_decimal_default
                                 )

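import os
import sys

# Read the generated table back from DIRECTORY_DUMMY/<table_name>.
is_windows = sys.platform.startswith("win")
path_directory = os.path.join("DIRECTORY_DUMMY", table_name)
if is_windows:
    # Replace Windows backslashes with forward slashes in the path.
    path_directory = path_directory.replace("\\", "/")

df = spark.read.parquet(path_directory)
# show2 is not a built-in PySpark method; presumably spark_dataframe_tools adds it.
df.show2(10)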
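
The path handling in the snippet above can also be wrapped in a small function. This is a minimal sketch under stated assumptions: `dummy_table_path` is a hypothetical name and not part of spark_dummy_tools. It replaces the platform separator rather than checking `sys.platform`, so no Windows branch is needed.

```python
import os


def dummy_table_path(base_directory, table_name):
    """Join the directory and table name, always using forward slashes.

    Hypothetical helper, not part of spark_dummy_tools.
    """
    joined = os.path.join(base_directory, table_name)
    # os.sep is "\\" on Windows and "/" elsewhere, so this is a no-op on POSIX.
    return joined.replace(os.sep, "/")


print(dummy_table_path("DIRECTORY_DUMMY", "t_kctk_collateralization_atrb"))
```

The returned string can then be passed to `spark.read.parquet` as in the snippet above.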
  

License

Apache License 2.0.

New features v1.0

BugFix

  • choco install visualcpp-build-tools

Reference
