Common Module Documentation
The common module provides generic functions to support a Delta Lakehouse implementation.
- Alters the properties of a Delta table in a database.
- Aggregates a DataFrame based on the aggregation definition provided.
- Creates the anti-merge key based on the 'merge_predicate' provided and logs the function's start and end time using 'post_la_data'.
- Creates a Databricks Unity Catalog within the Databricks workspace.
- Creates a Hive catalog within the Spark workspace.
- Creates a Delta table in the Delta Lakehouse and returns the schema to be used when creating a Synapse serverless view.
- Creates the folder depth structure for querying nested files, based on the lake file structure of the data being queried.
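The folder-depth idea can be pictured as building a wildcard read path. The helper name and signature below are illustrative, not the module's actual API:

```python
def folder_depth_path(base_path: str, depth: int) -> str:
    """Illustrative sketch: build a wildcard path that reads files
    nested `depth` folder levels below `base_path`."""
    # e.g. depth=3 turns "<base>" into "<base>/*/*/*"
    return "/".join([base_path.rstrip("/")] + ["*"] * depth)
```

Spark can then read all files at that depth with a single call such as `spark.read.parquet(folder_depth_path(base, 3))`.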
- Creates a dictionary of the form {"column1": "source.column1", "column2": "source.column2", ...}, where "column1", "column2", etc. are the columns being written.
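A minimal sketch of building that mapping from a list of column names (the helper name is hypothetical):

```python
def build_column_mapping(columns, alias="source"):
    """Map each target column name to its aliased source column,
    e.g. for the set clause of a Delta MERGE update/insert."""
    return {col: f"{alias}.{col}" for col in columns}
```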
- Creates the merge key based on the 'merge_predicate' provided and logs the function's start and end time using 'post_la_data'.
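Assuming 'merge_predicate' is a list of key column names, the merge key and its anti-merge counterpart might be assembled roughly as follows (a sketch under that assumption, not the module's code):

```python
def build_merge_key(merge_predicate):
    """Equality condition for the ON clause of a Delta MERGE."""
    return " AND ".join(f"target.{col} = source.{col}" for col in merge_predicate)

def build_anti_merge_key(merge_predicate):
    """Negated merge key, selecting rows that do NOT match."""
    return f"NOT ({build_merge_key(merge_predicate)})"
```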
- Creates a mount point within the Databricks cluster.
- Creates a folder path in the data lake.
- Creates the Type 2 condition based on the 'type_two_keys' provided and logs the function's start and end time using 'post_la_data'.
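In a slowly changing dimension Type 2 load, a row is typically treated as changed when any tracked attribute differs between source and target. A sketch of that condition builder, assuming 'type_two_keys' is a list of attribute names:

```python
def build_type_two_condition(type_two_keys):
    """Illustrative SCD Type 2 change check: true when any tracked
    attribute differs between the source and target rows."""
    return " OR ".join(f"source.{col} <> target.{col}" for col in type_two_keys)
```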
- Recursively deletes all files in a folder within a mount point.
- Performs the merge operation for Delta tables.
- Checks whether a table has enough versions, or is old enough, to warrant an optimize and vacuum.
- Drops all Hive tables within a Hive database/schema.
- Returns the number of output rows from a Delta table operation.
- Returns the statistics from a Delta table insert/update/delete operation; always returns the most recent history entry, excluding optimize operations.
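The "most recent history excluding optimize" behaviour can be sketched over plain dictionaries standing in for rows of Delta's table history (newest first); the real module presumably filters the Spark DataFrame returned by `DeltaTable.history()` instead:

```python
def latest_write_stats(history):
    """Return the operationMetrics of the newest history entry whose
    operation is not OPTIMIZE. `history` is a list of dicts ordered
    newest first, mimicking Delta table history rows."""
    for entry in history:
        if entry.get("operation") != "OPTIMIZE":
            return entry.get("operationMetrics", {})
    return {}
```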
- Generates expectation metadata to control how the calling pipeline behaves on expectation failure.
- Generates expectation results defined in a YAML definition.
- Generates the queries for an expectation set.
- Returns all Hive tables in the Hive database.
- Gets the ceiling value for a data element in the data lake, used in incremental data loading strategies.
- Generates the partition keys string used when merging into a partitioned table.
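One plausible shape for the partition-keys string is an IN-list that prunes the MERGE to only the partitions present in the incoming batch; the helper name and output format below are assumptions:

```python
def build_partition_pruning(partition_col, values):
    """Restrict a MERGE condition to the listed partition values so
    Delta only scans the affected partitions (illustrative sketch)."""
    formatted = ", ".join(f"'{v}'" for v in sorted(set(values)))
    return f"target.{partition_col} IN ({formatted})"
```

The resulting string would be AND-ed onto the merge condition so the engine can skip untouched partitions.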
- Optimizes a Delta table.
- Reads data from the data lake using a Hive SQL query; the data must be available as a Hive table.
- Reads a file or set of files in the data lake.
- Repartitions a Delta table.
- Truncates all of the Hive tables within a Hive database/schema.
- Replaces non-supported SQL types when generating a schema for SQL serverless views.
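A sketch of the type-replacement idea; the actual unsupported types and their substitutes in the module are unknown, so the mapping below is purely illustrative:

```python
# Illustrative placeholders - not the module's real mapping
TYPE_REPLACEMENTS = {
    "void": "string",
    "char": "string",
}

def replace_unsupported_sql_type(spark_type):
    """Swap a Spark type with no serverless-view equivalent for a
    supported substitute; pass supported types through unchanged."""
    return TYPE_REPLACEMENTS.get(spark_type.lower(), spark_type)
```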
- Vacuums and optimizes all Delta tables in a specific Hive database/schema.
- Vacuums a Delta table.
- Evaluates an input string and applies the Jinja macros found in the macro_functions.j2 file in the config/04-macros folder in the data lake.
- Writes files to the data lake.