lakehouse.databricks.cdm.process_cdm_table_bronze
- lakehouse.databricks.cdm.process_cdm_table_bronze(spark, cdm_path, entity_path, trickle_cdm_path, trickle_entity_path, dest_mnt, entity, source, subject_area, table, execution_id, full_flag=False)
This function moves CDM data from the Common Data Model storage account to the Data Lakehouse Bronze zone in the form of append-only Delta tables.
Parameters
- spark : spark context
Spark context passed from the calling Spark instance
- cdm_path : string
path to the CDM manifest file
- entity_path : string
full path to the CDM data
- trickle_cdm_path : string
full path to the CDM trickle manifest file
- trickle_entity_path : string
full path to the CDM trickle data
- dest_mnt : string
mount point where the data will be stored in the Data Lakehouse
- entity : string
name of the entity/table to create in the Data Lakehouse
- source : string
name of the system generating the source data
- subject_area : string
name of the subject area within the source system
- table : string
name of the table to create - not implemented
- execution_id : string
execution ID of the Azure Data Factory pipeline run, used to tie results back to the ADF execution
- full_flag : boolean
when True, overrides the delta (incremental) load logic and forces a full load
- Raises:
Exception : bubbles up errors to the calling notebook or application
Returns
- json :
statistics associated with loading the CDM data into the Bronze zone of the Data Lakehouse, including expectation statistics and delta load statistics
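A minimal invocation sketch follows. The storage account URLs, mount point, entity names, and the `build_bronze_args` helper are all illustrative assumptions, not part of the documented API; the only documented interface is the `process_cdm_table_bronze` signature itself.

```python
def build_bronze_args(cdm_root, trickle_root, dest_mnt, entity, source,
                      subject_area, execution_id, full_flag=False):
    """Assemble keyword arguments for process_cdm_table_bronze.

    Hypothetical helper: the manifest/data path layout below is an
    assumption about how a CDM folder might be organized.
    """
    return {
        "cdm_path": f"{cdm_root}/model.json",                # CDM manifest file
        "entity_path": f"{cdm_root}/{entity}/",              # CDM data
        "trickle_cdm_path": f"{trickle_root}/model.json",    # trickle-feed counterpart (layout assumed)
        "trickle_entity_path": f"{trickle_root}/{entity}/",  # trickle-feed counterpart (layout assumed)
        "dest_mnt": dest_mnt,
        "entity": entity,
        "source": source,
        "subject_area": subject_area,
        "table": entity,              # documented as not implemented
        "execution_id": execution_id,
        "full_flag": full_flag,
    }

# Illustrative values only -- storage account, container, and names are made up.
args = build_bronze_args(
    cdm_root="abfss://cdm@storageacct.dfs.core.windows.net/crm",
    trickle_root="abfss://cdm@storageacct.dfs.core.windows.net/crm-trickle",
    dest_mnt="/mnt/lakehouse/bronze",
    entity="account",
    source="dynamics365",
    subject_area="sales",
    execution_id="adf-run-0001",
)

# In a Databricks notebook, `spark` is provided by the runtime:
# stats = process_cdm_table_bronze(spark, **args)  # JSON load statistics
```

Passing the arguments as a dict keeps the long parameter list in one place, which is convenient when the same path layout feeds several entity loads in a loop.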