lakehouse.databricks.cdm.process_cdm_table_bronze

lakehouse.databricks.cdm.process_cdm_table_bronze(spark, cdm_path, entity_path, trickle_cdm_path, trickle_entity_path, dest_mnt, entity, source, subject_area, table, execution_id, full_flag=False)

This function moves CDM data from the Common Data Model storage account to the Data Lakehouse Bronze zone in the form of append-only Delta tables.

Parameters

spark : spark context

Spark context passed from the calling Spark instance

cdm_path : string

full path to the CDM manifest file

entity_path : string

full path to the CDM data

trickle_cdm_path : string

full path to the CDM trickle manifest file

trickle_entity_path : string

full path to the CDM trickle data

dest_mnt : string

mount point where the data will be stored in the Data Lakehouse

entity : string

name of the entity/table to create in the Data Lakehouse

source : string

name of the system generating the source data

subject_area : string

name of the subject area within the source system

table : string

name of the table to create (not implemented)

execution_id : string

execution_id of the Azure Data Factory pipeline, used to tie the load back to the ADF execution

full_flag : boolean

when True, overrides the delta load logic

Raises:

Exception : errors are bubbled up to the calling notebook or application

Returns

json :

the statistics associated with loading the CDM data into the Bronze zone of the Data Lakehouse, including expectation statistics and delta load statistics
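A hypothetical invocation from a Databricks notebook might look like the following sketch. All path, mount-point, entity, and identifier values are illustrative placeholders invented for this example, not values taken from this reference; only the parameter names come from the signature above.

```python
# Illustrative arguments for process_cdm_table_bronze; every value here is a
# placeholder, and in a real workspace these would come from pipeline config.
args = dict(
    cdm_path="/mnt/cdm/model.json",                        # CDM manifest file
    entity_path="/mnt/cdm/sales/account/",                 # CDM data
    trickle_cdm_path="/mnt/cdm-trickle/model.json",        # trickle manifest file
    trickle_entity_path="/mnt/cdm-trickle/sales/account/", # trickle data
    dest_mnt="/mnt/lakehouse/bronze",                      # Bronze zone mount point
    entity="account",                                      # table to create
    source="dynamics365",                                  # source system name
    subject_area="sales",                                  # subject area in the source
    table="account",                                       # not implemented per the docs
    execution_id="adf-run-0001",                           # ADF pipeline run id
    full_flag=False,                                       # True overrides delta load logic
)

try:
    from lakehouse.databricks.cdm import process_cdm_table_bronze

    # `spark` is assumed to be the session already in scope in the notebook;
    # the function returns JSON load statistics.
    stats = process_cdm_table_bronze(spark, **args)
except Exception:
    # The lakehouse package and `spark` are only available inside the
    # Databricks workspace; outside it, this call is skipped.
    pass
```

Passing the parameters as a keyword dict keeps the long positional list in the signature readable and makes it harder to transpose the four path arguments.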