lakehouse.databricks.cdm.process_cdm_table_bronze

lakehouse.databricks.cdm.process_cdm_table_bronze(spark, cdm_path, entity_path, trickle_cdm_path, trickle_entity_path, dest_mnt, entity, source, subject_area, table, execution_id, full_flag=False)

This function moves CDM data from the Common Data Model storage account to the Data Lakehouse Bronze zone in the form of append-only Delta tables.

Parameters

spark : spark context

Spark context passed from the calling Spark instance

cdm_path : string

full path to the CDM manifest file

entity_path : string

full path to the CDM data

trickle_cdm_path : string

full path to the CDM trickle manifest file

trickle_entity_path : string

full path to the CDM trickle data

dest_mnt : string

mount point where the data will be stored in the Data Lakehouse

entity : string

name of the entity/table to create in the Data Lakehouse

source : string

name of the system generating the source data

subject_area : string

name of the subject area within the source system

table : string

name of the table to create (not implemented)

execution_id : string

execution_id of the Azure Data Factory pipeline, used to tie the load back to the ADF execution

full_flag : boolean

when True, overrides the delta load logic

Raises:

Exception : errors are bubbled up to the calling notebook or application

Returns

json :

the statistics associated with loading the CDM data into the Bronze zone of the Data Lakehouse, including expectation statistics and delta load statistics
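A hypothetical invocation from a Databricks notebook might look like the following sketch. All path, mount-point, entity, and identifier values are illustrative placeholders invented for this example, not values taken from this reference; only the parameter names come from the signature above.

```python
# Illustrative arguments for process_cdm_table_bronze; every value here is a
# placeholder, and in a real workspace these would come from pipeline config.
args = dict(
    cdm_path="/mnt/cdm/model.json",                        # CDM manifest file
    entity_path="/mnt/cdm/sales/account/",                 # CDM data
    trickle_cdm_path="/mnt/cdm-trickle/model.json",        # trickle manifest file
    trickle_entity_path="/mnt/cdm-trickle/sales/account/", # trickle data
    dest_mnt="/mnt/lakehouse/bronze",                      # Bronze zone mount point
    entity="account",                                      # table to create
    source="dynamics365",                                  # source system name
    subject_area="sales",                                  # subject area in the source
    table="account",                                       # not implemented per the docs
    execution_id="adf-run-0001",                           # ADF pipeline run id
    full_flag=False,                                       # True overrides delta load logic
)

try:
    from lakehouse.databricks.cdm import process_cdm_table_bronze

    # `spark` is assumed to be the session already in scope in the notebook;
    # the function returns JSON load statistics.
    stats = process_cdm_table_bronze(spark, **args)
except Exception:
    # The lakehouse package and `spark` are only available inside the
    # Databricks workspace; outside it, this call is skipped.
    pass
```

Passing the parameters as a keyword dict keeps the long positional list in the signature readable and makes it harder to transpose the four path arguments.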