lakehouse.databricks.cdm.process_cdm_entity
- lakehouse.databricks.cdm.process_cdm_entity(spark, cdm_path, dst_mount, dst_zone, dst_schema, source, lake_subject, entity, execution_id=None, sql_override=None)
Processes a Common Data Model (CDM) entity and converts it to a Data Lakehouse table.
Parameters
- spark : spark context
spark context passed from the calling Spark instance
- cdm_path : string
path to the CDM manifest file
- dst_mount : string
mount point where the data will be stored in the Data Lakehouse
- dst_zone : string
Data Lakehouse zone where the table will be stored
- dst_schema : string
database/schema in the Data Lakehouse where the Delta table will be stored
- source : string
source system that generated the data
- lake_subject : string
subject area within the source system that generated the data
- entity : string
name of the table/entity to create
- execution_id : string, default=None
execution_id of the Azure Data Factory pipeline, used to tie results back to the ADF execution
- sql_override : string, default=None
a SQL statement that overrides the SQL defined for the CDM entity
Returns
- json :
returns the statistics associated with creating the silver CDM entity in the Data Lakehouse, including expectation statistics and Delta load statistics
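A minimal usage sketch based on the signature above. The call itself is shown commented out because it requires a Databricks environment with an active `spark` session; the mount paths, argument values, and the keys in the returned statistics payload are illustrative assumptions, not guaranteed by this API.

```python
import json

# Hypothetical call, assuming a Databricks notebook with an active `spark` session
# (all argument values below are example placeholders):
#
# from lakehouse.databricks.cdm import process_cdm_entity
#
# stats = process_cdm_entity(
#     spark,
#     cdm_path="/mnt/bronze/crm/model.json",   # CDM manifest file
#     dst_mount="/mnt/silver",                 # Lakehouse mount point
#     dst_zone="silver",                       # Lakehouse zone
#     dst_schema="crm",                        # target database/schema
#     source="dynamics365",                    # source system
#     lake_subject="sales",                    # subject area within the source
#     entity="account",                        # table/entity to create
#     execution_id="adf-run-0001",             # ADF pipeline execution id
# )

# The function returns JSON statistics; the structure below is an assumed,
# illustrative payload used here only to show how a caller might consume it:
stats = json.loads("""
{
  "entity": "account",
  "expectation_statistics": {"passed": 990, "failed": 10},
  "delta_load_statistics": {"rows_inserted": 950, "rows_updated": 40}
}
""")

# Consume the statistics, e.g. to log or alert on data-quality failures:
print(stats["entity"], stats["expectation_statistics"]["failed"])
```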