lakehouse.databricks.cdm.process_cdm_entity
- lakehouse.databricks.cdm.process_cdm_entity(spark, cdm_path, dst_mount, dst_zone, dst_schema, source, lake_subject, entity, execution_id=None, sql_override=None)
Processes a Common Data Model (CDM) entity and converts it to a Data Lakehouse table.
Parameters
- spark : spark context
spark context passed from the calling Spark instance
- cdm_path : string
path to the CDM manifest file
- dst_mount : string
mount point where the data will be stored in the Data Lakehouse
- dst_zone : string
Data Lakehouse zone where the table will be stored
- dst_schema : string
database/schema in the Data Lakehouse where the Delta table will be stored
- source : string
source system that generated the data
- lake_subject : string
subject area within the source system that generated the data
- entity : string
name of the table/entity to create
- execution_id : string, default=None
execution_id of the Azure Data Factory pipeline, used to tie results back to the ADF execution
- sql_override : string, default=None
a SQL statement that overrides the SQL defined for the CDM entity
Returns
- json :
returns the statistics associated with creating the silver CDM entity in the Data Lakehouse, including expectation statistics and Delta load statistics
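A minimal usage sketch based on the signature above. The call itself is shown commented out because it requires a Databricks environment with an active `spark` session; the mount paths, argument values, and the keys in the returned statistics payload are illustrative assumptions, not guaranteed by this API.

```python
import json

# Hypothetical call, assuming a Databricks notebook with an active `spark` session
# (all argument values below are example placeholders):
#
# from lakehouse.databricks.cdm import process_cdm_entity
#
# stats = process_cdm_entity(
#     spark,
#     cdm_path="/mnt/bronze/crm/model.json",   # CDM manifest file
#     dst_mount="/mnt/silver",                 # Lakehouse mount point
#     dst_zone="silver",                       # Lakehouse zone
#     dst_schema="crm",                        # target database/schema
#     source="dynamics365",                    # source system
#     lake_subject="sales",                    # subject area within the source
#     entity="account",                        # table/entity to create
#     execution_id="adf-run-0001",             # ADF pipeline execution id
# )

# The function returns JSON statistics; the structure below is an assumed,
# illustrative payload used here only to show how a caller might consume it:
stats = json.loads("""
{
  "entity": "account",
  "expectation_statistics": {"passed": 990, "failed": 10},
  "delta_load_statistics": {"rows_inserted": 950, "rows_updated": 40}
}
""")

# Consume the statistics, e.g. to log or alert on data-quality failures:
print(stats["entity"], stats["expectation_statistics"]["failed"])
```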