lakehouse.databricks.common.get_delta_floor_pred_for_sql

lakehouse.databricks.common.get_delta_floor_pred_for_sql(spark, mounting_point, file_path, lake_file_struct, file_type, delta_column, delta_column_type, default_floor=19000101, options=None)

A function to get the ceiling (maximum) value of a column in the data lake; in incremental data loading strategies this value serves as the floor predicate for the next source query
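For context, the returned predicate value typically feeds the WHERE clause of the next incremental pull from the source system. A minimal sketch of that pattern (the table name, column name, and predicate value below are hypothetical illustrations, not part of this API):

```python
# Hypothetical illustration: using a floor predicate returned by
# get_delta_floor_pred_for_sql to build an incremental source query.
# The table, column, and value here are made up for the example.
def build_incremental_query(table, delta_column, predicate_val):
    """Build a SQL query that pulls only rows newer than the floor value."""
    return f"SELECT * FROM {table} WHERE {delta_column} > {predicate_val}"

query = build_incremental_query("dbo.orders", "modified_date", "'2024-01-15'")
# query == "SELECT * FROM dbo.orders WHERE modified_date > '2024-01-15'"
```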

Parameters

spark: spark context

spark context passed from the calling spark instance

mounting_point: string

mounting point where data is located

file_path: string

file path to be appended to mount point

lake_file_struct: string

lake file structure to be appended to file path when querying nested data

file_type: string

type of file to be read; delta, parquet, json, and csv file types are supported

delta_column: string

name of the column to get the ceiling for

delta_column_type: string

data type of the column to get the ceiling for

default_floor: int, default=19000101

default floor value for the predicate column

Returns

json:

json object {"predicate_val": "value"}, where value is formatted according to the data type passed in the delta_column_type param. Explicitly supported delta_column_types:

  • datetime

  • datetime2

  • date

  • nvarchar

  • varchar

  • varbinary

  • uniqueidentifier

  • xml

  • nchar

  • bit

  • timestamp

  • informix_date

  • informix_datetime

  • all other delta_column_types are implicitly supported and treated as numeric
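The list above implies type-aware formatting of the returned value: character- and date-like SQL types need quoting in a predicate, while everything else can be emitted as a bare numeric literal. A rough sketch of what such formatting logic might look like (an assumption about the behavior, not the actual library code):

```python
# Hypothetical sketch of type-based predicate formatting.
# Types from the explicitly supported list above are quoted as SQL
# literals; any other delta_column_type is treated as numeric.
QUOTED_TYPES = {
    "datetime", "datetime2", "date", "nvarchar", "varchar",
    "varbinary", "uniqueidentifier", "xml", "nchar", "bit",
    "timestamp", "informix_date", "informix_datetime",
}

def format_predicate(value, delta_column_type):
    """Return {"predicate_val": ...} with value formatted by column type."""
    if delta_column_type.lower() in QUOTED_TYPES:
        return {"predicate_val": f"'{value}'"}  # quote for SQL string/date literals
    return {"predicate_val": str(value)}        # numeric types: no quotes

format_predicate("2024-01-15", "date")   # {'predicate_val': "'2024-01-15'"}
format_predicate(20240115, "bigint")     # {'predicate_val': '20240115'}
```

A real implementation would also need to handle escaping and the default_floor fallback when the lake path is empty; this sketch only shows the quoting split the type list suggests.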