I am trying to merge into a Delta table on a vanilla version of Spark with:
```python
if not DeltaTable.isDeltaTable(spark, delta_table_path):
    df.write.format("delta").mode("overwrite") \
        .option("mergeSchema", "true").save(delta_table_path)
else:
    target = DeltaTable.forPath(spark, delta_table_path)
    matchOncondition = matchOnKey(key)
    matchOncondition = matchOncondition + \
        """ AND target.is_Current='Y' AND target.isDeletedInSource='N' """
    target.alias("target").merge(
        df.filter(df.rowNum == 1).alias("source"), matchOncondition
    ).whenMatchedUpdate(
        set={"is_Current": lit("N"), "end_date": current_timestamp()}
    ).execute()
    df.select('*').write.mode("append") \
        .option("mergeSchema", "true") \
        .option("overwriteSchema", "true") \
        .format("delta").save(delta_table_path)
```

The session is created with:

```python
spark = SparkSession.builder \
    .appName("Delta Lake Example") \
    .master("local[*]") \
    .config("spark.jars.packages", "io.delta:delta-core_2.12:2.4.0") \
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog") \
    .config("delta.enableTypeWidening", "true") \
    .getOrCreate()
```

Now my target schema has one of the columns as type double, but the source has it as string, and it is giving me the error below:
```
Failed to merge fields 'customerNotificationId' and 'customerNotificationId'. Failed to merge incompatible data types DoubleType and StringType
```

I believe that with the `mergeSchema` option set to true and `.config("delta.enableTypeWidening", "true")`, the schema should evolve automatically. I do not want to evolve it manually.
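As the error suggests, Delta's schema merging on write only accepts type pairs it can safely reconcile; a double/string conflict between existing columns is rejected outright. A toy sketch of that rule (note `merge_field_types` is a hypothetical helper for illustration, not part of the Delta API):

```python
def merge_field_types(target_type: str, source_type: str) -> str:
    # Toy model of Delta's per-field merge rule: identical types pass
    # through; an incompatible pair is rejected, mirroring the
    # "Failed to merge incompatible data types" error above.
    if target_type == source_type:
        return target_type
    raise TypeError(
        f"Failed to merge incompatible data types {target_type} and {source_type}"
    )


# Matching types merge fine; DoubleType vs StringType does not.
merge_field_types("DoubleType", "DoubleType")   # fine
try:
    merge_field_types("DoubleType", "StringType")
except TypeError as e:
    print(e)
```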
PySpark version: 3.4, Delta: 2.4