I have a small log dataframe which holds metadata about the ETL performed within a given notebook; the notebook is part of a bigger ETL pipeline managed in Azure Data Factory.
Unfortunately, it seems that Databricks cannot invoke stored procedures, so I'm manually appending a row with the correct data to my log table.
However, I cannot figure out the correct syntax to update a table given a set of conditions.

The statement I use to append a single row is as follows:

```python
spark_log.write.jdbc(sql_url, 'internal.Job', mode='append')
```

This works swimmingly. However, as my Data Factory is invoking a stored procedure, I need to work in a query like:

```python
query = f"""
UPDATE [internal].[Job]
SET [MaxIngestionDate] = '{date}',
    [DataLakeMetadataRaw] = NULL,
    [DataLakeMetadataCurated] = NULL
WHERE [IsRunning] = 1
  AND [FinishDateTime] IS NULL
"""
```

Is this possible? If so, can someone show me how?
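For context, one route I've been sketching: since the Spark JDBC writer only appends or overwrites, an arbitrary statement could be run through the driver JVM's `java.sql.DriverManager`, which py4j exposes (note `spark._sc._gateway` is an internal API, and `sql_url` is my existing JDBC URL — I'm not sure this is the idiomatic approach):

```python
def build_update(ingestion_date):
    # Build the UPDATE statement shown above, NULL-ing the metadata columns.
    return f"""
        UPDATE [internal].[Job]
        SET [MaxIngestionDate] = '{ingestion_date}',
            [DataLakeMetadataRaw] = NULL,
            [DataLakeMetadataCurated] = NULL
        WHERE [IsRunning] = 1
          AND [FinishDateTime] IS NULL
    """

def run_update(spark, jdbc_url, sql):
    # Reach into the driver JVM (py4j internal API) and use plain JDBC,
    # which supports arbitrary statements, unlike the DataFrame writer.
    driver_manager = spark._sc._gateway.jvm.java.sql.DriverManager
    conn = driver_manager.getConnection(jdbc_url)
    try:
        # executeUpdate returns the number of affected rows
        return conn.createStatement().executeUpdate(sql)
    finally:
        conn.close()
```

This runs entirely on the driver node, which seems fine for a single-row log update.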
Looking at the documentation, the `query` parameter only seems to support SELECT statements:

https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html

The target database is an Azure SQL Database.
Just to add: this is a tiny operation, so performance is a non-issue.
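Since performance doesn't matter, I'd also be happy with a plain ODBC round trip from the notebook. A sketch assuming `pyodbc` and a Microsoft ODBC driver are available on the cluster (`conn_str` is a placeholder for my connection string):

```python
def job_update_sql():
    # Same UPDATE, but with a ? parameter marker for the date so the
    # ODBC driver handles quoting/typing instead of string formatting.
    return (
        "UPDATE [internal].[Job] "
        "SET [MaxIngestionDate] = ?, "
        "[DataLakeMetadataRaw] = NULL, "
        "[DataLakeMetadataCurated] = NULL "
        "WHERE [IsRunning] = 1 AND [FinishDateTime] IS NULL"
    )

def update_job_log(conn_str, ingestion_date):
    import pyodbc  # assumes the driver is installed on the cluster
    conn = pyodbc.connect(conn_str)
    try:
        cur = conn.cursor()
        cur.execute(job_update_sql(), ingestion_date)
        conn.commit()
        return cur.rowcount  # rows matched by the WHERE clause
    finally:
        conn.close()
```

Either way, the writer API stays reserved for the append, and the UPDATE goes through a direct connection.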