
I have a small log DataFrame holding metadata about the ETL performed within a given notebook; the notebook is part of a larger ETL pipeline managed in Azure Data Factory.

Unfortunately, it seems that Databricks cannot invoke stored procedures, so I'm manually appending a row with the correct data to my log table.

However, I cannot figure out the correct syntax to update a table given a set of conditions.

The statement I use to append a single row is as follows:

spark_log.write.jdbc(sql_url, 'internal.Job', mode='append')

This works swimmingly; however, as my Data Factory is invoking a stored procedure, I need to work in a query like:

query = f"""UPDATE [internal].[Job] SET [MaxIngestionDate] = '{date}', [DataLakeMetadataRaw] = NULL, [DataLakeMetadataCurated] = NULL WHERE [IsRunning] = 1 AND [FinishDateTime] IS NULL"""
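(For concreteness, here's how I'd assemble that statement in Python; the date value below is just a placeholder, and the question is really about how to execute such a string over JDBC.)

```python
date = "2021-04-13"  # placeholder: in the real notebook this comes from the ETL run

# The UPDATE statement I want to run against the log table
query = (
    "UPDATE [internal].[Job] "
    f"SET [MaxIngestionDate] = '{date}', "
    "[DataLakeMetadataRaw] = NULL, "
    "[DataLakeMetadataCurated] = NULL "
    "WHERE [IsRunning] = 1 AND [FinishDateTime] IS NULL"
)
```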

Is this possible? If so, can someone show me how?

Looking at the documentation, it only seems to mention using SELECT statements with the query parameter:

The target database is an Azure SQL Database.

https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html

Just to add: this is a tiny operation, so performance is a non-issue.

  • To any lost souls wandering here: my workaround was to pass a JSON blob on completion of the notebook in my Data Factory pipeline, which I then parsed out and passed as parameters to my stored procedure, which in turn updated my log tables. Commented Apr 13, 2021 at 12:40
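That workaround, sketched (field names and values are illustrative; `dbutils.notebook.exit` is the Databricks call that hands a string back to the Data Factory notebook activity):

```python
import json

# Illustrative log payload the notebook emits on completion
log_payload = {
    "MaxIngestionDate": "2021-04-13",
    "DataLakeMetadataRaw": None,
    "DataLakeMetadataCurated": None,
}
blob = json.dumps(log_payload)

# In the notebook: dbutils.notebook.exit(blob)
# Data Factory reads this string from the notebook activity's output, and the
# parsed fields become parameters to the Stored Procedure activity:
parsed = json.loads(blob)
```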

1 Answer


You can't do single-record updates over JDBC in Spark with DataFrames; you can only append to, or overwrite, the entire table.

You can do updates using pyodbc, which requires installing the MSSQL ODBC driver (How to install PYODBC in Databricks), or you can use JDBC via JayDeBeApi (https://pypi.org/project/JayDeBeApi/).
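A minimal sketch of the pyodbc route (the connection string, driver name, and helper function are illustrative, not from the question; the `?` placeholder lets the driver bind the date value instead of interpolating it into the SQL):

```python
# Sketch only: assumes the MSSQL ODBC driver is installed on the cluster and
# that server/database/credentials are supplied elsewhere (e.g. from secrets).
UPDATE_SQL = (
    "UPDATE [internal].[Job] "
    "SET [MaxIngestionDate] = ?, "
    "[DataLakeMetadataRaw] = NULL, "
    "[DataLakeMetadataCurated] = NULL "
    "WHERE [IsRunning] = 1 AND [FinishDateTime] IS NULL"
)

def run_update(conn, max_ingestion_date):
    # conn is a pyodbc.Connection, e.g.
    # pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
    #                "SERVER=...;DATABASE=...;UID=...;PWD=...")
    cursor = conn.cursor()
    cursor.execute(UPDATE_SQL, max_ingestion_date)  # ? bound to the date value
    conn.commit()
```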



