
So far, I am able to read a dataframe from Teradata using the Teradata JDBC connector for Spark. The syntax for reading is as follows:

val df = hc.read.format("jdbc").options( Map( "url" -> url, "dbtable" -> "(select * from tableA) as data", "driver" -> "com.teradata.jdbc.TeraDriver" ) ).load()

where hc = hiveContext and url = the JDBC connection URL for Teradata.

I want to save a dataframe to a Teradata table. I tried the above syntax, changing dbtable to an insert statement:

val df = hc.read.format("jdbc").options( Map( "url" -> url, "dbtable" -> "(insert into db.tabA values (1,2,3)) as data", "driver" -> "com.teradata.jdbc.TeraDriver" ) ).load()

But the above statement gave me an error:

Error: Exception in thread "main" java.sql.SQLException: [Teradata Database] [TeraJDBC 15.10.00.22] [Error 3706] [SQLState 42000] Syntax error: expected something between '(' and the 'insert' keyword. 

I want to save a dataframe to Teradata in Spark. What is the best possible way of doing it?

  • The SQL Exception is Teradata complaining about receiving an "(insert..." command (it doesn't want the parenthesis). Try "dbtable" -> "insert into db.tabA values (1,2,3)", but I think there's something else you'll have to check: I'm not a Spark expert, but it looks strange that you have to use a "read" method to "write" into a database. Commented Nov 15, 2016 at 15:49
  • I've found an example (sparkexpert.com/2015/04/17/…). In your example you don't have a Dataframe. You first need to create your dataframe with some data (the "1,2,3" you put into the insert) and then use the "insertIntoJDBC" method. Commented Nov 15, 2016 at 16:02
  • Thanks @Insac . I have found a way to write dataframe to Teradata. I am using ScalikeJDBC for creating JDBC connection to Teradata and writing via its api. Commented Nov 16, 2016 at 6:55
  • Good! Are you going to input your solution as an answer? This way, others with the same issue will be able to solve it, and you might receive comments that can help you improve the solution. Commented Nov 16, 2016 at 7:48
  • @Insac Thanks. Updated the answer as well. :) Commented Nov 16, 2016 at 8:58

2 Answers


AFAIK, "as data" is not correct; the rest looks fine to me. Replace

"dbtable" -> (insert into db.tabA values (1,2,3)) as data, 

with

"dbtable" -> (insert into db.tabA values (1,2,3)) , 

The below should work without any hassle.

val df = hc.read.format("jdbc").options( Map( "url" -> url, "dbtable" -> "(insert into db.tabA values (1,2,3))", "driver" -> "com.teradata.jdbc.TeraDriver" ) ).load()
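Separately from tunnelling an insert through dbtable, Spark also exposes a dedicated JDBC write path on the DataFrame itself, which avoids the read/load workaround entirely. A minimal sketch, assuming df is the dataframe to persist, url is the Teradata JDBC URL from the question, db.tabA is a placeholder table name, and the Teradata driver is on the classpath:

```scala
import java.util.Properties
import org.apache.spark.sql.SaveMode

// Tell Spark which JDBC driver class to use for this connection.
val props = new Properties()
props.setProperty("driver", "com.teradata.jdbc.TeraDriver")

// Append the dataframe's rows to the existing Teradata table.
df.write
  .mode(SaveMode.Append)
  .jdbc(url, "db.tabA", props)
```

SaveMode.Overwrite can be used instead of SaveMode.Append if the table should be replaced rather than extended.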

1 Comment

Please see 3706-Syntax-error-expected-something-between-and-For-derived/td-p/1286. Moreover, your insert seems like an example insert, not a real one; try executing the insert as-is in Teradata. It seems like a Teradata error. In the above link there is a problem with aliasing (see the bottom of the linked page, where it was fixed); check that as well.

I was able to write data into a Teradata table using ScalikeJDBC. I used a batch update for storing the results.

Sample code for inserting batch rows using ScalikeJdbc:

DB localTx { implicit session =>
  val batchParams: Seq[Seq[Any]] = (2001 to 3000).map(i => Seq(i, "name" + i))
  withSQL {
    insert.into(Emp).namedValues(column.id -> sqls.?, column.name -> sqls.?)
  }.batch(batchParams: _*).apply()
}
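For the batch above to run, ScalikeJDBC needs a connection pool registered first. A minimal setup sketch, with a hypothetical host, database, and credentials (the Teradata JDBC driver must be on the classpath):

```scala
import scalikejdbc._

// Load the Teradata driver and register a default connection pool.
// Host, database, user, and password below are placeholders.
Class.forName("com.teradata.jdbc.TeraDriver")
ConnectionPool.singleton("jdbc:teradata://tdhost/DATABASE=db", "user", "password")

// After this, `DB localTx { implicit session => ... }` borrows
// connections from the pool registered above.
```

The Emp object and column in the batch snippet come from a ScalikeJDBC SQLSyntaxSupport mapping for the target table, which would also need to be defined.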

2 Comments

This is an alternative, you mean? Is there no way to insert rows via Spark JDBC?
I found this one quite efficient, so I went forward with it. It may also be possible via Spark JDBC, but currently I am not aware of how.
