
I have values in a DataFrame, and I have created a table structure in Teradata. My requirement is to load the DataFrame into Teradata, but I am getting an error.

I have tried the following code:

    df.write.format("jdbc")
      .option("driver", "com.teradata.jdbc.TeraDriver")
      .option("url", "organization.td.intranet")
      .option("dbtable", s"select * from td_s_zm_brainsdb.emp")
      .option("user", "userid")
      .option("password", "password")
      .mode("append")
      .save()

I got an error :

    java.lang.NullPointerException
      at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:93)
      at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:518)
      at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:215)
      ... 48 elided

I changed the url option to make it look like a JDBC URL and ran the following command:

    df.write.format("jdbc")
      .option("driver", "com.teradata.jdbc.TeraDriver")
      .option("url", "jdbc:teradata//organization.td.intranet,CHARSET=UTF8,TMODE=ANSI,user=G01159039")
      .option("dbtable", s"select * from td_s_zm_brainsdb.emp")
      .option("user", "userid")
      .option("password", "password")
      .mode("append")
      .save()

I am still getting the same error:

    java.lang.NullPointerException
      at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:93)
      at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:518)
      at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:215)
      ... 48 elided

I have included the following jars with the --jars option:

    tdgssconfig-16.10.00.03.jar
    terajdbc4-16.10.00.03.jar
    teradata-connector-1.2.1.jar

Teradata version: 15. Spark version: 2.

3 Answers


Change the JDBC URL and dbtable options to the following:

    .option("url", "jdbc:teradata://organization.td.intranet/Database=td_s_zm_brainsdb")
    .option("dbtable", "emp")

Also note that in Teradata there are no row locks, so the above will take a table lock; in other words, it will not be efficient, and parallel writes from Spark JDBC are not possible.

Teradata's native tools (FastLoad/BTEQ combinations) will work. Another option, which requires a complicated setup, is Teradata Query Grid; it is very fast and uses Presto behind the scenes.
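As a related note, the Teradata JDBC driver itself exposes a FastLoad path: adding TYPE=FASTLOAD to the connection URL asks the driver to use JDBC FastLoad for batched inserts (it generally requires an empty target table). A minimal sketch of such a URL, using the hostname and database from the question as placeholders:

```scala
// Sketch of a Teradata JDBC URL with JDBC FastLoad enabled.
// TYPE=FASTLOAD asks the driver to use FastLoad for batch inserts;
// host and database below are placeholders taken from the question.
val host = "organization.td.intranet"
val db   = "td_s_zm_brainsdb"
val fastloadUrl = s"jdbc:teradata://$host/Database=$db,TYPE=FASTLOAD,CHARSET=UTF8,TMODE=ANSI"
println(fastloadUrl)
```

Passing this string as the url option in the write call would be one way to try FastLoad from Spark; whether it helps depends on the driver version and the state of the target table.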


5 Comments

I am using the DataFrame writer; my requirement is to write to the table, not read from it. How will 'select * from table' make a difference?
I updated it. spark.apache.org/docs/latest/sql-data-sources-jdbc.html describes the syntax of JDBC writes as well.
Thanks for the help. But with this configuration too, I am getting an error: java.lang.NullPointerException at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:93) at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:518) at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:215) ... 48 elided
Is your read working? I want to be sure you are able to connect to Teradata.
No, reading is also not working. That means either I am missing some option in the JDBC parameters, or the jar versions are not in sync with the Teradata and Hadoop versions. I am using Teradata 14 and HDP 2.x.

Below is code for reading data from a Teradata table:

    df = (spark.read.format("jdbc")
      .option("driver", "com.teradata.jdbc.TeraDriver")
      .option("url", "jdbc:teradata://organization.td.intranet/Database=td_s_zm_brainsdb")
      .option("dbtable", "(select * from td_s_zm_brainsdb.emp) AS t")
      .option("user", "userid")
      .option("password", "password")
      .load())

This will create a DataFrame in Spark.

For writing data back to the database, below is the statement:

Saving data to a JDBC source

    jdbcDF.write \
      .format("jdbc") \
      .option("url", "jdbc:teradata://organization.td.intranet/Database=td_s_zm_brainsdb") \
      .option("dbtable", "schema.tablename") \
      .option("user", "username") \
      .option("password", "password") \
      .save()
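One detail the two snippets above illustrate, and worth calling out since the original question passed a bare SELECT as dbtable: for reads, dbtable may be a parenthesized subquery with an alias, because Spark substitutes it into a query it generates itself; for writes, dbtable must be a plain table name. A small sketch of the distinction, using names from the question:

```scala
// For reads, Spark builds its own query around the dbtable value, roughly
// SELECT <cols> FROM <dbtable> [WHERE ...], so a subquery must be
// parenthesized and aliased. For writes, dbtable must be a bare table name.
val readTable  = "(select * from td_s_zm_brainsdb.emp) AS t"  // valid for read
val writeTable = "td_s_zm_brainsdb.emp"                       // valid for write

// Roughly the probe Spark generates when inferring the schema on read:
val schemaProbe = s"SELECT * FROM $readTable WHERE 1=0"
println(schemaProbe)
```

This is why the question's original `.option("dbtable", "select * from td_s_zm_brainsdb.emp")` cannot work for a write: there is no table of that name to insert into.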

1 Comment

I have to write data back to Teradata, not read from it.

The JDBC URL should be of the following form:

val jdbcUrl = s"jdbc:teradata://${jdbcHostname}/database=${jdbcDatabase},user=${jdbcUsername},password=${jdbcPassword}" 

It was causing the exception because I hadn't supplied the username and password.
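Building on that, here is a small helper (hypothetical, not from the original answer) that assembles the URL and fails fast when credentials are missing, since in this case the absent user/password only surfaced later as an opaque NullPointerException:

```scala
// Hypothetical helper: assemble the Teradata JDBC URL, failing fast if
// required pieces are empty rather than letting the driver fail later
// with an unhelpful NullPointerException.
def teradataUrl(host: String, db: String, user: String, password: String): String = {
  require(host.nonEmpty && db.nonEmpty, "host and database are required")
  require(user.nonEmpty && password.nonEmpty, "user and password are required")
  s"jdbc:teradata://$host/database=$db,user=$user,password=$password"
}

// Placeholder values taken from the question:
println(teradataUrl("organization.td.intranet", "td_s_zm_brainsdb", "userid", "password"))
```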

