
I am doing an R&D task where I want to store my RDD in a Hive table. I have written the code in Java: I create the RDD, convert it to a DataFrame, and then store it in a Hive table. But here I am facing two different kinds of errors.

    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf().setAppName("SparkMain");
        JavaSparkContext ctx = new JavaSparkContext(sparkConf);
        HiveContext hiveContext = new HiveContext(ctx.sc());
        hiveContext.setConf("hive.metastore.uris", "thrift://address:port");
        DataFrame df = hiveContext.read().text("/filepath");
        df.write().saveAsTable("catAcctData");
        df.registerTempTable("catAcctData");
        DataFrame sql = hiveContext.sql("select * from catAcctData");
        sql.show();
        ctx.close();
    }

If I execute this program, it works perfectly fine. I can see the table data in the console.

But if I try the code below, it fails with org.apache.spark.sql.AnalysisException: Table not found: java

    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf().setAppName("SparkMain");
        JavaSparkContext ctx = new JavaSparkContext(sparkConf);
        HiveContext hiveContext = new HiveContext(ctx.sc());
        hiveContext.setConf("hive.metastore.uris", "thrift://address:port");
        DataFrame sql = hiveContext.sql("select * from catAcctData");
        sql.show();
        ctx.close();
    }

And if I try to save the table data using SQLContext, it fails with java.lang.RuntimeException: Tables created with SQLContext must be TEMPORARY. Use a HiveContext instead.

    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf().setAppName("SparkMain");
        JavaSparkContext ctx = new JavaSparkContext(sparkConf);
        SQLContext hiveContext = new SQLContext(ctx.sc());
        hiveContext.setConf("hive.metastore.uris", "thrift://address:port");
        DataFrame df = hiveContext.read().text("/filepath");
        df.write().saveAsTable("catAcctData");
        df.registerTempTable("catAcctData");
        DataFrame sql = hiveContext.sql("select * from catAcctData");
        sql.show();
        ctx.close();
    }

I am a bit confused here. Please help me resolve this.

Regards, Pratik

1 Answer

Your problem is that you create the table with one HiveContext and try to read it with a different one. In other words, the HiveContext in the second program doesn't see the "catAcctData" table because you created that table with another HiveContext. Use one HiveContext for both creating and reading tables.

Also, I don't understand why you call df.write().saveAsTable("catAcctData"); before creating the temporary table. If you want a temporary table, you just need df.registerTempTable("catAcctData"); without df.write().saveAsTable("catAcctData");.
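To illustrate, here is a minimal sketch for Spark 1.6 of the simplified flow the answer describes, using the same /filepath input and table name as the question: one HiveContext reads the file, registers it only as a temporary table, and queries it in the same program (no saveAsTable needed when you only query within one application):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.hive.HiveContext;

public class SparkMain {
    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf().setAppName("SparkMain");
        JavaSparkContext ctx = new JavaSparkContext(sparkConf);
        HiveContext hiveContext = new HiveContext(ctx.sc());

        // Read the file and expose it as a temporary table only.
        // registerTempTable is enough for querying within this program;
        // saveAsTable is only needed for a table that should outlive it.
        DataFrame df = hiveContext.read().text("/filepath");
        df.registerTempTable("catAcctData");

        DataFrame sql = hiveContext.sql("select * from catAcctData");
        sql.show();
        ctx.close();
    }
}
```

Note that a temporary table lives only as long as the HiveContext that registered it; to read the data from a second, separate program you would need a persistent table (saveAsTable) backed by a metastore that both programs share.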


4 Comments

So, Yehor Krivokon, how can I read the previously created table? Can you guide me, please?
1) HiveContext is deprecated. Add Hive support using SparkSession.builder().enableHiveSupport(); 2) Create tables using the previously created SQLContext, via SQLContext.getOrCreate(spark.sparkContext()); 3) Get the SparkSession like this: SparkSession spark = SparkSession.builder().enableHiveSupport().getOrCreate();
But I have one constraint: I cannot use Spark 2.0. I only have Spark 1.6 installed on Hadoop, so I cannot use SparkSession.
OK, then you can get the context using the getOrCreate method, not from SparkSession.
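Putting the last two comments together, here is a hedged sketch for Spark 1.6 (where SparkSession does not exist). It assumes that SQLContext.getOrCreate hands back the already-created HiveContext when one was instantiated first for the same SparkContext, so the whole application shares a single context instead of constructing a second one:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.hive.HiveContext;

public class SparkMain {
    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf().setAppName("SparkMain");
        JavaSparkContext ctx = new JavaSparkContext(sparkConf);

        // Create the HiveContext once, up front. A plain SQLContext
        // cannot create persistent (non-TEMPORARY) tables, which is
        // what caused the RuntimeException in the third program.
        HiveContext hiveContext = new HiveContext(ctx.sc());

        // Elsewhere in the same application, reuse the existing context
        // via getOrCreate instead of constructing a new one.
        SQLContext shared = SQLContext.getOrCreate(ctx.sc());

        DataFrame df = shared.read().text("/filepath");
        df.write().saveAsTable("catAcctData"); // persistent table in the metastore

        DataFrame sql = shared.sql("select * from catAcctData");
        sql.show();
        ctx.close();
    }
}
```

For the table to be visible from a second program, both programs must also point at the same Hive metastore; with a default (embedded Derby) metastore, each application gets its own local metastore_db and will not see the other's tables.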
