
As I read the Spark/Hive SQL documentation, it appears that INSERT INTO a table with a column list is not supported in Spark 2.4 and earlier versions.

I have a source table and a destination table with different numbers of columns and different column names, and I need to copy data from one to the other.

Does this mean I have to code this in PySpark, since Spark SQL will not be able to do it?

Example:

input_table(cola, colb, colc, cold, cole)
output_table(fieldx, fieldy, fieldz)

In SQL (assuming an RDBMS such as MS-SQL, PostgreSQL, etc.) I would do the following:

insert into output_table(fieldx, fieldy, fieldz) select cola, colb, colc from input_table 

Spark SQL does not allow this; it does not accept a column list in the INSERT statement.

Question: how can I do this task with a minimum of code and maximum performance, in either PySpark or (ideally) Spark SQL? (I am using Spark 2.4.)

thank you

Is this something that was updated in a later version of Spark (> 2.4), where you can specify columns? Not supporting this basic SQL syntax seems pretty poor... Commented Jan 23, 2020 at 18:38

1 Answer


In the SELECT, specify the output columns that won't be copied from input_table as null. (This is what would happen to those columns anyway if inserting with a column list covering only a subset of the columns were allowed.)

insert into output_table
select
  cola,
  colb,
  colc,
  null as other1  -- specify non-copied column values as null
from input_table
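If you would rather do this in PySpark, a minimal sketch of the same idea follows. It assumes spark is an existing SparkSession and reuses the table and column names from the question; it is an illustration, not code from the original post:

from pyspark.sql import functions as F

# Select only the columns to copy, aliased to the destination names.
# insertInto matches columns by position, so the select order must
# match the column order of output_table.
df = (spark.table("input_table")
      .select(
          F.col("cola").alias("fieldx"),
          F.col("colb").alias("fieldy"),
          F.col("colc").alias("fieldz")))

df.write.insertInto("output_table")

Both routes go through the same Catalyst optimizer, so there is no inherent performance difference between the SQL and DataFrame versions.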

1 Comment

Thanks, this is what I started to do. Will this perform optimally? I need to use some SQL functions/expressions on some of the select columns to transform them, such as uppercasing some values, extracting sub-strings, casting other columns, etc. The data source is huge: many billions of rows.
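For the transformations mentioned in that comment (uppercasing, sub-strings, casts), the expressions can go straight into the same select. A hedged sketch follows; which column gets which treatment, the substring length, and the bigint cast are all illustrative assumptions, not details from the original post:

from pyspark.sql import functions as F

# Illustrative per-column transformations; the pairing of columns to
# transformations is an assumption for the sake of the example.
df = (spark.table("input_table")
      .select(
          F.upper(F.col("cola")).alias("fieldx"),             # uppercase
          F.substring(F.col("colb"), 1, 10).alias("fieldy"),  # sub-string
          F.col("colc").cast("bigint").alias("fieldz")))      # cast

df.write.insertInto("output_table")

These are narrow, per-row transformations, so they add no shuffles; even over billions of rows the copy remains a single scan-and-write pass.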
