
How can I execute lengthy, multiline Hive queries in Spark SQL, like the query below?

val sqlContext = new HiveContext(sc)
val result = sqlContext.sql("select ... from ...")
Please improve your post; nobody wants to see screenshots of code. Commented Nov 24, 2016 at 14:03

6 Answers


Use """ instead, so for example

val results = sqlContext.sql (""" select .... from .... """); 

or, if you want to format code, use:

val results = sqlContext.sql (""" |select .... |from .... """.stripMargin); 

3 Comments

what is that pipe symbol for?
The pipe is the margin character for stripMargin; e.g. see the O'Reilly notes on multiline strings.
@KenJiiii You are right. I didn't reply right away and then forgot; thanks for the clarification.
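To see concretely what stripMargin does, here is a minimal, Spark-free sketch in plain Scala (the column names are made up for illustration):

```scala
// stripMargin removes leading whitespace up to and including the
// margin character (the pipe '|' by default) on every line, so the
// string can be indented freely in the source code.
val query =
  """|select name, age
     |from people
     |where age > 21""".stripMargin

// The pipes and the indentation before them are gone; every line
// now starts at the margin.
println(query)
```

You can also pass a different margin character, e.g. `.stripMargin('#')`, if your SQL itself contains pipes.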

You can use triple-quotes at the start/end of the SQL code or a backslash at the end of each line.

val results = sqlContext.sql (""" create table enta.scd_fullfilled_entitlement as select * from my_table """); results = sqlContext.sql (" \ create table enta.scd_fullfilled_entitlement as \ select * \ from my_table \ ") 

2 Comments

Triple quotes (both double and single) can be used in Python as well. Also backslashes are obsolete.
Thanks, edited. Obsolete? Not exactly, according to the Style Guide: python.org/dev/peps/pep-0008
val query = """(SELECT a.AcctBranchName, c.CustomerNum, c.SourceCustomerId, a.SourceAccountId, a.AccountNum, c.FullName, c.LastName, c.BirthDate, a.Balance, case when [RollOverStatus] = 'Y' then 'Yes' Else 'No' end as RollOverStatus FROM v_Account AS a left join v_Customer AS c ON c.CustomerID = a.CustomerID AND c.Businessdate = a.Businessdate WHERE a.Category = 'Deposit' AND c.Businessdate= '2018-11-28' AND isnull(a.Classification,'N/A') IN ('Contractual Account','Non-Term Deposit','Term Deposit') AND IsActive = 'Yes' ) tmp """ 

Comments


It is worth noting that the length is not the issue, only the way the query is written. For that you can use """ as Gaweda suggested, or simply build the query in a string variable, e.g. with a StringBuilder. For example:

val selectElements = Seq("a", "b", "c")
val builder = StringBuilder.newBuilder
builder.append("select ")
builder.append(selectElements.mkString(","))
builder.append(" where d<10")
val results = sqlContext.sql(builder.toString())

1 Comment

Without val on the lines with append :)

In addition to the above ways, you can concatenate the lines with +:

val results = sqlContext.sql("select .... " + " from .... " + " where .... " + " group by .... "); 

Comments


Write your SQL inside triple quotes, like """ sql code """:

df = spark.sql(f""" select * from table1 """) 

This is the same for Scala Spark and PySpark (the f prefix is Python's f-string; omit it in Scala, or use Scala's s interpolator for variable substitution).
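For the Scala side, the s interpolator plays the role of Python's f-string. A minimal, Spark-free sketch (the table name table1 comes from the answer above; the minAge filter is a made-up value for illustration):

```scala
// The s interpolator substitutes ${...} / $name expressions into a
// triple-quoted string, so a query can be parameterized inline.
val tableName = "table1" // table name from the answer above
val minAge = 21          // hypothetical filter value

val query = s"""
  select *
  from $tableName
  where age > $minAge
"""

// The resulting string can then be passed to spark.sql(query).
println(query)
```

As with Python f-strings, be careful interpolating untrusted input into SQL this way.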

Comments
