
There is a DataFrame as follows:

import spark.implicits._

val df = List(
  ("id1", "blue",  "1"),
  ("id2", "red",   "2"),
  ("id3", "red",   "3"),
  ("id4", "blue",  "3"),
  ("id4", "green", "3")
).toDF("id", "color", "size")

+---+-----+----+
| id|color|size|
+---+-----+----+
|id1| blue|   1|
|id2|  red|   2|
|id3|  red|   3|
|id4| blue|   3|
|id4|green|   3|
+---+-----+----+

There is a Seq[org.apache.spark.sql.Column] that can sort df as follows:

import org.apache.spark.sql.Column

val col = Seq(new Column("size"), new Column("color"))
df.sort(col: _*).show

But I want to sort by col in descending order.

import org.apache.spark.sql.functions.desc

df.sort(desc(col: _*)) does not work, because desc expects a single column name (a String), not a Seq[Column].
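For reference, a minimal sketch of how desc is meant to be used, with explicit column names rather than the col sequence:

// desc(columnName: String) builds a descending sort expression
// for one named column at a time
df.sort(desc("size"), desc("color")).show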

How, then, can I sort df by col in descending order?

  • You can apply asc/desc for sorting on individual columns, e.g. val col = Seq(new Column("size").asc, new Column("color").desc). Commented Jan 9, 2020 at 4:31
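A minimal sketch of that suggestion applied to the example df, assuming one column should sort ascending and the other descending (the name mixed is just illustrative):

// size ascending, then color descending
val mixed = Seq(new Column("size").asc, new Column("color").desc)
df.sort(mixed: _*).show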

1 Answer


You can use col.map(_.desc) to construct the sort expressions with descending ordering:

val col = Seq(new Column("size"), new Column("color"))

// ascending
df.sort(col: _*).show
+---+-----+----+
| id|color|size|
+---+-----+----+
|id1| blue|   1|
|id2|  red|   2|
|id4| blue|   3|
|id4|green|   3|
|id3|  red|   3|
+---+-----+----+

// descending
df.sort(col.map(_.desc): _*).show
+---+-----+----+
| id|color|size|
+---+-----+----+
|id3|  red|   3|
|id4|green|   3|
|id4| blue|   3|
|id2|  red|   2|
|id1| blue|   1|
+---+-----+----+

Here col.map(_.desc) returns a list of expressions:

col.map(_.desc)
// res2: Seq[org.apache.spark.sql.Column] =
//   List(size DESC NULLS LAST, color DESC NULLS LAST)
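As the output above shows, desc defaults to NULLS LAST; if nulls should come first instead, Column also provides desc_nulls_first, which fits the same pattern:

// same sort, but null values (if any) are placed first
df.sort(col.map(_.desc_nulls_first): _*).show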