
I want to calculate the frequency distribution (i.e. return the most common element in each column and the number of times it appears) of a DataFrame using Spark and Scala. I've tried using the DataFrameStatFunctions library, but after I filter my DataFrame down to only the numeric columns, I can't apply any functions from that library. Is the best way to do this to create a UDF?

1 Answer


You can use:

    val newDF = df.groupBy("columnName").count()
    newDF.show()

This will show the frequency count for each unique entry in that column.
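Since the question asks for the most common element and its count in every column, one way to build on this is a minimal sketch like the one below: it applies the same groupBy/count idea to each column in turn and keeps only the top row. The sample data, column names, and local SparkSession setup are assumptions for illustration, not the asker's actual data.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, desc}

    object MostCommonPerColumn {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("frequency-distribution")
          .master("local[*]")
          .getOrCreate()
        import spark.implicits._

        // Hypothetical stand-in for the asker's DataFrame.
        val df = Seq((1, "a"), (2, "a"), (1, "b"), (1, "a")).toDF("num", "str")

        // For each column: group by its values, sort by count descending,
        // and take the single most frequent value with its count.
        df.columns.foreach { c =>
          val top = df.groupBy(col(c)).count().orderBy(desc("count")).head()
          println(s"$c -> most common value = ${top.get(0)}, count = ${top.getLong(1)}")
        }

        spark.stop()
      }
    }

Because only one row per column is pulled back with head(), the per-column aggregation stays on the executors and very little data reaches the driver.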
