
How can I flatten an array into a DataFrame that contains the columns [a, b, c, d, e]?

root
 |-- arry: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- a: string (nullable = true)
 |    |    |-- b: long (nullable = true)
 |    |    |-- c: string (nullable = true)
 |    |    |-- d: string (nullable = true)
 |    |    |-- e: long (nullable = true)

Any help is appreciated.

1 Answer


Say you have a JSON file with the following structure:

{
  "array": [
    { "a": "asdf", "b": 1234, "c": "a", "d": "str", "e": 1234 },
    { "a": "asdf", "b": 1234, "c": "a", "d": "str", "e": 1234 },
    { "a": "asdf", "b": 1234, "c": "a", "d": "str", "e": 1234 }
  ]
}
  1. Read the file
scala> val nested = spark.read.option("multiline", true).json("nested.json")
nested: org.apache.spark.sql.DataFrame = [array: array<struct<a:string,b:bigint,c:string,d:string,e:bigint>>]
  2. Check the schema
scala> nested.printSchema
root
 |-- array: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- a: string (nullable = true)
 |    |    |-- b: long (nullable = true)
 |    |    |-- c: string (nullable = true)
 |    |    |-- d: string (nullable = true)
 |    |    |-- e: long (nullable = true)
  3. Use the explode function
scala> nested.select(explode($"array").as("exploded")).select("exploded.*").show
+----+----+---+---+----+
|   a|   b|  c|  d|   e|
+----+----+---+---+----+
|asdf|1234|  a|str|1234|
|asdf|1234|  a|str|1234|
|asdf|1234|  a|str|1234|
+----+----+---+---+----+
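As a variation on the same idea, Spark SQL also has an inline function that explodes an array of structs and promotes the struct fields to top-level columns in a single step, so the intermediate alias and the "exploded.*" projection are not needed. A minimal sketch, assuming the same nested DataFrame as above:

```scala
// inline(array) = explode the array AND expand each struct's fields
// into top-level columns in one call. Available as a SQL function
// since early Spark versions, so selectExpr works broadly.
val flat = nested.selectExpr("inline(array)")

flat.show()
// Same five columns (a, b, c, d, e) and three rows as the
// explode + select("exploded.*") version above.
```

Note that explode (and inline) drop rows where the array is null or empty; use explode_outer if those rows should be kept with null columns instead.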
