
I have the following JSON file.

{ "reviewerID": "ABC1234", "productID": "ABCDEF", "reviewText": "GOOD!", "rating": 5.0 },
{ "reviewerID": "ABC5678", "productID": "GFMKDS", "reviewText": "Not bad!", "rating": 3.0 }

I want to parse it without Spark SQL, using a JSON parser instead.

The parsed result I want is a text file:

ABC1234::ABCDEF::5.0
ABC5678::GFMKDS::3.0

How can I parse this JSON file using a JSON parser in Spark with Scala?

1 Answer

tl;dr Spark SQL supports JSON in the format of one JSON object per file or one per line. If you'd like to parse multi-line JSON objects that appear together in a single file, you'd have to write your own Spark support, as that is not currently possible.

A possible solution is to ask the "writer" (the process that writes the files) to be nicer and save one JSON object per file; that would make your life much sweeter.

If that does not help, you'd have to use the mapPartitions transformation and do the parsing yourself with a JSON parser of your choice.

val input: RDD[String] = ??? // load your JSONs here
val jsons = input.mapPartitions(records => ???) // use your JSON parser here
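To make the sketch above concrete, here is a minimal, hedged example of what the per-record parsing step could look like. It assumes each element of the RDD holds one whole JSON object in the flat shape shown in the question, and `ReviewParser.parseRecord` is a hypothetical helper that extracts the three fields with a regex rather than a full JSON library (a real parser such as json4s or Jackson would be more robust):

```scala
object ReviewParser {
  // Matches "reviewerID", "productID" or "rating" followed by a
  // (possibly quoted) value; good enough for flat records like the sample.
  private val Field = """"(reviewerID|productID|rating)"\s*:\s*"?([^",}]+)"?""".r

  // Turn one JSON object into "reviewerID::productID::rating",
  // or None if any of the three fields is missing.
  def parseRecord(json: String): Option[String] = {
    val fields = Field.findAllMatchIn(json).map(m => m.group(1) -> m.group(2)).toMap
    for {
      reviewer <- fields.get("reviewerID")
      product  <- fields.get("productID")
      rating   <- fields.get("rating")
    } yield s"$reviewer::$product::$rating"
  }
}
```

In Spark you could then write something like `input.mapPartitions(_.flatMap(ReviewParser.parseRecord)).saveAsTextFile("output")`, again assuming one JSON object per input record.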

2 Comments

How do we make sure that each JSON object ends up in a separate partition? Both JSON objects might appear in the same partition.
Sorry, I don't understand. Can you explain more specifically?
