How to merge 2 JSON files including objects and arrays using jq?

Question

I'm using jq to try to merge 2 JSON files into a single file.

The result is close to what I was looking for, but not just right.

File 1:

{ "series": "Harry Potter Movie Series", "writer": "J.K. Rowling", "movies": [ { "title": "Harry Potter and the Philosopher's Stone", "actors": [ { "names": [ "Emma Watson", "Other actor" ], "other": "Some value" } ] }, { "title": "Harry Potter and the Chamber of Secrets", "actors": [ { "names": [ "Emma Watson" ], "other": "Some value" } ] } ] }

File 2:

{ "series": "Harry Potter Movie Series", "producer": "David Heyman", "movies": [ { "title": "Harry Potter and the Philosopher's Stone", "year": "2001" }, { "title": "Harry Potter and the Chamber of Secrets", "year": "2002" } ] }

Expected result:

{ "series": "Harry Potter Movie Series", "writer": "J.K. Rowling", "movies": [ { "title": "Harry Potter and the Philosopher's Stone", "year": "2001", "actors": [ { "names": [ "Emma Watson", "Other actor" ], "other": "Some value" } ] }, { "title": "Harry Potter and the Chamber of Secrets", "year": "2002", "actors": [ { "names": [ "Emma Watson" ], "other": "Some value" } ] } ], "producer": "David Heyman" }

Best result I've got so far (only arrays with actors are missing):

{ "series": "Harry Potter Movie Series", "writer": "J.K. Rowling", "movies": [ { "title": "Harry Potter and the Philosopher's Stone", "year": "2001" }, { "title": "Harry Potter and the Chamber of Secrets", "year": "2002" } ], "producer": "David Heyman" }

Using one of the commands below:

jq -s '.[0] * .[1]' file1 file2

jq --slurp 'add' file1 file2

jq '. * input' file1 file2

If I switch order of files I either end up losing 'actors' from file1 or 'year' from file2.

How it should work:

the elements in file 2 will be leading and should replace the matching elements in file 1.
the elements in file 1 that don't exist in file 2 (like writer and movies[].actors elements) shouldn't be deleted.
the elements in file 2 that don't exist yet in file 1 will be added (like producer and movies[].year).
a title is unique and should by default not occur more than once, but if it does, remove the duplicates.

I would assume there is a solution to get these movies arrays perfectly merged with jq.

All three commands produce your expected result. Maybe you flipped the order of the files (order matters). If it's the order of the fields within the object bothering you, try adding ` | {series, writer, movies, producer}` to whichever command you prefer (although comparison-wise there is no such thing as an order of fields in an object). You also may want to have a look at this question.
– pmf, Commented Jun 9, 2022 at 12:50
Note that the first and the last of the three commands use * (not +) for a deep merge, while the middle one uses add, which iterates through the array using +, thus it is just a top-level merge. Iteration through the slurped files using * would be jq --slurp 'reduce .[] as $i ({}; . * $i)' file1 file2 (only useful for more than two or a variable number of files, otherwise .[0] * .[1] is just as good).
– pmf, Commented Jun 9, 2022 at 13:27
Thanks for reaching out - I added 'year' to file2 to point out the problem more specifically. If I switch order of files I either lose 'actors' from file1 or 'year' from file2.
– freljord, Commented Jun 9, 2022 at 14:18
You surely do because the latter overwrites the former. If you want arrays (not objects) to be merged, describe the mechanism you envision for such an operation. Should the elements be added up (giving you title twice), should duplicates be removed (what if one file alone already contains duplicates), ...?
– pmf, Commented Jun 9, 2022 at 14:40
The values in file 2 will be leading (except for the writer and movies[].actors elements). All the movie elements in file 2 should replace the matching elements in file 1. If 'year' doesn't exist yet in file 1, it should be added. Titles should be unique and may not occur more than once, but if they do, duplicates should be removed.
– freljord, Commented Jun 9, 2022 at 14:57

pmf · Accepted Answer · 2022-06-11 12:37:14Z

You are looking for a solution that "merges" objects and arrays. For the former you have already found + (or add) for a top-level merge, and * for a recursive merge, but merging arrays (namely the two .movies fields) needs more specification from your end as there is no canonical solution for that.
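To make the difference between + and * concrete, here is a minimal sketch on two tiny made-up objects. The variables a and b and their contents are illustrative only and are not taken from your files; it also assumes a jq version that supports --argjson (1.5 or later).

```shell
# Two tiny inputs (hypothetical data, not the files from the question).
a='{"meta": {"x": 1}, "list": [1, 2]}'
b='{"meta": {"y": 2}, "list": [3]}'

# + is a top-level merge: the right-hand "meta" replaces the left one entirely.
shallow=$(jq -c -n --argjson a "$a" --argjson b "$b" '$a + $b')

# * recurses into nested objects, so "meta" keeps both x and y.
# Arrays are not objects, though: "list" is still replaced by the right-hand array.
deep=$(jq -c -n --argjson a "$a" --argjson b "$b" '$a * $b')

echo "$shallow"   # {"meta":{"y":2},"list":[3]}
echo "$deep"      # {"meta":{"x":1,"y":2},"list":[3]}
```

In both cases the array from the right-hand operand wins as a whole, which is the same effect you see on .movies when you swap the file order.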

In a comment you state

.movies[0] always correspond to the same movie in both files

This enables you to use transpose to align the items from both arrays, and then apply object-merging on each pair of corresponding items. If you want to merge deeper arrays as well (e.g. .movies[].actors or .movies[].actors[].names) you need to extend this approach accordingly. Here's a solution using plain add for the merging of the array items as well as of the other top-level fields:

jq -s 'add + {movies: map(.movies) | transpose | map(add)}' file1 file2

{ "series": "Harry Potter Movie Series", "writer": "J.K. Rowling", "movies": [ { "title": "Harry Potter and the Philosopher's Stone", "actors": [ { "names": [ "Emma Watson", "Other actor" ], "other": "Some value" } ], "year": "2001" }, { "title": "Harry Potter and the Chamber of Secrets", "actors": [ { "names": [ "Emma Watson" ], "other": "Some value" } ], "year": "2002" } ], "producer": "David Heyman" }
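If the positions cannot be relied on and the movies should instead be matched by their title (as your list of requirements suggests), one possible variation is to concatenate both arrays and group them by .title. This is only a sketch under assumptions: titles must match exactly, the sample files below are trimmed-down stand-ins written to a temporary directory, and file 2 lists the movies in reverse order on purpose. It also relies on group_by keeping items with equal titles in input order, so that file 2 wins on conflicts; that is the behaviour I would expect from jq, but it is worth checking on your version.

```shell
# Work in a temporary directory so no existing files are overwritten.
dir=$(mktemp -d)

cat > "$dir/file1" <<'EOF'
{ "series": "Harry Potter Movie Series", "writer": "J.K. Rowling",
  "movies": [
    { "title": "Harry Potter and the Philosopher's Stone",
      "actors": [ { "names": [ "Emma Watson", "Other actor" ] } ] },
    { "title": "Harry Potter and the Chamber of Secrets",
      "actors": [ { "names": [ "Emma Watson" ] } ] }
  ] }
EOF

# Same movies, deliberately listed in a different order than in file1.
cat > "$dir/file2" <<'EOF'
{ "series": "Harry Potter Movie Series", "producer": "David Heyman",
  "movies": [
    { "title": "Harry Potter and the Chamber of Secrets", "year": "2002" },
    { "title": "Harry Potter and the Philosopher's Stone", "year": "2001" }
  ] }
EOF

merged=$(jq -s '
  add + {
    movies: (
      map(.movies) | add      # concatenate the movie arrays, file 1 first
      | group_by(.title)      # one group per title (groups come out sorted by title)
      | map(add)              # top-level merge within each group; later items win
    )
  }
' "$dir/file1" "$dir/file2")

echo "$merged"
```

Two consequences of this design: a title that occurs more than once (within one file or across files) collapses into a single entry, and the resulting .movies array is ordered by title rather than by the order in either file. As with the transpose approach, add only merges the top level of each movie, so nested arrays such as .actors are still replaced as a whole when both files provide them.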

