
In practice, keys have to be unique within a JSON object (see, e.g., "Does JSON syntax allow duplicate keys in an object?"). However, suppose I have a file with the following contents:

{ "a" : "1", "b" : "2", "a" : "3" } 

Is there a simple way of converting the repeated keys to an array? So that the file becomes:

{ "a" : [ {"key": "1"}, {"key": "3"}], "b" : "2" } 

Or something similar that combines the repeated keys into an array (or finds an alternative way to extract the repeated key values).

Here's a solution in Java: Convert JSON object with duplicate keys to JSON array

Is there any way to do it with awk/bash/python?

  • Where does this json come from? Do you have access to it from the server side, or as a string, before it is evaluated? Once a json becomes a JS object I don't think you can do anything (perhaps I am wrong) Commented Apr 30, 2016 at 15:27
  • I generate the json... in principle I could do jq -c '.' and that would output it as a one-line string. Commented Apr 30, 2016 at 15:29
  • Possible duplicate of Convert JSON object with duplicate keys to JSON array Commented Apr 30, 2016 at 15:34
  • See if these answer your question: stackoverflow.com/questions/24416960/… ; stackoverflow.com/questions/17063257/… Commented Apr 30, 2016 at 15:35
  • @Shovalt: thanks for this link, I didn't see it. However, that answer is in java... Commented Apr 30, 2016 at 15:53

4 Answers


If your input is really a flat JSON object with primitives as values, this should work:

jq -s --stream 'group_by(.[0]) | map({"key": .[0][0][0], "value": map(.[1])}) | from_entries'
{ "a": [ "1", "3" ], "b": [ "2" ] }

For more complex inputs, that would require actually understanding how --stream is supposed to be used, which is beyond me.
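Since the question also asks about Python, here is a minimal sketch of the same grouping using the standard json module's object_pairs_hook parameter. The helper name group_values is made up for illustration. Like the jq command above, it wraps every value in an array, including the values of keys that occur only once.

```python
import json

def group_values(pairs):
    """object_pairs_hook that maps every key to the list of all its values."""
    grouped = {}
    for key, value in pairs:
        # pairs arrives in document order, duplicates included
        grouped.setdefault(key, []).append(value)
    return grouped

text = '{ "a" : "1", "b" : "2", "a" : "3" }'
result = json.loads(text, object_pairs_hook=group_values)
print(json.dumps(result))  # {"a": ["1", "3"], "b": ["2"]}
```

Keys come out in order of first appearance, whereas group_by sorts them. The hook is called for every object in the document, so nested objects are grouped too.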


2 Comments

Thank you! This works for the example I posted, and almost works on my actual data. The approach groups the results by key name (which makes perfect sense), but my data has some odd cases where a value is available for some keys but not for others, and the order matters. I may not be writing it correctly, but in any case the code you provided will help me solve the actual problem (until I manage to generate proper JSON in the first place).
As I said in the comments to your initial post, it's probably way easier to generate simple JSONs than it is to do jq magic on complex JSONs.

Building on Santiago's answer using -s --stream, the following filter builds up the object one step at a time, thus preserving the order of the keys and of the values for a specific key:

reduce (.[] | select(length==2)) as $kv ({};
  $kv[0][0] as $k
  | $kv[1] as $v
  | (.[$k]|type) as $t
  | if $t == "null" then .[$k] = $v
    elif $t == "array" then .[$k] += [$v]
    else .[$k] = [ .[$k], $v ]
    end)

For the given input, the result is:

{ "a": [ "1", "3" ], "b": "2" } 

To illustrate that the ordering of values for each key is preserved, consider the following input:

{ "c" : "C", "a" : "1", "b" : "2", "a" : "3", "b" : "1" } 

The output produced by the filter above is:

{ "c": "C", "a": [ "1", "3" ], "b": [ "2", "1" ] } 
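For readers who would rather use Python, the same step-by-step rule can be sketched with json.loads and object_pairs_hook. This is an illustration under assumptions, not a drop-in equivalent: merge_duplicates is a hypothetical name, and the "first occurrence" test checks whether the key is present rather than whether its value is null, so an explicit null value is kept instead of overwritten.

```python
import json

def merge_duplicates(pairs):
    """object_pairs_hook: keep single values bare, collect repeats in a list."""
    merged = {}
    for key, value in pairs:
        if key not in merged:
            merged[key] = value                 # first occurrence: bare value
        elif isinstance(merged[key], list):
            merged[key].append(value)           # already a list: append
        else:
            merged[key] = [merged[key], value]  # second occurrence: start a list
    return merged

text = '{ "c" : "C", "a" : "1", "b" : "2", "a" : "3", "b" : "1" }'
result = json.loads(text, object_pairs_hook=merge_duplicates)
print(json.dumps(result))  # {"c": "C", "a": ["1", "3"], "b": ["2", "1"]}
```

As with the jq filter, a value that is itself an array cannot be told apart from an accumulated list, so a later duplicate is appended to it.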


Building on peak's answer, the following filter also works on input that contains multiple objects, on nested objects, and without the slurp option (-s).

This is not an answer to the initial question, but because the jq FAQ links here, it might be useful for some visitors.

File jqmergekeys.txt

def consumestream($arr): # Reads stream elements from stdin until we have enough elements to build one object and returns them as array
  input as $inp
  | if $inp|has(1) then consumestream($arr+[$inp])   # input=keyvalue pair => Add to array and consume more
    elif ($inp[0]|has(1)) then consumestream($arr)   # input=closing subkey => Skip and consume more
    else $arr end;                                   # input=closing root object => return array

def convert2obj($stream): # Converts an object in stream notation into an object, and merges the values of duplicate keys into arrays
  reduce ($stream[]) as $kv ({};                     # This function is based on http://stackoverflow.com/a/36974355/2606757
    $kv[0] as $k
    | $kv[1] as $v
    | (getpath($k)|type) as $t                       # type of existing value under the given key
    | if $t == "null" then setpath($k;$v)            # value not existing => set value
      elif $t == "array" then setpath($k; getpath($k) + [$v] ) # value is already an array => add value to array
      else setpath($k; [getpath($k), $v ])           # single value => put existing and new value into an array
      end);

def mainloop(f): (convert2obj(consumestream([input]))|f),mainloop(f); # Consumes streams forever, converts them into an object and applies the user provided filter
def mergeduplicates(f): try mainloop(f) catch if .=="break" then empty else error end; # Catches the "break" thrown by jq if there's no more input

#---------------- User code below --------------------------

mergeduplicates(.) # merge duplicate keys in input, without any additional filters

#mergeduplicates(select(.layers)|.layers.frame) # merge duplicate keys in input and apply some filter afterwards

Example:

tshark -T ek | jq -nc --stream -f ./jqmergekeys.txt 
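A rough Python counterpart for input that holds several JSON objects one after another, possibly with nested objects, might look like the sketch below. It leans on json.JSONDecoder.raw_decode to read one value at a time. The names merge_duplicates and iter_merged_objects are hypothetical, and the whole input is assumed to fit in memory as a string, so unlike the jq filter this does not process an endless pipe incrementally.

```python
import json

def merge_duplicates(pairs):
    """object_pairs_hook: keep single values bare, collect repeats in a list."""
    merged = {}
    for key, value in pairs:
        if key not in merged:
            merged[key] = value
        elif isinstance(merged[key], list):
            merged[key].append(value)
        else:
            merged[key] = [merged[key], value]
    return merged

def iter_merged_objects(text):
    """Yield each top-level JSON value in text, with duplicate keys merged."""
    decoder = json.JSONDecoder(object_pairs_hook=merge_duplicates)
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it by hand.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

sample = '{"x": {"k": 1, "k": 2}, "y": 0}\n{"k": "a", "k": "b", "k": "c"}\n'
for obj in iter_merged_objects(sample):
    print(json.dumps(obj))
```

Because the hook runs on every object the decoder encounters, duplicate keys inside nested objects are merged as well.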


Here's a simple alternative that generalizes well:

reshape.jq

def augmentpath($path; $value):
  getpath($path) as $v
  | setpath($path; $v + [$value]);

reduce (inputs | select(length==2)) as $pv
  ({}; augmentpath($pv[0]; $pv[1]) )

Usage

jq -n --stream -f reshape.jq input.json

Output

With the given input:

{ "a": [ "1", "3" ], "b": [ "2" ] } 

Postscript

If it's important to avoid arrays of singletons, either the def of augmentpath could be modified, or a postprocessing step could be added.
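As one possible shape for such a postprocessing step, here is a small Python sketch that is applied after the grouped result has been parsed. The function name unwrap_singletons is made up for illustration, and the sample dictionary simply repeats the output shown above.

```python
def unwrap_singletons(node):
    """Replace every one-element list with its sole element, recursing into objects."""
    if isinstance(node, dict):
        return {key: unwrap_singletons(value) for key, value in node.items()}
    if isinstance(node, list) and len(node) == 1:
        return unwrap_singletons(node[0])
    return node  # longer lists and scalars are returned unchanged

grouped = {"a": ["1", "3"], "b": ["2"]}
print(unwrap_singletons(grouped))  # {'a': ['1', '3'], 'b': '2'}
```

Note that this cannot distinguish a singleton produced by the grouping from a genuine one-element array in the source data; both are unwrapped.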
