\$\begingroup\$

Today I spent quite some time writing an answer to a Stack Overflow question. The problem looked quite easy to me, but my solution turned out to be complex.

The question can be found here and this is my answer.

If I understood the question correctly, it comes down to obtaining all values for the keyWeWant keys from a data structure like this:

{
  "agg": {
    "agg1": [
      {
        "keyWeWant": "*-20.0",
        "asdf": 0,
        "asdf": 20,
        "asdf": 14,
        "some_nested_agg": [
          {
            "keyWeWant2": 20,
            "to": 25,
            "doc_count": 4,
            "some_nested_agg2": {
              "count": 7,
              "min": 2,
              "max": 5,
              "keyWeWant3": 2.857142857142857,
              "sum": 20
            }
          },
          {
            "keyWeWant2": 25,
            "to": 30,
            "doc_count": 10,
            "some_nested_agg2": {
              "count": 16,
              "min": 2,
              "max": 10,
              "keyWeWant3": 6.375,
              "sum": 102
            }
          }
        ]
      }
    ]
  }
}

The parsed structure should look like this:

[
  { "keyWeWant": "*-20", "keyWeWant2": 20, "keyWeWant3": 2.857142857142857 },
  { "keyWeWant": "*-20", "keyWeWant2": 25, "keyWeWant3": 6.375 },
  { ... },
  { ... }
]

In the question it is requested that the function looks something like this:

function_name(data_map, {
  "keyWeWant" : ['agg', 'agg1'],
  "keyWeWant2" : ['agg', 'agg1', 'some_nested_agg'],
  "keyWeWant" : ['agg', 'agg1', 'some_nested_agg', 'some_nested_agg2']
})

I took up the challenge of solving it with exactly the data structure provided in the question. This was probably not the best approach, and I don't think my solution is optimal.

Here's my solution (selectively copied from the answer):

I placed this test data in a file called data.json. Then the Cheshire JSON library parses the data into a Clojure data structure:

(use '[cheshire.core :as cheshire])

(def my-data (-> "data.json" slurp cheshire/parse-string))

Next the paths to get are defined as follows:

(def my-data-map
  {"keyWeWant"  ["agg", "agg1"],
   "keyWeWant2" ["agg", "agg1", "some_nested_agg"],
   "keyWeWant3" ["agg", "agg1", "some_nested_agg", "some_nested_agg2"]})

It is the data_map from the question with the ":" separators removed, single quotes changed to double quotes, and the last "keyWeWant" changed to "keyWeWant3".

find-nested below has the semantics of Clojure's get-in, except that it works on maps containing vectors and returns all values instead of one. Given a search vector, find-nested finds all values in a nested map in which some values may be a vector of maps. Every map in such a vector is checked.

(defn find-nested
  "Finds all values in a coll consisting of maps and vectors.

  All values are returned in a tree structure:
  i.e, in your problem it returns (20 25) if you call it with
  (find-nested ['agg', 'agg1', 'some_nested_agg', 'keyWeWant2']
  my-data).

  Returns nil if not found."
  [ks c]
  (let [k (first ks)]
    (cond (nil? k)    c
          (map? c)    (find-nested (rest ks) (get c k))
          (vector? c) (if-let [e (-> c first (get k))]
                        (if (string? e) e ; do not map over chars in str
                            (map (partial find-nested (rest ks)) e))
                        (find-nested ks (into [] (rest c)))) ; create vec again
          :else       nil)))

find-nested finds the values for a search path:

(find-nested ["agg", "agg1", "some_nested_agg", "keyWeWant2"] my-data) ; => (20 25) 
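For comparison, here is a minimal sketch of why plain get-in is not enough for this data. The map tiny is a hypothetical miniature of the structure above, introduced only for this illustration; it uses nothing beyond clojure.core.

```clojure
;; Hypothetical miniature of the data, just for this comparison.
(def tiny {"agg" {"agg1" [{"keyWeWant" "*-20.0"}]}})

;; A string key cannot index into the vector, so the lookup stops there.
(get-in tiny ["agg" "agg1" "keyWeWant"])   ; => nil

;; With an explicit index it works, but only for that one element.
(get-in tiny ["agg" "agg1" 0 "keyWeWant"]) ; => "*-20.0"
```

This is the gap find-nested tries to fill: it descends into every map of a vector without the caller having to spell out indices.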

If all the paths towards the keyWeWant keys are mapped over my-data, these are the slices of a tree:

("*-20.0"
 (20 25)
 (2.857142857142857 6.375))

The requested structure (all end results, each with the values on the path leading to it) can be obtained from this tree in function-name like this:

(defn function-name
  "Transforms data d by finding (nested keys) via data-map m in d and flattening the structure."
  [d m]
  (let [tree               (map #(find-nested (conj (second %) (first %)) d) m)
        leaves             (last tree)
        leaf-indices       (range (count leaves))
        results            (for [index leaf-indices]
                             (map (fn [slice]
                                    (if (string? slice)
                                      slice
                                      (loop [node (nth slice index)]
                                        (if node
                                          node
                                          (recur (nth slice (dec index)))))))
                                  tree))
        results-with-paths (mapv #(zipmap (keys m) %) results)
        json               (cheshire/encode results-with-paths)]
    json))
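To make the row-building step easier to follow in isolation, here is a simplified sketch with the tree slices hard-coded. The names slices, ks and row-at are hypothetical and exist only for this illustration, and the sketch leaves out the step-back loop, so it only covers the case where every non-string slice has the same length.

```clojure
;; The three slices of the tree, hard-coded for illustration.
(def slices ["*-20.0" '(20 25) '(2.857142857142857 6.375)])

;; The keys, in the same order as the slices.
(def ks ["keyWeWant" "keyWeWant2" "keyWeWant3"])

(defn row-at
  "Builds one result row: a string slice is repeated as-is,
  any other slice contributes its element at index."
  [index]
  (zipmap ks (map #(if (string? %) % (nth % index)) slices)))

;; One row per leaf in the last (deepest) slice.
(mapv row-at (range (count (last slices))))
;; => [{"keyWeWant" "*-20.0", "keyWeWant2" 20, "keyWeWant3" 2.857142857142857}
;;     {"keyWeWant" "*-20.0", "keyWeWant2" 25, "keyWeWant3" 6.375}]
```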

results uses a loop to step back if a leaf index is larger than that particular slice. I think it will also work for more deeply nested structures (it should work as long as each slice is either the same size as the previous slice or double its size), but I have not tested it.
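Since the step-back case is untested, a small check of how nth behaves when the index runs past the end of a slice may be useful. This is a sketch of core Clojure behaviour only; slice and out-of-range are hypothetical names for this illustration and not part of the answer's code.

```clojure
(def slice '(20 25))

(nth slice 1)     ; => 25
(nth slice 2 nil) ; => nil, because a not-found value is supplied

;; Without a not-found value, nth throws instead of returning nil.
(def out-of-range
  (try
    (nth slice 2)
    (catch IndexOutOfBoundsException _ :out-of-bounds)))

out-of-range      ; => :out-of-bounds
```

So in the two-argument form used above, an index past the end of a slice would raise an exception rather than yield nil, which suggests the step-back branch would need the three-argument form of nth before it could trigger.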

Calling (function-name my-data my-data-map) returns a JSON string in the requested format:

[{
     "keyWeWant": "*-20.0",
     "keyWeWant2": 20,
     "keyWeWant3": 2.857142857142857 },
 {
     "keyWeWant": "*-20.0",
     "keyWeWant2": 25,
     "keyWeWant3": 6.375 }]

To improve my Clojure code (e.g., succinctness) I would love to be pointed towards better solutions.

\$\endgroup\$
