
How can I reference an associative array and reassign its content from a serialized string?

(The eval requirement is now spelled out explicitly in the title, but this is natural when it comes to deserialization.)

Important: the serialized string comes as-is from another source (assume, e.g., that it is read from a file). The code below demonstrates how it was created, but the variable aa_original is NOT available in my actual code.

As it was once created with ${aa[@]@K} parameter expansion, it can contain $'...' constructs:

${parameter@K} Produces a possibly-quoted version of the value of parameter, except that it prints the values of indexed and associative arrays as a sequence of quoted key-value pairs (see Arrays).
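To make the quoted description concrete, here is a minimal sketch of what the transformation produces and how the string is normally turned back into an array. The names sample, serialized and restored are illustrative only, and the script assumes bash 5.1 or later (the release that introduced both @K and key-value pair assignment). It uses eval purely to show the round trip.

```shell
#!/bin/bash
# Assumption: bash 5.1+; variable names are made up for this illustration.
declare -A sample=(["only key"]=$'line1\nline2')

# Serialize: one string holding quoted key-value pairs.
serialized=$(printf '%s' "${sample[@]@K}")
printf 'serialized: %s\n' "$serialized"

# Deserialize with eval, which re-parses the quoting (including $'...').
declare -A restored
eval "restored=( $serialized )"
printf 'restored value: %q\n' "${restored[only key]}"
```

The exact quoting style in the serialized string is chosen by bash, which is why the round trip has to go through the shell parser.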

    #!/bin/bash

    func() {
        declare -n aa=$1

        # `aa=()` can be used to blank out the referenced array
        # before populating it, but skipping it here
        # to see when reference is lost

        # of the below, only `serialized` string is available in actual code
        declare -A aa_original=("a" "3" "b" "4")
        serialized=$(printf '%s' "${aa_original[@]@K}")

        aa="( $serialized )"

        # below only for debugging purposes
        echo "~ at the end of func() now"
        echo "~ serialized: $serialized"
        echo "~ aa_original:"
        for key in "${!aa_original[@]}"; do printf "%s->%s " $key ${aa_original[$key]}; done
        echo
        echo "~ aa:"
        for key in "${!aa[@]}"; do printf "%s->%s " $key ${aa[$key]}; done
        echo
        echo "~ leaving func() now"
    }

    declare -A AA=("a" "1" "b" "2")
    func AA
    echo "${AA["a"]} ~ ${AA["b"]}"

The above was expected to display 3 ~ 4 at the end, but instead gives 1 ~ 2.

The reason is that the global AA gets mangled:

    ~ at the end of func() now
    ~ serialized: b "4" a "3"
    ~ aa_original:
    b->4 a->3
    ~ aa:
    0->( b->"4" a->"3" )-> b->2 a->1
    ~ leaving func() now

NOTE: As I have since discovered thanks to @pjh, this behavior is to be expected.
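The mangling can be reproduced without any function or nameref. The sketch below (with an illustrative array named demo) shows that a quoted right-hand side is a plain string assignment, and that assigning a string to an array name without a subscript stores it under subscript 0.

```shell
#!/bin/bash
# Illustrative only: demo is not part of the original script.
declare -A demo=(["a"]=1 ["b"]=2)

# Quoted, so this is a scalar assignment, not a compound array assignment.
demo="( a 3 b 4 )"

printf 'key 0: %s\n' "${demo[0]}"    # the whole string landed under key "0"
printf 'a=%s b=%s\n' "${demo[a]}" "${demo[b]}"   # old entries are untouched
printf 'count: %s\n' "${#demo[@]}"
```

This matches the debug output above: the old keys survive and a new key 0 holds the unparsed string.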

So I suppose I am left with using declare -A explicitly:

    declare -A aa="( $serialized )"

The assignment works, but the reference to the caller's array is clearly lost:

    ~ at the end of func() now
    ~ serialized: b "4" a "3"
    ~ aa_original:
    b->4 a->3
    ~ aa:
    b->4 a->3
    ~ leaving func() now
    1 ~ 2

Why does this happen?


Since I posted my question, I learned (thanks again, @pjh) that using eval on the assignment (which also works) has the same side-effect as using declare (see my own answer below).

So I am still left with the original question: how can I process the serialized output of ${aa[@]@K} without the effects of eval?

I know I can use a different serialization method, but I wonder about this specific output as-is.


For the record, this works:

    #!/bin/bash

    func() {
        declare -n aa=$1

        serialized=$(declare -A aa_original=("a" "3" "b" "4"); printf '%s' "${aa_original[@]@K}")

        # create a local interim associative array
        declare -A aa_local="( $serialized )"

        # blank out and refill the referenced array
        aa=()
        for key in "${!aa_local[@]}"; do aa["$key"]="${aa_local[$key]}"; done
    }

    declare -A AA=("a" "1" "b" "2")
    func AA
    echo "${AA["a"]} ~ ${AA["b"]}"

As I have learned from @pjh, it is as "bad" as using eval, so without the extra step this is equivalent to:

    #!/bin/bash

    func() {
        declare -n aa=$1

        serialized=$(declare -A aa_original=("a" "3" "b" "4"); printf '%s' "${aa_original[@]@K}")

        aa=()
        eval aa="( $serialized )"
    }

    declare -A AA=("a" "1" "b" "2")
    func AA
    echo "${AA["a"]} ~ ${AA["b"]}"

This is as simple as it gets already (without double assignment), but unfortunately it performs an eval. It does, however, handle the output of ${aa_original[@]@K} correctly, including $'...'.

I had hoped this would be possible without eval, and I did not realize that declare -A already had the same effect, but it does not appear to be possible.
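The reason eval is a concern with a string from another source can be shown with a small sketch. The input named hostile is a made-up example and demo is an illustrative array; the script assumes bash 5.1 or later for key-value pair assignment.

```shell
#!/bin/bash
# Assumption: bash 5.1+. The "hostile" string is hypothetical input that
# hides a command substitution inside what looks like a quoted value.
hostile='k "$(echo injected)"'

declare -A demo
eval demo="( $hostile )"

# The command substitution ran during eval; its output became the value.
printf '%s\n' "${demo[k]}"   # prints: injected
```

With a harmless echo this only changes the stored value, but any command could stand in its place, so this form is only appropriate when the serialized string is trusted.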


2 Answers


UPDATED based on OP's latest comments ...

Assumptions:

  • we start with an associative array AA[]
  • we receive a 'serialized' string containing new index/value pairs in the format k1 v1 k2 v2 where k# and v# could be wrapped in double quotes
  • we are to repopulate AA[] with the contents of the 'serialized' string => AA[k1]=v1 AA[k2]=v2

NOTE: we'll make use of the fact that xargs treats the contents between a pair of double quotes as a single field
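As a quick illustration of that splitting behavior, the sketch below feeds a sample string (made up for this example) through xargs -n1 and collects the resulting lines in an array named fields. Note that xargs understands plain single and double quotes but has no notion of bash's $'...' quoting.

```shell
#!/bin/bash
# Illustrative input: quoted and unquoted items mixed.
sample='"a to z" "22" c 5'

# xargs -n1 runs its default command (echo) once per parsed item,
# so each item ends up on its own line with the quotes removed.
mapfile -t fields < <(printf '%s\n' "$sample" | xargs -n1)

printf '<%s>\n' "${fields[@]}"
```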

One approach:

    func() {
        declare -n aa="$1"                   # named ref for array to be repopulated
        local ss="$2"                        # serialized string

        aa=()                                # wipe array

        while read -r key                    # read current line of input
        do
            read -r val                      # read next line of input
            aa[$key]="$val"                  # repopulate array
        done < <(echo "${ss}" | xargs -n1)   # let xargs split ss into one item per line
    }

Utility function to display contents of an array:

    print_arr() {
        declare -n arr="$1"
        local arr_name="$1"

        while read -r key
        do
            echo "${arr_name}[$key] = ${arr[$key]}"
        done < <(printf "%s\n" "${!arr[@]}" | sort)
    }

Taking for a test drive:

    declare -A AA

    for sstring in '"a" "3" "b" "4"' '"a to z" "22" "1 thru 3" "33"' 'c 5 d "1 2 3"'
    do
        printf "\n############### serialized string: '%s'\n" "${sstring}"
        printf "\n####### before\n"
        AA=("a" "1" "b" "2")
        print_arr AA

        printf "####### after\n"
        func AA "${sstring}"
        print_arr AA
    done

This generates:

    ############### serialized string: '"a" "3" "b" "4"'

    ####### before
    AA[a] = 1
    AA[b] = 2
    ####### after
    AA[a] = 3
    AA[b] = 4

    ############### serialized string: '"a to z" "22" "1 thru 3" "33"'

    ####### before
    AA[a] = 1
    AA[b] = 2
    ####### after
    AA[1 thru 3] = 33
    AA[a to z] = 22

    ############### serialized string: 'c 5 d "1 2 3"'

    ####### before
    AA[a] = 1
    AA[b] = 2
    ####### after
    AA[c] = 5
    AA[d] = 1 2 3

7 Comments

if you don't have aa_local then how are you getting the (serialized) string into the function? or is that also part of your question? as for the updating vs wiping/rebuilding AA[], consider updating the question with a couple examples of AA[] before and after a couple different sets of (serialized) strings; the answer can probably be updated but the question needs a bit of clarification on where the (serialized) string is coming from and how to handle mismatches in index sets
I now edited my question, it was mentioned at the bottom, but anyhow rearranged and I now blank out the array first (so no worries about that part). The serialized string is shown in output ~ serialized: - it's just key value pairs k v k v ... that one would normally use to assign to an associative array. What @K does is also takes care of escaping, so if need be, it can end up as $'...'. I am mostly after why my assignment aa="( $serialized )" does not work properly.
@AlbertCamu updated with what I think you're looking for
what you're describing is a chameleon question ... changing/adding-to requirements after receiving answer(s) ... and these are frowned upon in SO; you've now thrown in the point that your 'serialized' string could contain embedded linefeeds; at this point I'd suggest asking a new question where you provide several samples of your raw inputs, a textual description of what you need to do, and the expected results ... making sure your sample inputs are a true representation of your actual data
placing additional requirements in comments is also frowned upon since not everyone is going to read through the comments to come up with a complete picture of the requirements; the question should contain all of these details, to include clear, precise examples of a realistic data set

The answer from @markp-fuso cannot account for possibly escaped output of ${aa[@]@K}, which takes the form $'...', as it is virtually impossible to handle that reliably without eval. With avoiding eval as a hard requirement, the only other sensible way to actually support escaping is to implement a custom serialization that uses non-space characters as delimiters, like so:

    ser_aa() {
        declare -n aa_ref="$1"
        for key in "${!aa_ref[@]}"; do
            printf "%q\31%q\30" "$key" "${aa_ref[$key]}"
        done
    }

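The intent is to use the ASCII "unit separator" (decimal 31) between key and value and the "record separator" (decimal 30) after each pair. Note that printf and $'...' read \31 and \30 as octal escapes, i.e. decimal 25 and 24; the scheme still works because serializer and deserializer agree on the bytes, but the real separator characters would be written \037 and \036. The deserialization then can be done without xargs: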

    func() {
        declare -n aa_ref="$1"
        declare sstring="$2"
        declare key value
        while IFS=$'\31' read -r -d $'\30' key value; do
            aa_ref+=(["$key"]="$value")
        done <<< "$sstring"
    }
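One detail worth checking in the two functions above is which bytes the escapes really produce, since backslash-digit sequences in printf and $'...' are octal. A small sketch using od (the variable names are illustrative):

```shell
#!/bin/bash
# Print the decimal value of the byte each escape produces.
octal_31=$(printf '\31' | od -An -tu1 | tr -d ' \n')    # as written in ser_aa
octal_037=$(printf '\037' | od -An -tu1 | tr -d ' \n')  # ASCII unit separator
octal_036=$(printf '\036' | od -An -tu1 | tr -d ' \n')  # ASCII record separator

printf '%s %s %s\n' "$octal_31" "$octal_037" "$octal_036"   # prints: 25 31 30
```

Either pair of bytes works as a delimiter as long as both sides use the same one, because %q escapes any such control character that occurs in the data itself.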

The string sstring would then need to have been created with ser_aa, e.g.:

    sstring=$(ser_aa aa_original)

Input that needs escaping is well supported:

    declare -A aa_original=(["a"]=3 ["b"]=$'4\n$hello\n#hey')

And will end up correctly stored as:

    AA[b] = $'4\n$hello\n#hey'

This approach avoids eval for safety, but one then needs control over how the output was produced.
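When the producing side is under your control, one possible variation is to skip %q and write the raw bytes, so that values come back unescaped rather than in their $'...' form. The following is only a sketch under a stated assumption: no key or value contains the bytes 0x1E (record separator) or 0x1F (unit separator). The helper names ser_raw and deser_raw and the array names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers, not from the answers above.
# Assumption: keys and values never contain the bytes 0x1E or 0x1F.

ser_raw() {
    declare -n src_ref="$1"
    local key
    for key in "${!src_ref[@]}"; do
        # key, unit separator, value, record separator
        printf '%s\037%s\036' "$key" "${src_ref[$key]}"
    done
}

deser_raw() {
    declare -n dst_ref="$1"
    local key value
    dst_ref=()   # blank out the referenced array first
    # IFS splits key from value; -d ends each record at the record separator
    while IFS=$'\037' read -r -d $'\036' key value; do
        dst_ref["$key"]=$value
    done < <(printf '%s' "$2")
}

declare -A source_map=(["a"]=3 ["b"]=$'4\n$hello\n#hey')
declare -A target_map=(["a"]=1 ["b"]=2)

packed=$(ser_raw source_map)
deser_raw target_map "$packed"
declare -p target_map
```

Nothing in the data is ever re-parsed by the shell here, so a value such as $hello stays literal text.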

1 Comment

With declare -A aa_original=("a" "3" "b" "4+$hello+here") the $here is expanded immediately. To prevent expansion use declare -A aa_original=("a" "3" "b" '4+$hello+here'). The literal string $here will then be unchanged when setting the values of serialized and aa. (I've replaced newlines with + characters for this comment, but that makes no difference to the fundamental point.)
