I need to find the first instance of a string in a YAML file and insert it, along with a couple of new lines, after a specific line. Specifically, I'm trying to find the first instance of 'rtr-*' and copy it, with a couple of additional lines, into a specific section of these YAML files. I have about 200 files to edit.

Current State:

    abc_123:
      hosts:
        rtr-abc-123.us.dom:
        sw-abc-123-01.us.dom:
        sw-abc-123-02.us.dom:
      vars:
        snmp_location: "123 Main Street, Nowhere, IL 12345"

    terminals:
      children:
        abc_123:

Desired State:

    abc_123:
      hosts:
        rtr-abc-123.us.dom:
        sw-abc-123-01.us.dom:
        sw-abc-123-02.us.dom:
      vars:
        snmp_location: "123 Main Street, Nowhere, IL 12345"

    routers:
      hosts:
        rtr-abc-123.us.dom:

    terminals:
      children:
        abc_123:

Something like... ?:

    sed '/(rtr-)/, {/snmp_location/a
    \n
    routers:
    hosts:
    \1
    \n }' ./*.yml

I've tried several variations all ending in errors or no change. Been a long time since I've used sed. Any help is more than appreciated.

  • Please edit your question and tell us your operating system or, at least, your sed implementation. There are many, and they behave differently. Is sed an absolute requirement? It isn't the best choice for multi-line stuff. Also, you say you need to insert at a specific line. What line? After the first occurrence of 123? Of snmp_location:? Of vars:? Can we have multiple values under vars:? Commented Nov 12 at 17:19

3 Answers

To store things, sed has a hold space, with the h/H commands to copy/append the pattern space to the hold space, g/G to get what's held and replace/append it to the pattern space, and x to exchange the hold and pattern spaces.

    sed -E '
      /^ {4}rtr-/h
      /snmp_location/ {
        s/$/\
    \
    routers:\
      hosts:/
        G
      }'

Would hold any line that starts with 4 spaces followed by rtr-, and for any line that contains snmp_location, would append those 3 extra literal lines (by substituting the end of it ($) with them), plus the contents of the hold space.
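To see the hold-space mechanics in isolation, here is a toy sketch on made-up three-line input (the `keep me` and `target` markers and the function name `hold_demo` are hypothetical, not from the question). The `keep` line is copied into the hold space with h, and G appends it after the line matching `target`.

```shell
# Toy input: 'keep me' is held, then appended after the line matching 'target'.
hold_demo() {
  printf '%s\n' 'keep me' 'other' 'target' |
    sed -e '/^keep/h' -e '/target/G'
}
hold_demo
```

This prints the three input lines followed by a second copy of `keep me`, which is the same mechanism the command above uses to re-emit the rtr- line after snmp_location.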

In a less antiquated language, the equivalent would look like:

    perl -pe '
      $hold = $_ if /^ {4}rtr-/;
      $_ .= "\nrouters:\n  hosts:\n$hold" if /snmp_location/'

Or to avoid adding that routers section if rtr- was not found:

    perl -pe '
      $hold = $_ if /^ {4}rtr-/;
      $_ .= "\nrouters:\n  hosts:\n$hold" if /snmp_location/ && defined $hold'

In sed:

    sed -E '
      /^ {4}rtr-/h
      /snmp_location/ {
        G
        s/(\n)(.)/\1\1routers:\1  hosts:&/; t
        s/\n//
      }'

Here we first append the hold space to the pattern space for lines that contain snmp_location, and insert the routers section before what was appended only if it is non-empty (that is, if the . matches). Also note the t ("then") command, which branches to the end of the script if the substitution succeeded, so the s/\n// that removes the stray newline only runs when nothing was held.

Note that if there are several occurrences of rtr- before the snmp_location line, all of these will use the last one, not the first. In perl you can change $hold = to $hold //= to skip the assignment if $hold is already defined. In sed, you could change it to:

    sed -E '
      /^ {4}rtr-/ {
        x
        /./ {
          # already found
          x
          b after_g
        }
        g
        : after_g
      }
      /snmp_location/ {
        G
        s/(\n)(.)/\1\1routers:\1  hosts:&/; t
        s/\n//
      }'

Or, assuming the first occurrence of rtr- won't be on the first line (with GNU sed, you could replace 1 with 0 to cover for that case):

    sed -E '
      1, /^ {4}rtr-/ {
        /^ {4}rtr-/ h
      }
      /snmp_location/ {
        G
        s/(\n)(.)/\1\1routers:\1  hosts:&/; t
        s/\n//
      }'
  • These both work as well when 'rtr-xyz' is present. When a file does not contain 'rtr-xyz', both solutions still insert the 'routers:' and 'hosts:' lines. I've tried 'undef $hold' at the end, but then nothing gets matched. Commented Nov 12 at 20:31
  • @BrianK. see edit. Commented Nov 13 at 6:01

sed processes the input line by line. You can use the hold space and pattern space to hold additional data, but it's far from readable.

Perl seems to be a better tool for the job:

    perl -pe '
        $r = $1 if ! defined $r && /\s(rtr-[^:]+)/;
        $vars = 1 if /^\s*vars:/;
        $vars = 2, print "\nrouters:\n  hosts:\n    $r:\n"
            if defined $r && 1 == $vars && /^$/;
        undef $r, undef $vars if eof;
    ' -- file1 file2 ...

  • -p processes the input line by line, printing each line after processing;
  • $r is used to store the first instance of rtr-* (we only populate it if it hasn't been set yet);
  • $vars is set to 1 once we start processing the vars section;
  • The additional lines are printed when $vars is 1 (i.e. the vars section has started) and we process an empty line;
  • Once the section is printed, $vars is set to 2 which means no further sections will be printed;
  • The variables are reset when we reach the end of a file.

But even better would be to use a library that can read YAML files.

    perl -MYAML::XS=LoadFile,Dump -e '
        $y = LoadFile(shift);
        undef $y->{routers}{hosts}{ (grep /^rtr-/, keys %{ $y->{abc_123}{hosts} })[0] };
        print Dump($y);
    ' "$file"

It's not exactly correct as it doesn't find the first occurrence of rtr-*: they are keys, so they have no defined order.

  • I'm not married to sed, if perl / python whatever seem to be a more viable solution that's fine. @choroba, this is very close. I got way too many extra sections added though. Inserted the output 4 times instead of just once. Commented Nov 12 at 17:52
  • So your example input doesn't represent the actual input. Maybe there are several empty lines after vars:? You can try clearing the $vars flag after outputting the section (see update). Commented Nov 12 at 17:55
  • Actually, the example is an actual representation. I have 200+ files to edit. Each has the same format, listing the routers / switches at a site (Ansible inventory). Sites have an alphanumeric code: abc-123, bcd-234, cdf-345. So rtr-abc-123, rtr-bcd-234, etc. Not every site has a router any longer. For those that do, I'd like to pull the router name and drop it in its own group, "routers:", just as shown in the 'Desired' state. Does that make sense? Commented Nov 12 at 18:27
  • @BrianK.: Does the updated script work? Also, I tried to explain it, so you might be able to fix it yourself. Commented Nov 12 at 19:21
  • it does, except for files that do not have a router it is setting one from the first file that did. I think '$r' needs to be reset. But yes, very much closer! Commented Nov 12 at 20:00

For anything more than a simple s/old/new/ on a single line, an awk script will almost always be some combination of clearer, simpler, more robust, more maintainable, more portable, etc. than a sed script. For example, using any awk with the sample input from the question:

    $ awk '
        { print }
        /rtr-/ { rtr="\nrouters:\n  hosts:\n" $0 ORS }
        /snmp_location/ { printf "%s", rtr }
    ' file
    abc_123:
      hosts:
        rtr-abc-123.us.dom:
        sw-abc-123-01.us.dom:
        sw-abc-123-02.us.dom:
      vars:
        snmp_location: "123 Main Street, Nowhere, IL 12345"

    routers:
      hosts:
        rtr-abc-123.us.dom:

    terminals:
      children:
        abc_123:

Note that that won't insert anything if rtr isn't present in the input:

    $ cat file2
    abc_123:
      hosts:
        sw-abc-123-01.us.dom:
        sw-abc-123-02.us.dom:
      vars:
        snmp_location: "123 Main Street, Nowhere, IL 12345"

    terminals:
      children:
        abc_123:

    $ awk '{print} /rtr-/{rtr="\nrouters:\n  hosts:\n" $0 ORS} /snmp_location/{printf "%s", rtr}' file2
    abc_123:
      hosts:
        sw-abc-123-01.us.dom:
        sw-abc-123-02.us.dom:
      vars:
        snmp_location: "123 Main Street, Nowhere, IL 12345"

    terminals:
      children:
        abc_123:

If you need to update the input file with the command output then either use GNU awk for -i inplace:

awk -i inplace '...' file 

or with any awk just redirect the output to a temp file then overwrite the input file with the temp file:

awk '...' file > tmp && mv tmp file 
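Since the question mentions about 200 files, the temp-file approach can be wrapped in a loop. The following is only a minimal sketch, assuming every file follows the sample layout. The function name `add_routers` and the `./*.yml` glob are illustrative, and the awk program is the one above with a `!rtr &&` guard so that only the first rtr- line is kept. It writes back with `cat` rather than `mv` so the original file keeps its permissions.

```shell
# Hypothetical helper: rewrite each file passed as an argument, in place.
add_routers() {
  for f in "$@"; do
    tmp=$(mktemp) || return 1
    # Remember the first rtr- line; emit the routers group after snmp_location.
    awk '
      { print }
      !rtr && /rtr-/  { rtr = "\nrouters:\n  hosts:\n" $0 ORS }
      /snmp_location/ { printf "%s", rtr }
    ' "$f" > "$tmp" && cat "$tmp" > "$f"
    rm -f "$tmp"
  done
}

# Usage (assumed glob): add_routers ./*.yml
```

Each file gets its own awk invocation, so a router found in one file cannot leak into the next. Running it twice on the same file would add a second routers group, so it is worth trying on copies first.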

Change the regexps /rtr-/ and /snmp_location/ however you like if those aren't robust enough; I just copied what you had in the question, assuming it's good enough.

If you really care about only selecting the first occurrence of rtr- from the input, just change /rtr-/ to !rtr && /rtr-/. Any other such changes would be similarly trivial.
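As a quick check of that guard, here is a sketch on made-up input with two rtr- lines (the second hostname, `rtr-abc-123-b.us.dom`, and the function name `first_rtr_demo` are hypothetical). Only the first rtr- line should end up under routers:.

```shell
# Feed a sample with two rtr- lines through the guarded awk program.
first_rtr_demo() {
  printf '%s\n' \
    'abc_123:' \
    '  hosts:' \
    '    rtr-abc-123.us.dom:' \
    '    rtr-abc-123-b.us.dom:' \
    '  vars:' \
    '    snmp_location: "123 Main Street, Nowhere, IL 12345"' |
  awk '
    { print }
    !rtr && /rtr-/  { rtr = "\nrouters:\n  hosts:\n" $0 ORS }
    /snmp_location/ { printf "%s", rtr }
  '
}
first_rtr_demo
```

The guard works because rtr is empty (false) until the first match and non-empty afterwards, so later rtr- lines no longer overwrite it.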
