Assume I have a file in format:

(lots of sections...)
SECTION foo
keyA=valA
...
ENDSECTION
(lots of sections...)

I could extract the section with sed using sed -nE '/^SECTION foo$/,/^ENDSECTION$/p'. My problem arises when the format changes to:

(lots of sections...)
SECTION
keyA=valA
id=foo
...
ENDSECTION
(lots of sections...)

Is there a way to get the section by using a one-liner with the common tools (grep, sed, awk, perl, bash)?

  • Are your sections always separated by blank lines? Commented May 7, 2020 at 10:14
  • Not necessarily. It's not file-specific but more of a general question, because I run into this kind of task quite often and usually hack my way around it. Commented May 7, 2020 at 10:17

1 Answer

It somewhat depends on how "well-formed" your document is. If you are sure that every SECTION has a matching ENDSECTION, then the following awk code should work, although I wouldn't call it a "one-liner" (and I'm not sure that is possible in the first place).

awk -v pat='id=foo' '/^SECTION/{n=f=0; delete buf}
     {buf[++n]=$0; if (index($0,pat)) f=1}
     /^ENDSECTION/ {if (f) for (i=1;i<=n;i++) print buf[i]}' input.txt

  • The search pattern will be whatever you assign pat (but in the current form, it only performs a fixed-string search, not a regexp-based one).
  • When a line starting with SECTION (or whatever the keyword is) is found, a flag f (for "found") and a line counter n are initialized to zero. Also, since the decision whether to print can only be made once the section is complete, we need to buffer the entire section, so we clear a buffer variable buf.
  • For any line (including empty ones in this case), the line counter n is increased and the line is added to the array variable buf. If the pattern is found on the line, the f flag is set to 1.
  • If the ENDSECTION keyword is found at the start of a line and the f flag is 1, the buffer is printed line by line to output the relevant section.
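
If you need a regexp rather than a fixed-string search, the same buffering logic should work with `$0 ~ pat` in place of `index($0,pat)`. The following is only a sketch: the sample file is made up for illustration, and the anchored pattern `^id=foo$` assumes that the id sits alone on its line (it is there to avoid matching something like id=foobar).

```shell
# Hypothetical sample input; section contents are invented for illustration.
input=$(mktemp)
cat > "$input" <<'EOF'
SECTION
keyA=valA
id=bar
ENDSECTION
SECTION
keyA=valB
id=foo
ENDSECTION
EOF

# Same approach as above, but pat is now interpreted as a regular expression.
result=$(awk -v pat='^id=foo$' '
  /^SECTION/    { n = f = 0; delete buf }                 # new section: reset counter, flag, buffer
                { buf[++n] = $0; if ($0 ~ pat) f = 1 }    # buffer every line, set flag on regexp match
  /^ENDSECTION/ { if (f) for (i = 1; i <= n; i++) print buf[i] }
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

With this input, only the second section (the one containing id=foo) is printed.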

If you could guarantee that all sections are separated by empty lines, this answer could point you in the right direction on how to do it with less code, but if that is not guaranteed, the "paragraph mode" approach of awk will not work.
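
For reference, a minimal sketch of that "paragraph mode" idea, assuming every section really is separated from its neighbours by at least one empty line and contains no empty lines itself. The sample file is hypothetical. Setting RS to the empty string makes awk treat each blank-line-separated block as one record, so no manual buffering is needed.

```shell
# Hypothetical input in which sections are separated by a blank line.
input=$(mktemp)
cat > "$input" <<'EOF'
SECTION
keyA=valA
id=bar
ENDSECTION

SECTION
keyA=valB
id=foo
ENDSECTION
EOF

# RS= (empty) enables paragraph mode: one record per blank-line-separated block.
# A pattern without an action prints every record for which it is true.
result=$(awk -v RS= -v pat='id=foo' 'index($0, pat)' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

If several sections can match and you want to keep the empty line between them in the output, adding -v ORS='\n\n' should do that.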

Also note that this suggestion is very much the bare minimum necessary to fulfill the task; if you want to skip empty lines or fulfill other sanity check requirements, the code quickly gets rather long.
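
To give an idea of how the code grows, here is a sketch with two such checks added: empty lines are skipped, and a section that is never closed by ENDSECTION produces a warning on stderr instead of being printed. The variable name inside, the warning texts, and the sample file are my own choices for illustration, and writing to "/dev/stderr" assumes an awk or operating system that supports that path.

```shell
# Hypothetical input: one well-formed section with an empty line,
# followed by a section that is missing its ENDSECTION.
input=$(mktemp)
cat > "$input" <<'EOF'
SECTION
keyA=valA

id=foo
ENDSECTION
SECTION
id=foo
keyB=valB
EOF

result=$(awk -v pat='id=foo' '
  /^SECTION/    { if (inside) print "warning: unterminated section before line " NR > "/dev/stderr"
                  n = f = 0; delete buf; inside = 1 }
  !inside || !NF { next }                                  # ignore lines outside a section and blank lines
                { buf[++n] = $0; if (index($0, pat)) f = 1 }
  /^ENDSECTION/ { if (f) for (i = 1; i <= n; i++) print buf[i]; inside = 0 }
  END           { if (inside) print "warning: unterminated section at end of input" > "/dev/stderr" }
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Here the first section is printed without its empty line, while the second one only triggers the warning. Whether an unterminated section should be dropped or printed anyway is a design decision that depends on your data.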
