I want to extract the relevant data of a traffic junction and its connections from a log file. Example log:

SCN DD1251 At Glasgow Road - Kilbowie Road Modified By ________ Type CR Region WS Subregion UPSTREAM DOWNSTREAM FILTER NODE LINK NODE LINK LINK DD1271 C DD1271 R DD1351 D DD1351 B E Stage Suffix for Offset Optimizer 1 Double Cycle Initially ? N Force Single / Double Cycling status ? N Double Cycle Group 00 Double Cycle Ignore ? N Allow Link Max Saturation N Link Max Sat Override N Stages 1 2 3 4 Fixed N N N Y LRT stage N N N N Skip allowed N N N N Ped stage N N N N Ped invite N N N N Ghost stage N N N N Offset authority pointer 0 Split authority pointer 0 Offset opt emiss weight 000 I/green feedback inhibit N Bus Authority 00 ACIS node 00000 Bus Mode - Central extensions N Local extensions N Recalls N Stage skipping N Stage truncation N Cancels N Bus Priority Selection - Multiple buses N Queue Calculation N Hold recall if faulty N Disable recall N Disable long jtim N Real Cancel N Bus recall recovery type 0 Bus extension recovery type 0 Offset Bus authority pointer 0 Split Bus authority pointer 0 Bus skip recovery 0 Skip importance factor 0 Bus priority status OFF LRT sat 1 000 LRT sat 2 000 LRT sat 3 000 PEDESTRIAN FACILITIES Ped Node N Num Ped Wait Imp Factor 000 Ped Priority 0 Max Ped Priority Freq 00 Ped Lower Sat Threshold 000 Ped Upper Sat Threshold 000 Max Ped Wait Time 000 PEDESTRIAN VARIABLE INVITATION TO CROSS Allow Ped Invite N Ped Priority Auto 000 Ped Invite Upper Sat 000 Prio Level 1 2 3 4 Max Ped Priority Smoothed Time 000 000 000 000 Max Ped Priority Increase Length 00 00 00 00 CYCLE TIME FACILITIES Allow Node Independence N Operator Node Independence 0 Ghost Demand Stage N Num Ghost Assessment Cycles 15 Upper Trigger Ghost 04 Lower Trigger Ghost 0 SCN DD1271 At Glasgow Road - Hume Street Modified 13-OCT-15 15:06 By BDAVIDSON Type CR Region WS Subregion UPSTREAM DOWNSTREAM FILTER NODE LINK NODE LINK LINK DD1301 T DD1301 A DD1251 R DD1251 C Stage Suffix for Offset Optimizer 1 Double Cycle Initially ? N Force Single / Double Cycling status ? 
N Double Cycle Group 00 Double Cycle Ignore ? N Allow Link Max Saturation N Link Max Sat Override N Stages 1 2 3 Fixed N Y Y LRT stage N N N Skip allowed N N N Ped stage N N N Ped invite N N N Ghost stage N N N Offset authority pointer 0 Split authority pointer 0 Offset opt emiss weight 000 I/green feedback inhibit N Bus Authority 00 ACIS node 00000 Bus Mode - Central extensions N Local extensions N Recalls N Stage skipping N Stage truncation N Cancels N Bus Priority Selection - Multiple buses N Queue Calculation N Hold recall if faulty N Disable recall N Disable long jtim N Real Cancel N Bus recall recovery type 0 Bus extension recovery type 0 Offset Bus authority pointer 0 Split Bus authority pointer 0 Bus skip recovery 0 Skip importance factor 0 Bus priority status OFF LRT sat 1 000 LRT sat 2 000 LRT sat 3 000 PEDESTRIAN FACILITIES Ped Node N Num Ped Wait Imp Factor 000 Ped Priority 0 Max Ped Priority Freq 00 Ped Lower Sat Threshold 000 Ped Upper Sat Threshold 000 Max Ped Wait Time 000 PEDESTRIAN VARIABLE INVITATION TO CROSS Allow Ped Invite N Ped Priority Auto 000 Ped Invite Upper Sat 000 Prio Level 1 2 3 4 Max Ped Priority Smoothed Time 000 000 000 000 Max Ped Priority Increase Length 00 00 00 00 CYCLE TIME FACILITIES Allow Node Independence N Operator Node Independence 0 Ghost Demand Stage N Num Ghost Assessment Cycles 15 Upper Trigger Ghost 04 Lower Trigger Ghost 0 

I can already extract the first relevant line using the following Bash script:

grep '^SCN' LOG.TXT > JUNCTIONS.txt 

Which creates a list of all the junctions like so:

SCN DD1251 At Glasgow Road - Kilbowie Road SCN DD1271 At Glasgow Road - Hume Street SCN DD1301 At Glasgow Road - Argyll Road - Cart Street SCN DD1351 At Kilbowie Road - Chalmers Street ... 

However, I also want to extract the lines that immediately follow each junction title, down to the final link of the node (just before a large amount of whitespace), without capturing anything from "Stage Suffix" onwards until the next junction.

Is there a way to modify my Bash script so that it also includes a number of lines after each match it finds?

  • Welcome to the site. Please indicate an example of the desired output for your input example so that contributors can exactly understand what you want to achieve. Also, please include whether the number of lines is fixed or variable, and if the "stopping pattern" is simply a certain number of empty lines (are they empty as in no characters, or only "visually empty" meaning they can contain whitespace/tabs). Commented May 19, 2021 at 14:44

2 Answers

Is this what you want?

sed -n '/^SCN/,/^\s*$/p' LOG.TXT 

It prints the lines between two patterns (and the lines that contain them):

  • 'SCN' appearing at the start of a line (^SCN)
  • and a blank line, i.e. one containing only zero or more whitespace characters (^\s*$; note that \s is a GNU sed extension, and [[:space:]] is the portable spelling)

and outputs the following:

SCN DD1251 At Glasgow Road - Kilbowie Road Modified By ________ Type CR Region WS Subregion UPSTREAM DOWNSTREAM FILTER NODE LINK NODE LINK LINK DD1271 C DD1271 R DD1351 D DD1351 B E SCN DD1271 At Glasgow Road - Hume Street Modified 13-OCT-15 15:06 By BDAVIDSON Type CR Region WS Subregion UPSTREAM DOWNSTREAM FILTER NODE LINK NODE LINK LINK DD1301 T DD1301 A DD1251 R DD1251 C 
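The range address described above can be tried on a small, made-up log before running it on the real file. This is only a sketch: the sample lines, their layout, and the temporary file are assumptions for illustration, and the pattern uses [[:space:]] instead of \s so that it does not depend on GNU sed.

```shell
# Build a tiny sample log; its layout is assumed for illustration only.
sample=$(mktemp)
printf '%s\n' \
  'SCN DD1251 At Glasgow Road - Kilbowie Road' \
  'DD1271 C DD1271 R' \
  '   ' \
  'Stage Suffix for Offset Optimizer 1' \
  'SCN DD1271 At Glasgow Road - Hume Street' \
  'DD1301 T DD1301 A' \
  '' \
  'Stage Suffix for Offset Optimizer 1' > "$sample"

# Print every line from one starting with SCN through the next blank
# (or whitespace-only) line; -n suppresses everything else.
sed_out=$(sed -n '/^SCN/,/^[[:space:]]*$/p' "$sample")
printf '%s\n' "$sed_out"
rm -f "$sample"
```

With this sample, both SCN blocks are kept and the "Stage Suffix" lines are dropped.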
  • Thanks @YetAnotherUser, I edited the answer accordingly Commented May 19, 2021 at 21:43

In your case, an awk program comes to mind:

awk '/^SCN/{f=1} !NF{f=0} f' LOG.TXT > JUNCTIONS.TXT 

This will

  • set a flag f to 1 when the line starts with SCN
  • reset the flag to 0 when an empty line is found (in this case, we allow "visually empty" lines by imposing the condition that no text fields are found, rather than that no characters at all are present)
  • print the current line only when the flag is 1
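The three steps above can be sketched with the same awk program spread over several lines and commented. The sample log and the temporary file are assumptions made up for this illustration; only the awk rules come from the answer.

```shell
# Build a tiny sample log; its layout is assumed for illustration only.
sample=$(mktemp)
printf '%s\n' \
  'SCN DD1251 At Glasgow Road - Kilbowie Road' \
  'DD1271 C DD1271 R' \
  '   ' \
  'Stage Suffix for Offset Optimizer 1' \
  'SCN DD1271 At Glasgow Road - Hume Street' \
  'DD1301 T DD1301 A' \
  '' \
  'Stage Suffix for Offset Optimizer 1' > "$sample"

awk_out=$(awk '
  /^SCN/ { f = 1 }  # a junction header switches printing on
  !NF    { f = 0 }  # a line with no fields switches it off
  f                 # bare pattern: print the line while f is non-zero
' "$sample")
printf '%s\n' "$awk_out"
rm -f "$sample"
```

Because the flag is cleared before the final rule runs, the blank and whitespace-only lines themselves are not printed; for this sample only the two SCN lines and their link lines remain.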

Somewhat simpler, using address ranges instead of explicit flags:

awk '/^SCN/,/^[[:space:]]*$/' LOG.TXT > JUNCTIONS.TXT 

The output for your example would be:

SCN DD1251 At Glasgow Road - Kilbowie Road Modified By ________ Type CR Region WS Subregion UPSTREAM DOWNSTREAM FILTER NODE LINK NODE LINK LINK DD1271 C DD1271 R DD1351 D DD1351 B E SCN DD1271 At Glasgow Road - Hume Street Modified 13-OCT-15 15:06 By BDAVIDSON Type CR Region WS Subregion UPSTREAM DOWNSTREAM FILTER NODE LINK NODE LINK LINK DD1301 T DD1301 A DD1251 R DD1251 C 

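The first command (the flag version) does not print the blank lines that separate the blocks, because the flag is cleared before the print rule runs. The range version prints the blank line that ends each block.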
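As for the literal question, a fixed number of lines after each match: if every junction block had the same length, grep's -A option (available in GNU and BSD grep, but not specified by POSIX) would be enough. The sketch below assumes exactly two link lines after each SCN line, which is an assumption about the log and not something the example guarantees; the sample file is made up for illustration.

```shell
# Build a tiny sample log; its layout is assumed for illustration only.
sample=$(mktemp)
printf '%s\n' \
  'SCN DD1251 At Glasgow Road - Kilbowie Road' \
  'DD1271 C DD1271 R' \
  'DD1351 D DD1351 B E' \
  '' \
  'Stage Suffix for Offset Optimizer 1' \
  'SCN DD1271 At Glasgow Road - Hume Street' \
  'DD1301 T DD1301 A' \
  'DD1251 R DD1251 C' \
  '' \
  'Stage Suffix for Offset Optimizer 1' > "$sample"

# -A 2 prints each matching line plus the two lines that follow it.
grep_out=$(grep -A 2 '^SCN' "$sample")
printf '%s\n' "$grep_out"
rm -f "$sample"
```

Some grep implementations also print a "--" separator line between non-adjacent groups. When the number of link lines varies from junction to junction, the range-based awk and sed approaches are the safer choice.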
