How do I filter certain lines except the ones following a pattern in AWK

Question

I have a log file with the following structure:

Timestamp header
Log line

An example of such a file:

Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: ---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: positionX: 141 positionY: 39
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: ---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: positionX: 141 positionY: 39

I would like to remove all timestamps except the ones followed by ---

---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
FINE: positionX: 141 positionY: 39
---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
FINE: positionX: 141 positionY: 39

I have tried this:

cat logFile.txt | awk -F':' '/$2=="---"/ {next; print $0; continue}; !/^FINE/ {next}; {print}'

without success.

I am using FreeBSD 12.1 (csh, but I guess that is irrelevant as the right tool for this is awk, please correct me if I am wrong).

You say "except the ones followed by ---", but your example seems to show that you are keeping the timestamps that follow a line containing ---. Am I wrong?
– fra-san, Commented Jun 3, 2020 at 9:24
Right, the idea is to reduce the timestamps in the log to just one per block (blocks are marked with "---") so that the log becomes readable. Otherwise it is impossible to follow the log file.
– M.E., Commented Jun 3, 2020 at 9:25

Stalin Vignesh Kumar · Accepted Answer · 2020-06-03 10:15:35Z

sed can do it:

$ sed -nE '/---/{s/.*: (-*)/\1/;N;p}; /FINE/p' file
---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
FINE: positionX: 141 positionY: 39
---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
FINE: positionX: 141 positionY: 39

If the /---/ pattern matches:

s/.*: (-*)/\1/ => Replace "FINE: ---" with "---"
N;p => Append the next line (the timestamp line) to the current line, then print both

/FINE/p => All other lines containing FINE are printed.
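
The same sed script, spread over several lines with one comment per command, may be easier to follow. This is only a sketch: the sample() helper and its four-line input are assumptions made for illustration, not part of the original log, and it was written with a sed that accepts -E (GNU or FreeBSD) in mind.

```shell
# Hypothetical four-line sample in the question's layout
# (timestamp line, then log line).
sample() {
    printf '%s\n' \
        'Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method' \
        'FINE: ---' \
        'Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method' \
        'FINE: index: 14 timestamp: 1,590,170,100 value: 6'
}

out=$(sample | sed -nE '
/---/ {
    # Replace "FINE: ---" with "---".
    s/.*: (-*)/\1/
    # Append the next input line (the timestamp) to the pattern space.
    N
    # Print both lines.
    p
}
# Lines that still contain FINE are ordinary log lines: print them unchanged.
/FINE/p
')
printf '%s\n' "$out"
```

After N, the pattern space no longer contains FINE, so the marker block is not printed a second time by /FINE/p.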

Another option, with awk:

$ awk '/---/ { getline h;$0="---\n"h;print } /^FINE/{print }' file

nice... code golf with GNU sed: sed -nE '/FINE/{s/.*: -/-/; Ta; N; :a p}'
– Sundeep, Commented Jun 3, 2020 at 9:49
See awk.freeshell.org/AllAboutGetline for issues with the awk getline solution.
– Ed Morton, Commented Jun 3, 2020 at 20:55
@EdMorton: Thanks, I will go through that... but for this solution it's not too bad, I guess...
– Stalin Vignesh Kumar, Commented Jun 3, 2020 at 21:01
Depends on whether getline fails and/or whether there's a line after the --- at the end of the input (if there isn't, your script would silently print the PREVIOUS timestamp) and/or whether or not you want to do anything extra/different with the code (e.g. as an extremely basic example, try adding a debugging print to display every line as it's read: without getline you just stick {print} at the front of the script, which is a complete no-brainer; with it you have to think about it and add multiple prints). Using getline is one of those things that only looks simple but is actually a can of worms.
– Ed Morton, Commented Jun 3, 2020 at 21:07
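
The end-of-input case raised in the comments can be reproduced with a small sketch. The input below is hypothetical (the second timestamp is changed to 10:39:05 so that a stale value is visible), and the guarded variant is just one possible way to check getline's return value, not the answer author's code.

```shell
# Hypothetical input whose last line is a "FINE: ---" marker with nothing after it.
sample() {
    printf '%s\n' \
        'Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method' \
        'FINE: ---' \
        'Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method' \
        'FINE: value: 6' \
        'Jun 03, 2020 10:39:05 AM pacakge.subpackage.Class method' \
        'FINE: ---'
}

# Unguarded (as in the answer): when getline fails at end of input,
# h keeps the timestamp that was read for the previous block.
unguarded=$(sample | awk '/---/ { getline h; $0 = "---\n" h; print } /^FINE/ { print }')

# Guarded: print the timestamp only if getline actually read a line.
guarded=$(sample | awk '
/---/ {
    print "---"
    if ((getline h) > 0) print h
    next
}
/^FINE/
')

printf 'unguarded:\n%s\n\nguarded:\n%s\n' "$unguarded" "$guarded"
```

With this input the unguarded script ends with the 10:39:04 timestamp left over from the first block, while the guarded one ends with a bare ---.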

Sundeep · Accepted Answer · 2020-06-03 10:10:35Z

$ awk 'p=="FINE: ---" && !/FINE/; {p=$0} /FINE/{sub(/FINE: -/, "-"); print}' ip.txt
---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
FINE: positionX: 141 positionY: 39
---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
FINE: positionX: 141 positionY: 39

{p=$0} saves the current line in the variable p.
p=="FINE: ---" && !/FINE/ checks whether the previous line was FINE: --- and the current line doesn't match FINE. If so, the current line is printed.
/FINE/{sub(/FINE: -/, "-"); print} if the current line contains FINE, sub removes the "FINE: " prefix from FINE: --- lines, and the line is then printed.
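
Laid out as a multi-line script, the three rules read in the order awk evaluates them for each line. This is a sketch only; the sample() helper and its six-line input are assumptions for illustration.

```shell
# Hypothetical six-line sample in the question's layout.
sample() {
    printf '%s\n' \
        'Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method' \
        'FINE: ---' \
        'Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method' \
        'FINE: index: 14 timestamp: 1,590,170,100 value: 6' \
        'Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method' \
        'FINE: positionX: 141 positionY: 39'
}

out=$(sample | awk '
# Previous line was the block marker and this one is a timestamp: print it.
p == "FINE: ---" && !/FINE/

# Remember the current line, still unmodified, for the next iteration.
{ p = $0 }

# Log lines: sub only matches when the text after "FINE: " starts with a dash,
# which turns "FINE: ---" into "---"; then the line is printed.
/FINE/ { sub(/FINE: -/, "-"); print }
')
printf '%s\n' "$out"
```

Because p is saved before the sub call, the comparison on the next line still sees the original "FINE: ---" text.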

You can also use:

awk 'p; {p=0} /FINE/{if($2=="---") {print $2; p=1} else print}'
# or using getline, assuming no errors
awk '/FINE/{if($2=="---") {print $2; getline} print}'

Ed Morton · Accepted Answer · 2020-06-03 20:53:16Z

$ awk '/^FINE:/{print (/---$/ ? $2 ORS p : $0)} {p=$0}' file
---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
FINE: positionX: 141 positionY: 39
---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
FINE: positionX: 141 positionY: 39
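
A commented layout of the one-liner above, as a sketch. The input is hypothetical: the second timestamp is changed to 10:39:05 to make it visible that p holds the line before the FINE: --- marker, i.e. the marker's own header rather than the timestamp that follows it. In the question's sample all timestamps are identical, so the output is the same either way.

```shell
# Hypothetical sample; the 10:39:05 timestamp is an assumption for illustration.
sample() {
    printf '%s\n' \
        'Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method' \
        'FINE: ---' \
        'Jun 03, 2020 10:39:05 AM pacakge.subpackage.Class method' \
        'FINE: index: 14 timestamp: 1,590,170,100 value: 6'
}

out=$(sample | awk '
# Log lines start with "FINE:".
/^FINE:/ {
    # Marker line: print "---" ($2), a newline (ORS), then the saved previous line.
    # Any other log line is printed unchanged.
    print (/---$/ ? $2 ORS p : $0)
}
# Save every line so the next record can refer to it as p.
{ p = $0 }
')
printf '%s\n' "$out"
```

No getline is involved, so a marker on the last line of the input still prints the timestamp that belongs to it.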

M.E. · Accepted Answer · 2020-06-03 09:26:37Z

This was suggested in freenode, and I think it is elegant and easier to read:

awk '$1 != "FINE:" {h=$0} $2 == "---" {printf "---\n%s\n", h} $1 == "FINE:" && $2 != "---"'

That implies you posted the same question to multiple sites - please don't multi-post. That solution tests for FINE twice and tests for --- twice - duplicating code, including testing for a and also !a, is far from elegant.
– Ed Morton, Commented Jun 3, 2020 at 20:48
Thanks for your comment, Ed. Just to let you know that sharing Stack Exchange posts in freenode is not multi-posting, as you do not post anything in freenode; it does not contravene any Stack Exchange rules either. Regarding the solution: it is easy to read and understand, which is an important metric IMHO. However, 'elegant' was clearly not the right term to use, as it is clearly opinion-based.
– M.E., Commented Jun 3, 2020 at 21:46
Multi-posting and cross-posting have been frowned upon since WAY before stackexchange was a twinkle in anyone's eye, and that doesn't just apply to posting in forums run by one company. The stackexchange link I provided was just the first reference I came across when I googled; here's another that's more general: en.wikipedia.org/wiki/Crossposting (en.wikipedia.org/wiki/Multiposting redirects to it).
– Ed Morton, Commented Jun 3, 2020 at 21:56
