2

I have a log file with the following structure:

  1. Timestamp header
  2. Log line

An example of such a file follows:

Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: ---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: positionX: 141 positionY: 39
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: ---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: positionX: 141 positionY: 39

I would like to remove all timestamps except the ones followed by ---

---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
FINE: positionX: 141 positionY: 39
---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
FINE: positionX: 141 positionY: 39

I have tried this:

cat logFile.txt | awk -F':' '/$2=="---"/ {next; print $0; continue}; !/^FINE/ {next}; {print}' 

without success.

I am using FreeBSD 12.1 (csh, but I guess that is irrelevant, as the right tool for this is awk; please correct me if I am wrong).

2
  • You say "except the ones followed by ---", but your example seemingly shows that you are keeping the timestamps that follow a line containing ---. Am I wrong? Commented Jun 3, 2020 at 9:24
  • Right, the idea is to reduce the timestamps in the log to just one per block (blocks are marked with "---") so that the log becomes readable. Otherwise it is impossible to follow the log file. Commented Jun 3, 2020 at 9:25

4 Answers

2

sed can do it:

$ sed -nE '/---/{s/.*: (-*)/\1/;N;p}; /FINE/p' file
---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
FINE: positionX: 141 positionY: 39
---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
FINE: positionX: 141 positionY: 39

If the /---/ pattern matches:

s/.*: (-*)/\1/ => replaces "FINE: ---" with "---"
N;p => appends the next line (i.e. the following timestamp line) to the current one, then prints both

/FINE/p => prints every other line containing FINE.
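
For anyone who finds the one-liner dense, here is the same sed script spelled out one command per line with comments. It is only a sketch: the sample file is a shortened, hypothetical copy of one block from the question, and the temporary file name is my own choice.

```shell
#!/bin/sh
# Hypothetical sample: one shortened block of the log from the question.
log=$(mktemp /tmp/logsample.XXXXXX)
cat > "$log" <<'EOF'
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: ---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: positionX: 141 positionY: 39
EOF

result=$(sed -nE '
# On a separator line such as "FINE: ---":
/---/ {
    # drop everything up to the last ": ", keeping only the dashes
    s/.*: (-*)/\1/
    # append the following line (the timestamp header) to the pattern space
    N
    # print both lines
    p
}
# Print any remaining line that contains FINE.
/FINE/p
' "$log")

printf '%s\n' "$result"
rm -f "$log"
```

On this sample it should print the separator, one timestamp header, and the two FINE lines.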

Another way, with awk:

$ awk '/---/ { getline h;$0="---\n"h;print } /^FINE/{print }' file 
5
  • nice... code golf with GNU sed: sed -nE '/FINE/{s/.*: -/-/; Ta; N; :a p}' Commented Jun 3, 2020 at 9:49
  • cool ... with branch... Commented Jun 3, 2020 at 10:01
  • See awk.freeshell.org/AllAboutGetline for issues with the awk getline solution. Commented Jun 3, 2020 at 20:55
  • @EdMorton: Thanks, I will go through that... but for this solution it's not too bad, I guess... Commented Jun 3, 2020 at 21:01
  • Depends if getline fails and/or whether there's a line after --- at the end of the input (in which case your script would silently print the PREVIOUS timestamp) and/or whether or not you want to do anything extra/different with the code (e.g. as an extremely basic example try adding a debugging print to display every line as it's read - without getline you just stick {print} at the front of the script which is a complete no-brainer, with it you have to think about it and add multiple prints). Using getline is one of those things that only looks simple but is actually a can of worms Commented Jun 3, 2020 at 21:07
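
To illustrate the end-of-input case mentioned in the last comment, here is a minimal getline-free sketch that uses a flag instead. The variable name `want` and the sample input (which deliberately ends with a separator and uses a made-up 10:39:05 timestamp) are my own assumptions, not taken from the answer.

```shell
#!/bin/sh
# Hypothetical sample whose last line is a separator with nothing after it.
log=$(mktemp /tmp/logsample.XXXXXX)
cat > "$log" <<'EOF'
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: ---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: value: 6
Jun 03, 2020 10:39:05 AM pacakge.subpackage.Class method
FINE: ---
EOF

result=$(awk '
# Separator: print the dashes and ask for the next line to be kept.
/^FINE: ---$/ { print "---"; want = 1; next }
# The line right after a separator is the timestamp header to keep.
want          { print; want = 0; next }
# Ordinary log lines are printed unchanged; other headers are dropped.
/^FINE:/      { print }
' "$log")

printf '%s\n' "$result"
rm -f "$log"
```

Because no line is read ahead, a trailing separator just produces a lone "---" rather than repeating an earlier timestamp.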
1
$ awk 'p=="FINE: ---" && !/FINE/; {p=$0} /FINE/{sub(/FINE: -/, "-"); print}' ip.txt
---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
FINE: positionX: 141 positionY: 39
---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
FINE: positionX: 141 positionY: 39
  • {p=$0} this saves the current line in p variable
  • p=="FINE: ---" && !/FINE/ this checks if previous line was FINE: --- and the current line doesn't match FINE. If the condition is satisfied, current line will be printed
  • /FINE/{sub(/FINE: -/, "-"); print} if current line contains FINE, the sub function will remove FINE: from FINE: --- lines. Then print the line
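
The three pieces above can also be laid out as a multi-line awk program, which may be easier to adapt. This is a sketch run against a shortened, hypothetical sample; the file name is an assumption.

```shell
#!/bin/sh
# Hypothetical sample: one shortened block of the log from the question.
log=$(mktemp /tmp/logsample.XXXXXX)
cat > "$log" <<'EOF'
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: ---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: positionX: 141 positionY: 39
EOF

result=$(awk '
# Previous line was the separator and this one is not a log line:
# it is the timestamp header to keep, so print it.
p == "FINE: ---" && !/FINE/ { print }
# Remember the current line (before any edit) for the next record.
{ p = $0 }
# Log lines: turn "FINE: ---" into "---", then print.
/FINE/ { sub(/FINE: -/, "-"); print }
' "$log")

printf '%s\n' "$result"
rm -f "$log"
```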

You can also use:

awk 'p; {p=0} /FINE/{if($2=="---") {print $2; p=1} else print}'
# or using getline, assuming no errors
awk '/FINE/{if($2=="---") {print $2; getline} print}'
1
$ awk '/^FINE:/{print (/---$/ ? $2 ORS p : $0)} {p=$0}' file
---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
FINE: positionX: 141 positionY: 39
---
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: index: 14 timestamp: 1,590,170,100 value: 6
FINE: delta totalA: 0 total: 5 totalA/total: 0 totalA/total/deltaC.length: 0
FINE: bA index: 0 p: 294,325 b: 0 bb: 5 a: 0 aa: 0 total: 5
FINE: positionX: 141 positionY: 39
0

This was suggested on freenode, and I think it is elegant and easier to read:

awk '$1 != "FINE:" {h=$0} $2 == "---" {printf "---\n%s\n", h} $1 == "FINE:" && $2 != "---"' 
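
One thing that may be worth checking before picking a variant: the one-liner above prints the header it remembered from before the "---" line, whereas the flag-based command from an earlier answer prints the header that comes after it. In the question's sample all timestamps are identical, so the difference is invisible. The sketch below uses a hypothetical sample with two different timestamps (10:39:04 and the made-up 10:39:05) to show it.

```shell
#!/bin/sh
# Hypothetical sample with distinct timestamps around the separator.
log=$(mktemp /tmp/logsample.XXXXXX)
cat > "$log" <<'EOF'
Jun 03, 2020 10:39:04 AM pacakge.subpackage.Class method
FINE: ---
Jun 03, 2020 10:39:05 AM pacakge.subpackage.Class method
FINE: value: 6
EOF

# Header remembered from before the separator (the one-liner above).
before=$(awk '$1 != "FINE:" {h=$0} $2 == "---" {printf "---\n%s\n", h} $1 == "FINE:" && $2 != "---"' "$log")

# Header read after the separator (flag-based variant from an earlier answer).
after=$(awk 'p; {p=0} /FINE/{if($2=="---") {print $2; p=1} else print}' "$log")

printf 'before:\n%s\n\nafter:\n%s\n' "$before" "$after"
rm -f "$log"
```

Here `before` keeps the 10:39:04 header and `after` keeps the 10:39:05 one. Which of the two is wanted depends on whether the block should carry the separator's own timestamp or that of its first log line.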
3
  • That implies you posted the same question to multiple sites - please don't multi-post. That solution tests for FINE twice and tests for --- twice - duplicating code, including testing for a and also !a, is far from elegant. Commented Jun 3, 2020 at 20:48
  • Thanks for your comment, Ed. Just to let you know that sharing stackexchange posts on freenode is not multi-posting, as you do not post anything on freenode; it does not contravene any stackexchange rules either. Regarding the solution, it is easy to read and understand, which is an important metric IMHO. However, 'elegant' was clearly not the right term to use, as it is clearly opinion-based. Commented Jun 3, 2020 at 21:46
  • Multi-posting and cross-posting have been frowned upon since WAY before stackexchange was a twinkle in anyone's eye, and that doesn't just apply to posting in forums run by one company. The stackexchange link I provided was just the first reference I came across when I googled; here's another that's more general: en.wikipedia.org/wiki/Crossposting (en.wikipedia.org/wiki/Multiposting redirects to it). Commented Jun 3, 2020 at 21:56
