Corrected terminology of \K assertion
jimmij

Try the following script:

#!/bin/bash
logfile="$1"

# Count the lines that report a file being checked.
nfiles=$(grep -c 'checking file' "$logfile")
# Collect the user IDs that follow "failed reading user id: ".
failed_userid=($(grep -oP 'failed reading user id: \K[^ ]*' "$logfile"))
# Collect the file names that precede " is corrupt".
corrupted_files=($(grep -oP '[^ ]*(?= is corrupt)' "$logfile"))

echo "Total Number of Files Scanned - $nfiles"
echo "Total Number of Unique User ID failed - ${#failed_userid[@]}"
echo "Total Number of Files Corrupted - ${#corrupted_files[@]}"
echo
echo "List of Unique User Id's which are corrupt - "
for uid in "${failed_userid[@]}"; do
    echo "$uid"
done
echo
echo "Files which are corrupted - "
for corf in "${corrupted_files[@]}"; do
    echo "$corf"
done

Run it with

$ ./script file.log 

The result for input from your question looks like

Total Number of Files Scanned - 3
Total Number of Unique User ID failed - 3
Total Number of Files Corrupted - 1

List of Unique User Id's which are corrupt - 
18446744073135142816
18446744073698151136
18446744072929739296

Files which are corrupted - 
/database/batch/p1_snapshot//p1_weekly_1980_0_200003_5.data

Short explanation:

  • -c makes grep count the matching lines instead of printing them
  • -P enables Perl regular expression syntax
  • -o prints only the matched part of each line
  • (?=...) is a positive look-ahead: the text inside it must follow the match, but it is not included in the output
  • \K resets the start of the reported match: everything matched before it is required but discarded from the output, so it works like a variable-length look-behind
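To see the options in isolation, here is a small sketch that runs the same three grep calls on a few made-up lines. The sample text is hypothetical (it only imitates the patterns the script searches for), and it assumes a GNU grep built with PCRE support so that -P is available.

```shell
#!/bin/bash
# Hypothetical sample lines, modelled on the patterns the script searches for.
sample='checking file /tmp/a.data
failed reading user id: 12345
/tmp/a.data is corrupt'

# -c counts the matching lines instead of printing them.
count=$(printf '%s\n' "$sample" | grep -c 'checking file')

# \K discards everything matched so far, so only the ID is reported.
uid=$(printf '%s\n' "$sample" | grep -oP 'failed reading user id: \K[^ ]*')

# (?=...) requires " is corrupt" to follow, without including it in the match.
file=$(printf '%s\n' "$sample" | grep -oP '[^ ]*(?= is corrupt)')

echo "count=$count uid=$uid file=$file"
# prints: count=1 uid=12345 file=/tmp/a.data
```

Without -o, grep would print the whole matching line in the last two calls, because \K and the look-ahead only change which part of the line counts as the match.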

The rest should be obvious. Be aware, however, that I've assumed there is no whitespace in file names: both the [^ ]* pattern and the unquoted array assignment split on spaces.
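One more thing to keep in mind: the labels say "Unique", but the script counts every match, so an ID that appears twice in the log is counted twice. If that can happen in your logs, a possible variation is to pipe the matches through sort -u and read them with mapfile. This is only a sketch: unique_ids is a hypothetical helper name, the demo log is made up, and mapfile needs bash 4 or newer.

```shell
#!/bin/bash
# Sketch: print each failed user ID once.
# unique_ids is a hypothetical helper, not part of the original script.
unique_ids() {
    grep -oP 'failed reading user id: \K[^ ]*' "$1" | sort -u
}

# Made-up demo log in which one ID is repeated.
demo_log=$(mktemp)
printf '%s\n' \
    'failed reading user id: 42' \
    'failed reading user id: 7' \
    'failed reading user id: 42' > "$demo_log"

# mapfile stores one array element per line, with no word splitting or globbing.
mapfile -t failed_userid < <(unique_ids "$demo_log")
rm -f "$demo_log"

echo "Total Number of Unique User ID failed - ${#failed_userid[@]}"
printf '%s\n' "${failed_userid[@]}"
```

In the demo the repeated ID collapses, so the array ends up with two elements. Note that sort -u also changes the order of the IDs, which may or may not matter for your report.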
