Ed Morton

With GNU awk for ENDFILE and IGNORECASE:

$ awk -v IGNORECASE=1 '
    { cnt += ( gsub(/[[:alpha:]]/,"&") - gsub(/[aeiou]/,"&") ) }
    ENDFILE { print FILENAME, cnt+0; cnt=0 }
' file1 file2
file1 12
file2 7

or with any POSIX awk:

$ awk '
    { lc=tolower($0); cnt[FILENAME] += (gsub(/[[:alpha:]]/,"&",lc) - gsub(/[aeiou]/,"&",lc)) }
    END { for (i=1; i<ARGC; i++) print ARGV[i], cnt[ARGV[i]]+0 }
' file1 file2
file1 12
file2 7

If you only want to count the specific characters b, c, d, etc. instead of all alphabetic characters that aren't aeiou, then just change ( gsub(/[[:alpha:]]/,"&") - gsub(/[aeiou]/,"&") ) above to gsub(/[bcdfghjklmnpqrtsvwxyz]/,"&") (keeping lc as the third argument in the POSIX version).
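As a rough sketch of that substitution applied to the POSIX script: the sample file contents below are made up for the demonstration and are not the file1/file2 from the question, so the counts differ from the output shown above.

```shell
# Hypothetical sample files, created only for this demonstration.
tmp=$(mktemp -d)
printf 'Hello World\n' > "$tmp/file1"
printf 'AEIOU xyz\n'   > "$tmp/file2"

# Count only the explicitly listed consonants; lc is the lower-cased line.
out=$(cd "$tmp" && awk '
    { lc = tolower($0); cnt[FILENAME] += gsub(/[bcdfghjklmnpqrstvwxyz]/, "&", lc) }
    END { for (i = 1; i < ARGC; i++) print ARGV[i], cnt[ARGV[i]] + 0 }
' file1 file2)

printf '%s\n' "$out"
rm -rf "$tmp"
```

With these sample files it should print file1 7 and file2 3, since "Hello World" contains h, l, l, w, r, l, d and "AEIOU xyz" contains only x, y, z from the list.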

Note that, unlike any approach that prints results in an FNR==1 clause, both of the above scripts will handle empty files correctly by printing the file name and 0 as the count.
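To illustrate the difference, here is a small sketch that runs the POSIX script above and one possible FNR==1-style script against an empty file. The file names empty and data, and the FNR==1 script itself, are assumptions invented for this comparison rather than anything from the question.

```shell
# Hypothetical sample files: one empty, one with a single line.
tmp=$(mktemp -d)
: > "$tmp/empty"
printf 'abc\n' > "$tmp/data"

# The POSIX script from above: reports every argument, including the empty file.
out_end=$(cd "$tmp" && awk '
    { lc=tolower($0); cnt[FILENAME] += (gsub(/[[:alpha:]]/,"&",lc) - gsub(/[aeiou]/,"&",lc)) }
    END { for (i=1; i<ARGC; i++) print ARGV[i], cnt[ARGV[i]]+0 }
' empty data)

# One way an FNR==1-based script might look (illustrative only):
# it never sees a record from the empty file, so that file is not reported.
out_fnr=$(cd "$tmp" && awk '
    FNR==1 && NR>1 { print prev, cnt+0; cnt=0 }
    { prev=FILENAME; lc=tolower($0); cnt += (gsub(/[[:alpha:]]/,"&",lc) - gsub(/[aeiou]/,"&",lc)) }
    END { print prev, cnt+0 }
' empty data)

printf 'END-based:\n%s\n' "$out_end"
printf 'FNR==1-based:\n%s\n' "$out_fnr"
rm -rf "$tmp"
```

The END-based run should print empty 0 and data 2, while the FNR==1-based run should print only data 2.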

Also note the cnt+0 in the first script - the +0 ensures that the value printed will be a numeric 0 rather than a null string if the first file is empty.
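A minimal way to see the effect of the +0, using an unset variable (the name x is arbitrary):

```shell
# An unset awk variable prints as an empty string; adding 0 forces a numeric 0.
out=$(awk 'BEGIN { print "[" x "]"; print "[" (x + 0) "]" }')
printf '%s\n' "$out"
```

This should print [] on the first line and [0] on the second.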

If the same file name can appear multiple times in the argument list then, in the POSIX script, add FNR==1{cnt[FILENAME]=0} to the start of the script if you want it output multiple times, or add if (!seen[ARGV[i]]++) { ... } around the print in the END section if you only want it output once.
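Both variants can be sketched as follows, applied to the POSIX script. The sample file contents are assumptions made up for the demonstration. Note that in the second variant, without the FNR==1 reset, the single line printed for a repeated file carries the total accumulated over every time it was read.

```shell
# Hypothetical sample files; file1 is passed twice on the command line.
tmp=$(mktemp -d)
printf 'Hello World\n' > "$tmp/file1"
printf 'xyz\n'         > "$tmp/file2"

# Variant 1: reset the count at the first line of each read,
# so each occurrence is printed with a single-read count.
out_multi=$(cd "$tmp" && awk '
    FNR==1 { cnt[FILENAME]=0 }
    { lc=tolower($0); cnt[FILENAME] += (gsub(/[[:alpha:]]/,"&",lc) - gsub(/[aeiou]/,"&",lc)) }
    END { for (i=1; i<ARGC; i++) print ARGV[i], cnt[ARGV[i]]+0 }
' file1 file1 file2)

# Variant 2: print each distinct file name only once.
out_once=$(cd "$tmp" && awk '
    { lc=tolower($0); cnt[FILENAME] += (gsub(/[[:alpha:]]/,"&",lc) - gsub(/[aeiou]/,"&",lc)) }
    END { for (i=1; i<ARGC; i++) if (!seen[ARGV[i]]++) print ARGV[i], cnt[ARGV[i]]+0 }
' file1 file1 file2)

printf 'multiple:\n%s\n' "$out_multi"
printf 'once:\n%s\n' "$out_once"
rm -rf "$tmp"
```

With these files, the first variant should print file1 7 twice followed by file2 3, and the second should print file1 14 and file2 3.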

See https://unix.stackexchange.com/a/642372/133219 for an answer to the follow-up question of also counting vowels.
