
I have a word, and I want to find out what percentage of the words in a file it accounts for (relative to the total number of words in the file). For example, if the word is "you" and it appears 2 times in a file with 8 words, the output should be 25%.

I tried: fgrep -ow


3 Answers


You can get the total number of words in your file as follows:

nw=`wc -w < /path/to/file` 

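The number of occurrences of a given word can be counted with grep -o, which prints each match on its own line, piped into wc -l (egrep -c alone counts matching lines, not occurrences, so a word appearing twice on one line would be counted once):

occurrences=`grep -ow 'pattern' /path/to/file | wc -l`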

Then you can calculate the percentage and store the result in a variable:

result=`echo "scale=2; $occurrences*100/$nw" | bc` 

To append the % sign you can, for example, do the following:

echo "${result}%"
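
Put together, the steps above might look like the following sketch. It is only an illustration under a few assumptions: the sample file and the word "you" are made up, grep supports -o and -w (GNU and BSD grep do, but POSIX does not require them), and the division is done with awk instead of bc, purely so the sketch does not depend on bc being installed.

```shell
#!/bin/sh
# Illustrative sample: 8 words, 2 of which are "you".
file=$(mktemp)
printf 'you and me\nme and you too today\n' > "$file"
word=you

# Total number of whitespace-separated words in the file.
nw=$(wc -w < "$file")

# grep -o prints every match on its own line, -w matches whole words only,
# -F treats the word as a fixed string; wc -l then counts the matches.
occurrences=$(grep -owF -- "$word" "$file" | wc -l)

# Percentage with two decimals (awk used here in place of bc);
# an empty file is reported as 0.00 instead of dividing by zero.
result=$(awk -v o="$occurrences" -v n="$nw" \
    'BEGIN { if (n > 0) printf "%.2f", o * 100 / n; else printf "0.00" }')

echo "${result}%"
rm -f "$file"
```

Quoting "$file" and "$word" keeps the script working when either contains spaces.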
  • Thanks!! But how can I add % next to the result? Commented Nov 11, 2015 at 21:25
  • You're welcome, it was fun to test. I'll update the answer ;) Commented Nov 11, 2015 at 23:29

Use the same logic as shown at URL:

tr ' ' '\n' < file.txt | awk '{if($0=="her"){nmw+=1}}END{print ((nmw*100)/NR)}' 
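
Since tr ' ' '\n' splits only on single spaces, repeated spaces or tabs can skew the total. A minimal sketch of a whitespace-tolerant variant is shown here; the sample file and the word "you" are illustrative assumptions, and a file that starts with whitespace would still contribute one empty record.

```shell
#!/bin/sh
# Illustrative sample with a double space and a tab between words.
file=$(mktemp)
printf 'you  and\tme\nme and you too today\n' > "$file"

# tr -s squeezes every run of whitespace into a single newline,
# so awk sees exactly one word per record.
pct=$(tr -s '[:space:]' '\n' < "$file" |
    awk -v w="you" '$0 == w { n++ }
        END { if (NR > 0) print n * 100 / NR; else print 0 }')

echo "$pct"
rm -f "$file"
```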
  • Assumes all words are separated by spaces. Commented Nov 11, 2015 at 10:31
  • Thanks, but for some reason it's not working; it gives me 0 as output. What is "her"? Commented Nov 11, 2015 at 10:35
  • Replace "her" with the string you want to search for; in your case, that is "you". Commented Nov 11, 2015 at 11:29

With awk:

awk -vw="word" 'BEGIN{RS="[^a-zA-Z]+"} $0==w{c++} END{printf "%.1f%%\n",c*100/NR}' file 
  • -vw="word" passes awk the variable w containing "word", that is, the word whose percentage you want.
  • BEGIN{RS="[^a-zA-Z]+"} sets the record separator to any run of non-letter characters, so every word is processed as a separate record.
  • $0==w{c++} increments the counter c whenever the current record equals the word.
  • END{printf "%.1f%%\n",c*100/NR} prints the calculated percentage after the whole file has been processed.
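
The one-liner relies on a regular-expression RS, which gawk and mawk support but POSIX awk does not guarantee. A sketch of a more portable variant, under that assumption, loops over the fields of each line and strips non-letters instead. The sample file and the word "you" are illustrative. Unlike the original, a token such as "don't" is counted as the single word "dont" here, not split in two.

```shell
#!/bin/sh
# Illustrative sample: 8 words with punctuation, 2 of which are "you".
file=$(mktemp)
printf 'you and me.\nme and you, too, today!\n' > "$file"

pct=$(awk -v w="you" '
{
    for (i = 1; i <= NF; i++) {
        token = $i
        gsub(/[^a-zA-Z]/, "", token)   # strip punctuation and digits
        if (token == "") continue      # skip fields with no letters
        total++
        if (token == w) count++        # case-sensitive comparison
    }
}
END {
    if (total > 0) printf "%.1f%%\n", count * 100 / total
    else print "0.0%"                  # avoid dividing by zero
}' "$file")

echo "$pct"
rm -f "$file"
```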
