After running my AWK script

awk -i inplace '(NR==FNR){a[$1];next}
(FNR in a) && gsub(/\<Source Term\>/,"& Target Term")
1
' <(shuf -n 198058 -i 1-$(wc -l < file)) file

I checked the file with the command

wc -l file

and noticed that the number of lines in file had increased from 40058 to 44156. Is there a reason for that?

Is there any way to keep the number of lines as it was before?

  • Just a few tips for future debugging: instead of running with -i inplace, which changes the file, start by letting the command print to the terminal and see what it does. It might also make sense to use a test file with a smaller subset just for the sake of the test. For instance, create a file with five lines and lower the -n value of your shuf command to 3. This makes it easier to see the actual output of your awk script, and to make changes and see how they affect the output. Commented Jan 31, 2022 at 9:39
  • Thanks for your tip. I didn't change the -i inplace or the -n because this was more of a theoretical question. I will keep it in mind for my next question. Commented Jan 31, 2022 at 10:08

1 Answer

Whenever

(FNR in a) && gsub(/\<Source Term\>/,"& Target Term") 

evaluates to non-zero, i.e. FNR is in a and the substitution replaces at least one substring, the current line is output, because the default action is { print }.

Then

1 

causes the current line to be output in all cases (relying on the same default action).

This means that lines where substrings are substituted are output twice.
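A small reproduction, under assumed data, makes the duplication visible. The three-line sample file, the hard-coded list of line numbers standing in for the shuf output, and the temporary paths are all hypothetical. The GNU-specific \< and \> anchors are dropped so the sketch runs in any POSIX awk, and the output goes to the terminal rather than through -i inplace.

```shell
# Hypothetical three-line sample; lines 1 and 3 contain the search term.
tmp=$(mktemp -d)
printf '%s\n' 'Source Term one' 'nothing here' 'Source Term three' > "$tmp/file"
# Hypothetical selection of line numbers, standing in for the shuf output.
printf '%s\n' 1 2 > "$tmp/lines"

# The two pattern-only rules from the question, each on its own line.
buggy_out=$(awk '
(NR==FNR){a[$1];next}
(FNR in a) && gsub(/Source Term/,"& Target Term")
1
' "$tmp/lines" "$tmp/file")

printf '%s\n' "$buggy_out"
rm -rf "$tmp"
```

Line 1 is selected and modified, so it is printed twice. Line 2 is selected but unchanged and line 3 is not selected, so each is printed once. Three input lines become four output lines.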

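Placing the gsub invocation in an action block will avoid this, because a rule with an explicit action no longer triggers the default { print }. The trailing 1 rule stays in the script and prints every line exactly once: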

FNR in a { gsub(/\<Source Term\>/,"& Target Term") } 
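As a sketch of how the whole script might look with that change, the rule can be dropped in between the first rule and the final 1. The sample file, the fixed list of line numbers used in place of the shuf output, and the temporary paths are assumptions for illustration, and the GNU-specific \< and \> anchors are left out so that any POSIX awk can run it.

```shell
# Hypothetical three-line sample; lines 1 and 3 contain the search term.
tmp=$(mktemp -d)
printf '%s\n' 'Source Term one' 'nothing here' 'Source Term three' > "$tmp/file"
# Hypothetical selection of line numbers, standing in for the shuf output.
printf '%s\n' 1 2 > "$tmp/lines"

# gsub now lives in an action block, so only the final rule prints.
fixed_out=$(awk '
(NR==FNR){a[$1];next}
FNR in a { gsub(/Source Term/,"& Target Term") }
1
' "$tmp/lines" "$tmp/file")

printf '%s\n' "$fixed_out"
rm -rf "$tmp"
```

With this version the output has the same three lines as the input, and only the selected line that contains the term is changed.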
