Note: I have already asked a similar question, "AWK: Quick way to insert target words after a source term", and I am still a beginner with AWK.
This question is about inserting multiple target terms after their source terms in a number of randomly selected lines.
With this AWK code snippet

    awk '(NR==FNR){a[$1];next} FNR in a { gsub(/\<source term\>/,"& target term") } 1 ' <(shuf -n 5 -i 1-$(wc -l < file)) file

I want to insert a target term after the source term in 5 random lines of the file.
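To make the line-selection half of that snippet easier to follow, here is a minimal sketch with the random part taken out. It assumes a hypothetical four-line sample file and the fixed line numbers 2 and 4 in place of the `shuf` output; the line numbers are piped in on standard input (`-`) rather than through bash process substitution, and a plain `sub()` stands in for the `\<...\>` match, which is a GNU awk extension.

```shell
#!/bin/sh
# Hypothetical four-line sample; the fixed line numbers 2 and 4 replace shuf.
tmp=$(mktemp)
printf '%s\n' 'one apple' 'two apple' 'three apple' 'four apple' > "$tmp"

# NR==FNR holds only while the first input (the line numbers on stdin, "-")
# is being read; each number becomes a key of the array pick.
# For the second input, "FNR in pick" is true only on the chosen lines.
picked=$(printf '%s\n' 2 4 |
  awk 'NR==FNR { pick[$1]; next }
       FNR in pick { sub(/apple/, "& Apfel") }
       1' - "$tmp")
printf '%s\n' "$picked"
rm -f "$tmp"
```

Only lines 2 and 4 of the sample get `Apfel` appended; the other lines are printed unchanged by the trailing `1`.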
For example: I have a bilingual dictionary dict, which contains the source terms on the left and the target terms on the right, like this:

    apple : Apfel
    banana : Banane
    raspberry : Himbeere

My file consists of these lines:

    I love the Raspberry Pi.
    The monkey loves eating a banana.
    Who wants an apple pi?
    Apple pen... pineapple pen... pen-pineapple-apple-pen!
    The banana is tasty and healthy.
    An apple a day keeps the doctor away.
    Which fruit is tastes better: raspberry or strawberry?

Assume that the random lines 1, 3, 5, 4, 7 are selected for the first word, apple. The output for apple will then look like this:

    I love the Raspberry Pi.
    The monkey loves eating a banana.
    Who wants an apple Apfel pi?
    Apple Apfel pen... pineapple pen... pen-pineapple-apple-pen!
    The banana is tasty and healthy.
    An apple a day keeps the doctor away.
    Which fruit is tastes better: raspberry or strawberry?

Then another 5 random lines (3, 3, 5, 6, 7) are selected for the word banana:

    I love the Raspberry Pi.
    The monkey loves eating a banana.
    Who wants an apple Apfel pi?
    Apple Apfel pen... pineapple pen... pen-pineapple-apple-pen!
    The banana Banane is tasty and healthy.
    An apple a day keeps the doctor away.
    Which fruit is tastes better: raspberry or strawberry?

The same continues for all the other entries in dict until the last entry has been processed.
I want to choose 5 random lines for each dictionary entry. If one of these lines contains a source term such as apple as a whole word, I want to insert Apfel after it; apple inside a longer word like "pineapple" should be ignored. If a line contains a source term more than once, I want to insert the target term after each occurrence. Matches should be case-insensitive, so that both apple and Apple are matched.
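To pin down what these requirements mean, here is a rough sketch, not a finished solution. The helper names `insert_after` and `pick` and the variable `n` are made up for illustration. It avoids GNU-only features by doing the whole-word check by hand, and it assumes that ASCII letters, digits, `_` and `-` count as word characters, so that pen-pineapple-apple-pen stays untouched as in the sample output (the `\<...\>` operators in my snippet would match the hyphen-delimited apple). It also assumes that dict is not empty and that its entries are separated by ` : `.

```shell
#!/bin/sh
# Sample data from the question, written to a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'apple : Apfel' 'banana : Banane' 'raspberry : Himbeere' > "$tmp/dict"
printf '%s\n' \
  'I love the Raspberry Pi.' \
  'The monkey loves eating a banana.' \
  'Who wants an apple pi?' \
  'Apple pen... pineapple pen... pen-pineapple-apple-pen!' \
  'The banana is tasty and healthy.' \
  'An apple a day keeps the doctor away.' \
  'Which fruit is tastes better: raspberry or strawberry?' > "$tmp/file"

# n is the number of lines drawn per dictionary entry (5 in the question).
# n=7 covers every line of the sample, so this demo run is reproducible.
out=$(awk -v n=7 '
# Append " t" after every whole-word, case-insensitive occurrence of s.
function insert_after(text, s, t,    low, res, pos, len, prev, before, after) {
    low = tolower(text); s = tolower(s); len = length(s)
    res = ""; prev = ""
    while ((pos = index(low, s)) > 0) {
        before = (pos > 1) ? substr(low, pos - 1, 1) : prev
        after  = substr(low, pos + len, 1)
        res = res substr(text, 1, pos + len - 1)
        # Assumption: ASCII letters, digits, "_" and "-" are word characters.
        if (before !~ /[a-z0-9_-]/ && after !~ /[a-z0-9_-]/)
            res = res " " t
        prev = substr(low, pos + len - 1, 1)   # last character consumed so far
        text = substr(text, pos + len)
        low  = substr(low, pos + len)
    }
    return res text
}
# Fill chosen with up to k distinct random line numbers from 1..total.
function pick(k, total, chosen,    cnt, r) {
    split("", chosen)                          # portable way to empty an array
    if (k > total) k = total
    cnt = 0
    while (cnt < k) {
        r = int(rand() * total) + 1
        if (!(r in chosen)) { chosen[r]; cnt++ }
    }
}
BEGIN { srand() }
NR == FNR {                                    # first file: the dictionary
    if (split($0, kv, / *: */) == 2 && kv[1] != "") { src[++m] = kv[1]; tgt[m] = kv[2] }
    next
}
{ lines[FNR] = $0; total = FNR }               # second file: keep all lines
END {
    for (i = 1; i <= m; i++) {                 # a fresh random draw per entry
        pick(n, total, chosen)
        for (l in chosen) lines[l] = insert_after(lines[l], src[i], tgt[i])
    }
    for (l = 1; l <= total; l++) print lines[l]
}' "$tmp/dict" "$tmp/file")
printf '%s\n' "$out"
rm -rf "$tmp"
```

With `-v n=5` each dictionary entry gets its own draw of 5 distinct lines, so the result changes between runs; `srand()` without an argument seeds from the time of day, so two runs within the same second pick the same lines.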
My question: How can I rewrite the code snippet above so that it reads the dictionary dict, selects random lines in file for each entry, and inserts the target terms after the source terms?