I have the file:

key value
blah blah
blah blah
blahblah
man1 boy1
blah blah
man1 boy2
man1 boy1

I do this to remove duplicate lines:

awk '/man1/ { print $1,$2} ' file | awk '!x[$0]++' 

and the command keeps the first occurrence of each line and ignores the later ones:

man1 boy1
man1 boy2

but I want to keep only the last occurrence of each line and ignore the earlier ones:

man1 boy2
man1 boy1

As Ramesh said, I want something like:

cat filename
blah blah
blah blah
blahblah
man1 boy1
blah blah
man1 boy2
man1 boy1
man1 boy2
man1 boy3
man1 boy4
man1 boy2

The desired output:

man1 boy1
man1 boy3
man1 boy4
man1 boy2
  • Please clarify why the last line is required. Is it because it's adjacent to a similar one? I'm not sure I follow your logic. Commented Aug 13, 2014 at 20:23
  • I want it this way so that I know boy1 is the last value man1 took; see the updates. Commented Aug 13, 2014 at 20:24
  • Is it relevant that blah blah is duplicated 3 times? You should really clarify what you want and provide a better example. Commented Aug 13, 2014 at 20:35
  • @CristianCiupitu, no, blah blah is unwanted text; see the updates. Commented Aug 13, 2014 at 20:54
  • Is this some type of homework problem? Commented Aug 13, 2014 at 22:25

6 Answers

tac filename | awk '/man1/ { print $1, $2 }' | awk '!x[$0]++' | tac

Testing

I wanted to test with more concrete input. So, my testing is as below.

cat filename
blah blah
blah blah
blahblah
man1 boy1
blah blah
man1 boy2
man1 boy1
man1 boy2
man1 boy3
man1 boy4
man1 boy2

Now I run the above command and get this output:

tac filename | awk '/man1/ { print $1, $2 }' | awk '!x[$0]++' | tac
man1 boy1
man1 boy3
man1 boy4
man1 boy2

As per steeldriver's suggestion, the awk part can be simplified to a single command:

tac filename | awk '/^man1/ && !x[$2]++' | tac 
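To make the double reversal easier to follow, here is a minimal sketch that runs the simplified pipeline on the sample input from the test above. The function name keep_last_tac and the temporary file are only illustrative, and tac is assumed to be installed (it ships with GNU coreutils).

```shell
# Hypothetical helper: keep only the last "man1" line for each value of field 2.
keep_last_tac() {
    # 1. tac reverses the file, so the LAST occurrence of each line comes first.
    # 2. awk keeps a line only the first time its $2 is seen.
    # 3. The second tac restores the original relative order.
    tac "$1" | awk '/^man1/ && !seen[$2]++' | tac
}

# Sample input copied from the test above, written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'blah blah' 'blah blah' 'blahblah' 'man1 boy1' 'blah blah' \
    'man1 boy2' 'man1 boy1' 'man1 boy2' 'man1 boy3' 'man1 boy4' 'man1 boy2' > "$sample"

result=$(keep_last_tac "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Both tac invocations have to process every line, so for very large files it is probably worth timing this against a single-pass awk approach rather than assuming either one is faster.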
  • +1, good one, but I want to ask: will tac be bad for performance on large files? I am removing the duplicates because I don't want bad performance. Commented Aug 13, 2014 at 20:33
  • tac is actually a lazy solution :) Commented Aug 13, 2014 at 20:33
  • Do you really need two awks? What about tac file | awk '/^man1/ && !x[$2]++' | tac Commented Aug 13, 2014 at 21:31
  • @steeldriver, thanks. I will check out this option as soon as I get a Linux box and update the answer. Thanks again. Commented Aug 13, 2014 at 21:34

You can do this with the following shell script:

#!/bin/bash
awk '/man1/{pos[$0] = NR}
END {
  for(key in pos) reverse[pos[key]] = key
  for(nr=1;nr<=NR;nr++)
    if(nr in reverse) print reverse[nr]
}' yourfile

Output:

[root@host ~]# sh shell.sh
man1 boy1
man1 boy3
man1 boy4
man1 boy2

Source

With zsh:

$ printf '%s\n' ${(Oau)${(MOa)${(f)"$(<file)"}:#man1*}}
man1 boy1
man1 boy3
man1 boy4
man1 boy2

Those are parameter expansion flags:

  • f: split on newline
  • ${(M)array:#pattern}: expands to the elements matching the pattern
  • Oa: reverse the order of array
  • u: unique

A GNU awk specific solution:

gawk '
    $1 == "man1" {
        # remember the last line number
        x[$0] = NR
    }
    END {
        # traverse the array by sorted numeric values
        PROCINFO["sorted_in"] = "@val_num_asc"
        for (line in x) print line
    }
' file
man1 boy1
man1 boy3
man1 boy4
man1 boy2

As a terse one-liner:

gawk '/man1/{x[$0]=NR}END{PROCINFO["sorted_in"]="@val_num_asc";for(l in x)print l}' file 

References:
http://www.gnu.org/software/gawk/manual/html_node/Array-Sorting-Functions.html#Array-Sorting-Functions
http://www.gnu.org/software/gawk/manual/html_node/Controlling-Scanning.html#Controlling-Scanning
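If gawk is not available, a similar result can be sketched with any POSIX awk plus sort and cut: print each remembered line prefixed by the number of its last occurrence, sort numerically on that prefix, then strip it. This is a different technique from the gawk-specific PROCINFO["sorted_in"] feature used above; the function name keep_last_sorted and the temporary file are hypothetical.

```shell
# Hypothetical helper: decorate each line with its last line number, sort, undecorate.
keep_last_sorted() {
    awk '$1 == "man1" { last[$0] = NR }
         END { for (line in last) print last[line] "\t" line }' "$1" |
        sort -n |   # order by the line number of the last occurrence
        cut -f2-    # drop the numeric prefix (tab-delimited field 1)
}

# Sample input from the question, written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'blah blah' 'blah blah' 'blahblah' 'man1 boy1' 'blah blah' \
    'man1 boy2' 'man1 boy1' 'man1 boy2' 'man1 boy3' 'man1 boy4' 'man1 boy2' > "$sample"

result=$(keep_last_sorted "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

The line numbers are unique, so the numeric sort gives a deterministic order without relying on how awk iterates over the array.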

A two-pass solution. In the first pass, store the record number of the last occurrence of each key in an array. In the second pass, print a record only if its record number exists in that array:

awk 'NR == FNR{if ($0 ~ /man/)x[$0]=NR; next}; FNR == 1{for (k in x) y[x[k]]=k}; (FNR in y)' file file
man1 boy1
man1 boy3
man1 boy4
man1 boy2
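The one-liner above is quite dense, so here is the same two-pass idea spread over several lines with comments. It is only a sketch: the function name keep_last_two_pass, the array names, and the temporary file are made up for illustration.

```shell
# Hypothetical helper: read the file twice, printing only last occurrences.
keep_last_two_pass() {
    awk '
        # Pass 1 (NR == FNR holds only while reading the first copy):
        # remember the record number of the last occurrence of each line.
        NR == FNR { if ($0 ~ /man/) last[$0] = NR; next }

        # Start of pass 2: invert the map so it is keyed by record number.
        FNR == 1 { for (line in last) wanted[last[line]] = line }

        # Pass 2: print a record only if its number was remembered.
        (FNR in wanted)
    ' "$1" "$1"
}

# Sample input from the question, written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'blah blah' 'blah blah' 'blahblah' 'man1 boy1' 'blah blah' \
    'man1 boy2' 'man1 boy1' 'man1 boy2' 'man1 boy3' 'man1 boy4' 'man1 boy2' > "$sample"

result=$(keep_last_two_pass "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Because the file is opened twice, this works on regular files but not on input that can only be read once, such as a pipe.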

Remember to give the file name twice:

awk '!/man1/{next}; NR == FNR {a[$0]++; next}; ++b[$0] == a[$0]' file file 
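As I read it, the command above counts occurrences in the first pass and then, in the second pass, prints a line at the moment its running count reaches that total, which is its last occurrence. Here is a commented sketch of that reading; the function name keep_last_by_count, the array names, and the temporary file are illustrative assumptions.

```shell
# Hypothetical helper: print each matching line only at its final occurrence.
keep_last_by_count() {
    awk '
        # Skip lines that do not mention man1, in both passes.
        !/man1/ { next }

        # Pass 1: count how many times each matching line occurs.
        NR == FNR { total[$0]++; next }

        # Pass 2: print a line only when its running count reaches the total.
        ++seen[$0] == total[$0]
    ' "$1" "$1"
}

# Sample input from the question, written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'blah blah' 'blah blah' 'blahblah' 'man1 boy1' 'blah blah' \
    'man1 boy2' 'man1 boy1' 'man1 boy2' 'man1 boy3' 'man1 boy4' 'man1 boy2' > "$sample"

result=$(keep_last_by_count "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Unlike the record-number approach, this needs no inverted lookup table, at the cost of keeping two counters per distinct line.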
  • $ cat sam2 blah blah blah blah blahblah man1 boy1 blah blah man1 boy2 man1 boy1 man1 boy2 man1 boy3 man1 boy4 man1 boy2 $ awk 'NR == FNR && /man1/ {a[$0]++; next} NR != FNR && ++b[$0] == a[$0]' sam2 sam2 man1 boy1 man1 boy3 man1 boy4 man1 boy2 $ Commented Aug 14, 2014 at 10:44
