Show line numbers of a file after "cut -d ";" -f3 |uniq -d"

Question

I am trying to make a simple command that can show me the duplicate data from one specific column and also give me the original line number.

Example of file:

JENNIE;30;DOCTOR;F
SARA;26;POLICE;F
EDWARD;32;TEACHER;M
ROBERT;44;POLICE;M

With the following command I will get the duplicates from column 3

cat FILE.txt |cut -d ";" -f3 |sort |uniq -d

The problem is that I need to get the original line number of the results.

My command shows:

POLICE
POLICE

And I want to get

2- POLICE
4- POLICE

Is the dash/hyphen an important part of your output, or would you be happy with "2 POLICE" etc.?
– Jeff Schaller ♦, Commented Apr 26, 2019 at 17:39
Is that really the output you get? I just get a single POLICE, and only if I sort first.
– jesse_b, Commented Apr 26, 2019 at 17:42
Umm... Your command outputs nothing, as there are no consecutive lines that contain duplicated data in the 3rd column.
– Kusalananda ♦, Commented Apr 26, 2019 at 17:43

Stéphane Chazelas · Accepted Answer · 2019-04-26 18:02:04Z

With GNU sort and GNU uniq, you could do:

$ <FILE.txt awk -F';' '{print NR"- "$3}' | sort -st' ' -k2 | uniq -Df1
2- POLICE
4- POLICE

Lines are sorted first lexically on the text and then by number (-s preserves the original order for texts that sort the same). Add a | sort -n to sort by line number.
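As a minimal sketch of the pipeline with the final sort -n appended, assuming GNU sort and GNU uniq (uniq -D is a GNU extension) and the four-line sample from the question, recreated here in a temporary file so the snippet runs on its own:

```shell
# Recreate the sample data from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'JENNIE;30;DOCTOR;F' 'SARA;26;POLICE;F' \
              'EDWARD;32;TEACHER;M' 'ROBERT;44;POLICE;M' > "$tmp"

# 1. awk prefixes each 3rd field with its line number, e.g. "2- POLICE".
# 2. sort -s -t' ' -k2 groups equal professions; -s keeps input order within a group.
# 3. uniq -D -f1 skips the first field and prints every line whose remainder repeats.
# 4. sort -n puts the surviving lines back in line-number order.
result=$(awk -F';' '{print NR"- "$3}' "$tmp" | sort -st' ' -k2 | uniq -Df1 | sort -n)
printf '%s\n' "$result"

rm -f "$tmp"
```

With only one duplicated profession in the sample, the trailing sort -n changes nothing; it matters once several professions repeat and their lines are interleaved in the file.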

With awk alone:

awk -F';' '!x {c[$3]++}; x && c[$3] > 1 {print FNR"- "$3}' FILE.txt x=1 FILE.txt
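For readers less familiar with awk, here is a commented sketch of the same two-pass idea. The file is named twice on the command line, so awk reads it twice, and the x=1 operand between the two names is a variable assignment that takes effect before the second read. The array is renamed from c to count purely for readability, and the temporary file stands in for FILE.txt.

```shell
# Recreate the sample data from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'JENNIE;30;DOCTOR;F' 'SARA;26;POLICE;F' \
              'EDWARD;32;TEACHER;M' 'ROBERT;44;POLICE;M' > "$tmp"

# x is unset (false) during the first read and 1 (true) during the second.
result=$(awk -F';' '
    !x { count[$3]++ }                        # pass 1: count each profession
    x && count[$3] > 1 { print FNR"- "$3 }    # pass 2: print lines whose profession repeats
' "$tmp" x=1 "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

FNR is used rather than NR because NR keeps counting across both reads, whereas FNR restarts at 1 for each file operand. The output is already in line-number order, so no extra sort is needed.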

jesse_b · Accepted Answer · 2019-04-26 18:39:23Z

It seems unlikely that your current pipeline works the way you claim; it does not with either BSD or GNU tools. I'm not sure whether you are using something different.

I was able to come up with the following loop to accomplish what you are asking:

for prof in $(cut -d\; -f3 FILE.txt | sort | uniq -d); do
    awk -v pat="$prof" -F\; '$3 ~ pat{print NR"-",$3}' FILE.txt
done

This will produce a list of professions that appear more than once and then use awk to find each occurrence of them in the file, printing the line number and profession name.

On each iteration, the profession produced by the cut -d\; -f3 FILE.txt | sort | uniq -d pipeline is passed to awk as the variable pat. awk then searches the file for lines whose 3rd field matches that pattern (using ; as the field separator). For each matching line it prints the line number followed by a dash, then a space and the 3rd field.
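One caveat: $3 ~ pat treats pat as a regular expression, so POLICE would also match a hypothetical POLICEMAN entry, and the for loop splits its list on whitespace, which would break a profession containing a space. The following is a sketch of one possible variation, not the loop above, that compares with == and reads the list line by line; the temporary file stands in for FILE.txt.

```shell
# Recreate the sample data from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'JENNIE;30;DOCTOR;F' 'SARA;26;POLICE;F' \
              'EDWARD;32;TEACHER;M' 'ROBERT;44;POLICE;M' > "$tmp"

result=$(
    # List each profession that occurs more than once, one per line.
    cut -d';' -f3 "$tmp" | sort | uniq -d |
    while IFS= read -r prof; do
        # Exact string comparison on the 3rd field instead of a regex match.
        awk -v pat="$prof" -F';' '$3 == pat { print NR"-", $3 }' "$tmp"
    done
)
printf '%s\n' "$result"

rm -f "$tmp"
```

Like the original loop, this rereads the file once per duplicated profession, so the output is grouped by profession rather than ordered by line number.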

Wow, that's amazing, but I'm a dummy with awk; I'm not sure how it works, haha.
– user350193, Commented Apr 26, 2019 at 18:19
