190

I have a text file that contains a long list of entries (one on each line). Some of these are duplicates, and I would like to know if it is possible (and if so, how) to remove any duplicates. I am interested in doing this from within vi/vim, if possible.


16 Answers

439

If you're OK with sorting your file, you can use:

:sort u 
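Worth noting: :sort u also takes a range and combines with vim's other sort flags (see :help :sort); a few sketches:

:sort u        sort the whole buffer, keeping only the first of identical lines
:5,10sort u    sort and deduplicate lines 5 through 10 only
:sort iu       ignore case both when sorting and when detecting duplicates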

10 Comments

If sorting is unacceptable, use :%!uniq to simply remove duplicate entries without sorting the file.
Once you use the command the whole file changes? How do you go back? I already saved the file by mistake... my bad
Just use Vim's undo command: u
@cryptic0, uniq won't work unless the duplicates are adjacent; on a$b$a$ it does nothing
You can select the lines you want sorted and deduplicated first with V or something similar, then issue the command.
44

Try this:

:%s/^\(.*\)\(\n\1\)\+$/\1/ 

It searches for any line immediately followed by one or more copies of itself, and replaces it with a single copy.

Make a copy of your file though before you try it. It's untested.
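For anyone puzzling over the pattern, here is my own annotation of the pieces (not part of the original answer):

:%s/^\(.*\)\(\n\1\)\+$/\1/
"  ^\(.*\)      capture an entire line
"  \(\n\1\)\+   one or more newline-plus-exact-copy of that line
"  $            end of the last copy
"  \1           replace the whole run with a single copy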

7 Comments

@hop Thanks for testing it for me. I didn't have access to vim at the time.
This highlights all the duplicate lines for me but doesn't delete them; am I missing a step here?
I'm pretty sure this will also highlight a line followed by a line that has the same "prefix" but is longer.
The only issue with this is that if you have multiple duplicates (3 or more of the same line), you have to run it several times until all dups are gone, since it only removes one set of dups per pass.
Another drawback of this: this won't work unless your duplicate lines are already next to each other. Sorting first would be one way of ensuring they're next to each other. At that point, the other answers are probably better.
32

From the command line just do:

sort file | uniq > file.new 
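As an aside, POSIX sort can fold the uniq step into itself with its -u flag, so this should be equivalent:

sort -u file > file.new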

5 Comments

This was very handy for me for a huge file. Thanks!
Couldn't get the accepted answer to work, as :sort u was hanging on my large file. This worked very quickly and perfectly. Thank you!
'uniq' is not recognized as an internal or external command, operable program or batch file.
Yes -- I tried this technique on a 2.3 GB file, and it was shockingly quick.
@hippietrail Are you on a Windows PC? Maybe you can use Cygwin.
15

awk '!x[$0]++' yourfile.txt if you want to preserve the order (i.e., sorting is not acceptable). In order to invoke it from vim, :! can be used.
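To unpack the idiom (my reading, not the answerer's): x[$0] counts how many times each whole line ($0) has been seen, and the ! makes the pattern true only while the count is still zero, i.e. on the first occurrence, which awk then prints by default. Filtering the whole buffer through it from within vim would look like this (the backslash stops vim from expanding ! into the previous command):

:%!awk '\!x[$0]++'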

3 Comments

This is lovely! Not needing to sort is exactly what I was looking for!
what does it do?
This can also be done in perl if it strikes your fancy perl -nle 'print unless $seen{$_}++' yourfile.txt
6

I would combine two of the answers above:

1G       go to head of file
!Gsort   sort the whole file
1G
!Guniq   remove duplicate entries with uniq

If you were interested in seeing how many duplicate lines were removed, use control-G before and after to check on the number of lines present in your buffer.
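If you prefer ex commands to normal-mode filters, the same pipeline can be issued in one line:

:%!sort | uniq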

1 Comment

'uniq' is not recognized as an internal or external command, operable program or batch file.
6
g/^\(.*\)$\n\1/d 

Works for me on Windows. Lines must be sorted first though.
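To guard against the prefix problem pointed out in the comment below, anchoring the back-reference as well should require the whole next line to match; an untested sketch:

:g/^\(.*\)$\n\1$/d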

1 Comment

This will delete a line that is a prefix of the line after it: aaaa followed by aaaabb will erroneously delete aaaa.
5

If you don't want to sort/uniq the entire file, you can select the lines you want to make unique in visual mode and then simply run: :sort u.
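For the record, pressing : while a visual selection is active makes vim pre-fill the range marks, so the command line ends up reading:

:'<,'>sort u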

1 Comment

If you know the line numbers you want sorted and deduplicated, you can prefix the command with the starting and ending line numbers; e.g., to sort+unique lines 5 through 10 the command would be :5,10 sort u
4

Select the lines in visual-line mode (Shift+v), then :!uniq. That'll only catch duplicates which come one after another.

2 Comments

Just to note, this will only work on computers with the uniq program installed, e.g. Linux, Mac, FreeBSD, etc.
This will be the best answer for those who don't need sorting. And if you are a Windows user, consider trying Cygwin or MSYS.
1

Regarding how Uniq can be implemented in VimL, search for Uniq in a plugin I'm maintaining. You'll see various ways to implement it that were given on the Vim mailing list.

Otherwise, :sort u is indeed the way to go.

Comments

1

An alternative method that does not use vi/vim (useful for very large files) is to use sort and uniq from the Linux command line:

sort {file-name} | uniq 
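Since uniq's flags are easy to mix up, a quick contrast (standard POSIX behaviour):

sort file | uniq       # keep one copy of every line
sort file | uniq -d    # print only the lines that had duplicates, once each
sort file | uniq -u    # print only the lines that were never duplicated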

Comments

1

From here this will remove adjacent and non-adjacent duplicates without sorting:

:%!awk '\!a[$0]++' 

This technically uses something outside of vim, but it is called from within vim (and therefore only works where awk is available, e.g. Linux or macOS).

To do this entirely from within vim, you can use a macro and the :norm command to execute it on every line. On Linux this was fast, but on Windows it took an oddly long time. Disabling plugins using vim -u NONE seemed to help somewhat.

qa                    # create macro in register 'a'
y$                    # yank the current line
:.+1,$g/<ctrl-r>0/d   # from the next line to the end of file, delete any pattern that matches
q                     # end of macro
:%norm! @a            # apply macro on every line in file

Note this doesn't remove empty lines, so performing

:g/^$/d 

to remove any blank lines afterwards may be useful.

Comments

0
:%s/^\(.*\)\(\n\1\)\+$/\1/gec 

or

:%s/^\(.*\)\(\n\1\)\+$/\1/ge 

This is my answer for you: it removes multiple duplicate lines while keeping one copy, rather than removing them all. The e flag suppresses the error when no duplicates are found, and the c variant asks for confirmation before each change.

Comments

0

I would use !}uniq, but that only works if there are no blank lines.

To apply it to every line in the file, use: :1,$!uniq.
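Since :1,$ is the same range as %, this can be shortened to:

:%!uniq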

Comments

0

This version only removes repeated lines that are contiguous; that is, it only deletes consecutive duplicate lines. With the given mapping, the function does not mess with blank lines, but if you change the regex to match the start of line (^) it will also remove duplicated blank lines.

" function to delete duplicate lines function! DelDuplicatedLines() while getline(".") == getline(line(".") - 1) exec 'norm! ddk' endwhile while getline(".") == getline(line(".") + 1) exec 'norm! dd' endwhile endfunction nnoremap <Leader>d :g/./call DelDuplicatedLines()<CR> 

Comments

0

This command got me a buffer without any duplicate lines and without sorting, and it shouldn't be very hard to research why it works or how it could work better:

:%!python3.11 -c 'exec("import fileinput\nLINES = []\nfor line in fileinput.input():\n line = line.splitlines()[0]\n if line not in LINES:\n print(line)\n LINES.append(line)\n")' 
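For readability, here is the same logic as a standalone script; a sketch, where dedup.py is my placeholder name and the buffer is filtered through it with :%!python3 dedup.py:

import fileinput

# keep only the first occurrence of each line, preserving order
seen = []
for line in fileinput.input():
    line = line.rstrip("\n")
    if line not in seen:
        print(line)
        seen.append(line)

(A set would be faster than a list for large files; the original one-liner uses a list, so this sketch keeps it.)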

Comments

-1

This worked for me for both .csv and .txt

awk '!seen[$0]++' <filename> > <newFileName>

Explanation: the first part of the command, awk '!seen[$0]++' <filename>, prints only the first occurrence of each row; the second part, i.e. everything after the middle arrow (> <newFileName>), saves that output to the new file.

Comments
