Sorting in bash

Question

I have been trying to get the unique values in each column of a tab delimited file in bash. So, I used the following command.

cut -f <column_number> <filename> | sort | uniq -c

It works fine, and I get the unique values in a column along with their counts, like:

105 Linux
55 MacOS
500 Windows

What I want to do is, instead of sorting by the column values (which in this example are OS names), sort by count, and possibly have the count in the second column of the output. So it would have to look like:

Windows 500
Linux 105
MacOS 55

How do I do this?

paxdiablo · Accepted Answer · 2023-06-06 00:02:38Z

You can use (where N is the column number and F is the input file):

cut -f N F |sort |uniq -c |sort -nrk1,1 |awk '{print $2" "$1}'

The initial sort/uniq is to get each OS in the form <count> <os> so that the rest of the pipeline can work on it.

The sort -nrk1,1 sorts numerically (n), in reverse order (r), using the first field (-k1,1).
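To see what that second sort does in isolation, here is a minimal sketch that feeds it hand-written lines shaped like uniq -c output. The function name sort_by_count is made up for illustration, and the sample counts are the ones from the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sort "<count> <name>" lines by count, largest first.
sort_by_count() {
    # -n: compare numerically, -r: reverse (descending),
    # -k1,1: the sort key starts and ends at field 1
    sort -nrk1,1
}

# Sample lines shaped like `uniq -c` output (counts from the question).
printf '%s\n' '105 Linux' '55 MacOS' '500 Windows' | sort_by_count
```

Without -n the key would be compared as text, character by character, so 55 would rank above 500 in the reversed order.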

The awk then simply reverses the order of the columns. You can test the full pipeline with the following:

pax> cat test.in
a	Windows
b	Linux
c	Windows
d	Windows
e	Linux
f	Windows
g	MacOS
h	Linux
i	Windows
j	MacOS
k	Windows
l	Linux
m	MacOS
n	Windows
o	Linux
p	MacOS
q	Windows
r	Linux
s	Linux
t	Linux
u	Linux
v	Linux
pax> cut -f2 test.in |sort |uniq -c |sort -nrk1,1 |awk '{print $2" "$1}'
Linux 10
Windows 8
MacOS 4

This test file format is similar in style to your own input, including tabs separating the fields. It's unlikely to be the exact same format so you'll need to tailor the cut command to your own file, in such a way that it only gives you the desired column.

However, you've probably already done that since that's not the bit you're asking about.
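If the pipeline gets used often, it could be wrapped in a small function. This is only a sketch: the name count_column, its argument order, and the throwaway demo file are assumptions made for illustration, not anything standard.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the pipeline above.
# Usage: count_column <column_number> <file>
count_column() {
    local col=$1 file=$2
    cut -f "$col" "$file" |      # pick the tab-separated column
        sort |                   # group identical values together
        uniq -c |                # collapse each group to "<count> <value>"
        sort -nrk1,1 |           # highest count first
        awk '{print $2" "$1}'    # swap to "<value> <count>"
}

# Small demo with a temporary tab-separated file.
tmp=$(mktemp "${TMPDIR:-/tmp}/count_column.XXXXXX")
printf '%s\t%s\n' a Windows b Linux c Windows d MacOS e Windows f Linux > "$tmp"
count_column 2 "$tmp"
rm -f "$tmp"
```

One caveat: awk splits on whitespace, so $2 holds only the first word of each value. A column whose values contain spaces would need a different final step.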

sourcerebels · Accepted Answer · 2010-08-18 08:43:09Z

Mine:

cut -f <column_number> <filename> | sort | uniq -c | awk '{ print $2" "$1}' | sort

This will alter the column order (awk) and then just sort the output.

Hope this will help you

2 Comments

Dennis Williamson Over a year ago

That sorts by name rather than count.

sourcerebels Over a year ago

Sure, from sfactor's question: "What I want to do is instead of sorting by the column value names"

karthiksatyanarayana · Accepted Answer · 2019-04-19 12:03:18Z

Using sed with tagged regular expressions (capture groups):

cut -f <column_number> <filename> | sort | uniq -c | sort -r -k1 -n | sed 's/^ *\([0-9][0-9]*\) *\(.*\)/\2 \1/'

Doesn't produce output in a neat format though.
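If aligned columns matter, one option is to let awk pad the name field with printf. A minimal sketch, assuming names fit in 10 characters and contain no spaces; format_counts is a made-up name and the input lines are hand-written to resemble uniq -c output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn "<count> <name>" lines into aligned "name count" columns.
format_counts() {
    # %-10s left-aligns the name and pads it to 10 characters (arbitrary width).
    awk '{ printf "%-10s %s\n", $2, $1 }'
}

# Leading spaces mimic the padding that `uniq -c` puts before the count.
printf '%s\n' '    500 Windows' '    105 Linux' '     55 MacOS' | format_counts
```

Since awk ignores leading whitespace when splitting fields, the padding from uniq -c needs no special handling here.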
