sort but keep header line at the top

Question

I am getting output from a program that first produces one line that is a bunch of column headers, and then a bunch of lines of data. I want to cut various columns of this output and view it sorted according to various columns. Without the headers, the cutting and sorting is easily accomplished via the -k option to sort along with cut or awk to view a subset of the columns. However, this method of sorting mixes the column headers in with the rest of the lines of output. Is there an easy way to keep the headers at the top?

I came across the following link. However, I can't get this technique of { head -1; sort; } to work. It always deletes a bunch of the text after the first line. Does anyone know why this happens?
– jonderry, Commented Apr 23, 2011 at 1:02
I suspect it's because head is reading more than one line into a buffer and throwing most of it away. My sed idea had the same problem.
– Andy, Commented Apr 23, 2011 at 1:09
@jonderry - that technique only works with seekable (lseek-able) input, so it won't work when reading from a pipe. It will work if you redirect to a file >outfile and then run { head -n 1; sort; } <outfile
– don_crissti, Commented Sep 26, 2015 at 13:40
@jonderry I wonder if a specific line ending is observed in your particular tool. Some "Windows" command line tools are still coded for text processing of Linux line endings
– Sun, Commented Feb 4, 2020 at 3:25

dessert · Accepted Answer · 2018-03-02 12:11:40Z

95

Stealing Andy's idea and making it a function so it's easier to use:

    # print the header (the first line of input)
    # and then run the specified command on the body (the rest of the input)
    # use it in a pipeline, e.g. ps | body grep somepattern
    body() {
        IFS= read -r header
        printf '%s\n' "$header"
        "$@"
    }

Now I can do:

    $ ps -o pid,comm | body sort -k2
      PID COMMAND
    24759 bash
    31276 bash
    31032 less
    31177 less
    31020 man
    31167 man
    ...
    $ ps -o pid,comm | body grep less
      PID COMMAND
    31032 less
    31177 less

answered Apr 23, 2011 at 0:44 by Mikel; edited Mar 2, 2018 at 12:11 by dessert

ps -C COMMAND may be more appropriate than grep COMMAND, but it's just an example. Also, you can't use -C if you also used another selection option such as -U.
– Mikel, Commented Apr 23, 2011 at 0:51
Renamed from header to body, because you're doing the action on the body. Hopefully that makes more sense.
– Mikel, Commented Apr 23, 2011 at 1:02
Remember to call body on all subsequent pipeline participants: ps -o pid,comm | body grep less | body sort -k1nr
– bishop, Commented Nov 7, 2016 at 20:02
@Tim You can just write <foo body sort -k2 or body sort -k2 <foo. Just one extra character from what you wanted.
– Mikel, Commented Sep 4, 2017 at 13:49
Slight side note: I know this is a generic solution, but I just wanted to point out that the ps command has the ability to sort (at least in some versions). You can do ps -o pid,comm --sort comm and it'll sort by that column. Also --sort -comm will sort in reverse order.
– HerbCSO, Commented Aug 23, 2022 at 22:13
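
bishop's point about chaining can be checked without relying on what ps prints on a given machine. The sketch below repeats the body function from the answer and feeds it a small made-up process list (the sample helper and its PIDs are illustrative only, not part of the original answer).

```shell
# body as defined in the answer: pass the first line through, run "$@" on the rest.
body() {
    IFS= read -r header
    printf '%s\n' "$header"
    "$@"
}

# Made-up input standing in for `ps -o pid,comm`.
sample() {
    printf '%s\n' 'PID COMMAND' '31032 less' '24759 bash' '31177 less'
}

# Every stage that should leave the header alone needs its own body;
# a bare grep would drop the header and a bare sort would move it.
sample | body grep less | body sort -k1nr
```

With this input the header stays first, followed by the two less lines in descending PID order.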

Andy · Accepted Answer · 2011-04-23 01:53:01Z

64

You can keep the header at the top like this with bash:

command | (read -r; printf "%s\n" "$REPLY"; sort)

Or do it with perl:

command | perl -e 'print scalar (<>); print sort { ... } <>'

answered Apr 23, 2011 at 0:32 by Andy; edited Apr 23, 2011 at 1:53

(read;...) seems to lose the spacing between the fields of the header for me. Any suggestions?
– jonderry, Commented Apr 23, 2011 at 1:17
@jonderry: Change read to IFS= read.
– Mikel, Commented Apr 23, 2011 at 1:25
IFS= disables word splitting when reading the input. I don't think it's necessary when reading to $REPLY. echo will expand backslash escapes if xpg_echo is set (not the default); printf is safer in that case. echo $REPLY without quotes will condense whitespace; I think echo "$REPLY" should be okay. read -r is needed if the input may contain backslash escapes. Some of this might depend on bash version.
– Andy, Commented Apr 23, 2011 at 1:50
@Andy: Wow, you're right, different rules for read REPLY; echo $REPLY (strips leading spaces) and read; echo $REPLY (doesn't).
– Mikel, Commented Apr 23, 2011 at 2:44
@Andy: IIRC, the default value of xpg_echo depends on your system, e.g. on Solaris I think it defaults to true. This is why Gilles likes printf so much: it's the only thing with predictable behavior.
– Mikel, Commented Apr 23, 2011 at 2:47
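
The comments above turn on two details of read: IFS= keeps leading and trailing whitespace, and -r keeps backslashes. A small check with a made-up header line (the variable names are illustrative) shows the difference when read is given a variable name; as noted in the comments, bash treats the no-name $REPLY form differently.

```shell
# Made-up header with leading/trailing spaces and a literal backslash.
header='  PID\tCOMMAND  '

# Plain read: surrounding whitespace is trimmed and the backslash is consumed.
plain=$(printf '%s\n' "$header" | { read line; printf '[%s]' "$line"; })

# IFS= read -r: the line comes through unchanged.
safe=$(printf '%s\n' "$header" | { IFS= read -r line; printf '[%s]' "$line"; })

printf '%s\n' "$plain" "$safe"
```

The brackets are only there to make the surviving whitespace visible.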

Michael Kuhn · Accepted Answer · 2013-04-10 12:51:05Z

46

I found a nice awk version that works nicely in scripts:

awk 'NR == 1; NR > 1 {print $0 | "sort -n"}'

answered Apr 10, 2013 at 12:51 by Michael Kuhn

I like this, but it requires a bit of explanation - the pipe is inside the awk script. How does that work? Is it calling the sort command externally? Does anyone know of at least a link to a page explaining pipe use within awk?
– Wildcard, Commented Nov 7, 2015 at 1:24
@Wildcard you can check the official manual page or this primer.
– lapo, Commented Nov 2, 2016 at 19:52
This code fails when I use these arguments to sort: sort -n -k 2b,2 -t $'\t'. The problem is nesting '\t' inside 'NR...{print...}'. The explanation of how to escape the 's is here
– Josh, Commented Mar 28, 2020 at 17:30
For fixed-width output, use the -b option, as it will make sort ignore leading blanks in the sort key. The default field separator is non-blank-to-blank transitions, so fields will start with leading blanks. For example, this command lists installed Python packages first by location, then by package name: pip list -v | awk 'NR <= 2; NR > 2 { print $0 | "sort -b -k 3,3 -k 1,1" };'
– aparkerlue, Commented May 13, 2021 at 16:38
Note, pipes inside awk may need to be followed by close("sort --exact-args...") to prevent buffering from printing this after later prints.
– Excalibur, Commented Dec 29, 2021 at 18:31
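
On Wildcard's question: in awk, print ... | "command" starts the command once through the shell, keeps the pipe open, and sends each printed line to that command's standard input; the command sees end of input when the pipe is closed or awk exits. A sketch of the same one-liner with that made explicit follows. The function name is made up, and the fflush() and close() calls are additions that pin down the output order rather than leaving it to the awk implementation.

```shell
# Hypothetical wrapper: keep the first line, sort the rest numerically.
sort_keep_header_awk() {
    awk '
        # Header line: print it and flush, so it reaches stdout before sort writes.
        NR == 1 { print; fflush(); next }
        # Body lines: awk starts "sort -n" once and feeds every line to its stdin.
        { print | "sort -n" }
        # Closing the pipe sends end-of-input to sort and waits for it to finish.
        END { close("sort -n") }
    '
}

printf '%s\n' 'size' '10' '9' '2' | sort_keep_header_awk
```

The string passed to close() has to match the command string used after the pipe exactly, which is what Excalibur's comment alludes to.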

freeB · Accepted Answer · 2023-07-07 17:56:45Z

The pee command from moreutils is designed for tasks like this.

Example:

To keep one header line, and sort the second (numeric) column in stdin:

<your command> | pee 'head -n 1' 'tail -n +2 | sort -k 2,2 -n'

Explanation:

pee : pipe stdin to one or more commands and concatenate the results.

head -n 1 : Print the first line of stdin.

tail -n +2 : Print the second and following lines from stdin.

sort -k 2,2 -n : Numerically sort by the second column.

Test:

printf "header\na 1\nc 3\nb 2\n" | pee 'head -n 1' 'tail -n +2 | sort -k 2,2 -n'

gives

    header
    a 1
    b 2
    c 3

This is a great solution because it's easily memorizable: I just have to remember pee and then use regular commands I already know like head or sort. That also makes it easily adaptable to other use cases. Thanks a lot!
– Jens Bannmann, Commented Jun 3, 2023 at 8:03

Gilles 'SO- stop being evil' · Accepted Answer · 2020-04-07 16:39:42Z

Hackish but effective: prepend 0 to all header lines and 1 to all other lines before sorting. Strip the prefix after sorting.

… | awk '{print (NR <= 2 ? "0 " : "1 ") $0}' | sort -k 1,1 -k… | cut -b 3-

aka the Decorate-Sort-Undecorate idiom. Not hackish at all IMO.
– Ed Morton, Commented Mar 20, 2024 at 18:39
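
A concrete instance of the decorate-sort-undecorate idiom, assuming a single header line and whitespace-separated columns (the helper name, the sample data and the choice of sort key are made up for illustration). Because the prefix becomes field 1, every original column number shifts up by one in the sort keys, and the first key is limited to field 1 so that the later keys still take effect.

```shell
# Hypothetical helper: keep one header line, sort the body numerically by
# its second column (field 3 once the prefix has been added).
dsu_sort() {
    awk '{ print (NR == 1 ? "0 " : "1 ") $0 }' |   # decorate
        sort -k 1,1 -k 3,3n |                      # sort: prefix first, then data
        cut -c 3-                                  # undecorate: drop the 2-char prefix
}

printf '%s\n' 'name count' 'b 10' 'a 9' 'c 2' | dsu_sort
```

The header keeps its place because "0" sorts before "1"; the body comes out ordered by count.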

Ryan C. Thompson · Accepted Answer · 2011-04-23 06:32:48Z

Here's some magic perl line noise that you can pipe your output through to sort everything but keep the first line at the top: perl -e 'print scalar <>, sort <>;'

could you pls explain why this works?
– törzsmókus, Commented Sep 7, 2020 at 9:44

Wildcard · Accepted Answer · 2015-11-07 01:21:47Z

I tried the command | { head -1; sort; } solution and can confirm that it really screws things up--head reads in multiple lines from the pipe, then outputs just the first one. So only the part of the output that head did not read is passed to sort--NOT the rest of the output starting from line 2!

The result is that you are missing lines (and one partial line!) that were in the beginning of your command output (except you still have the first line) - a fact that is easy to confirm by adding a pipe to wc at the end of the above pipeline - but that is extraordinarily difficult to trace down if you don't know this! I spent at least 20 minutes trying to work out why I had a partial line (first 100 bytes or so cut off) in my output before solving it.

What I ended up doing, which worked beautifully and didn't require running the command twice, was:

    myfile=$(mktemp)
    whatever command you want to run > "$myfile"
    head -1 "$myfile"
    sed 1d "$myfile" | sort
    rm "$myfile"

If you need to put the output into a file, you can modify this to:

    myfile=$(mktemp)
    whatever command you want to run > "$myfile"
    head -1 "$myfile" > outputfile
    sed 1d "$myfile" | sort >> outputfile
    rm "$myfile"
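
The same temporary-file approach can be wrapped in a function that reads standard input, so it can sit at the end of a pipeline. This is only a sketch: the name sort_keep_header_tmp is made up here, and any arguments are assumed to be options for sort.

```shell
# Hypothetical helper: spool stdin to a temp file, then print the header
# and the sorted body from that file.
sort_keep_header_tmp() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"                      # spool all of stdin to a regular file
    head -n 1 "$tmp"                  # header: head may over-read, but the file is reopened below
    tail -n +2 "$tmp" | sort "$@"     # body: everything from line 2 onwards
    rm -f "$tmp"
}

printf '%s\n' 'header' '30' '33' '20' | sort_keep_header_tmp -n
```

Because each utility opens the file itself, it no longer matters how much head reads ahead.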

You can use ksh93's head builtin, or the line utility (on systems that still have one), or GNU sed -u q, or IFS= read -r line; printf '%s\n' "$line", all of which read the input one byte at a time to avoid that.
– Stéphane Chazelas, Commented Jan 11, 2018 at 21:58

don_crissti · Accepted Answer · 2018-01-11 15:15:39Z

1

I think this is easiest.

ps -ef | ( head -n 1 ; sort )

or this which is possibly faster as it does not create a sub shell

ps -ef | { head -n 1 ; sort ; }

Other cool uses

shuffle lines after header row

cat file.txt | ( head -n 1 ; shuf )

reverse lines after header row

cat file.txt | ( head -n 1 ; tac )

answered Nov 5, 2015 at 17:42 by user2449151; edited Jan 11, 2018 at 15:15 by don_crissti

See unix.stackexchange.com/questions/11856/…. This is not actually a good solution.
– Wildcard, Commented Nov 6, 2015 at 21:43
Not working, cat file | { head -n 1 ; sort ; } > file2 only show head
– Peter Krauss, Commented Jul 6, 2018 at 19:19
Does not work, only shows header line.
– Dave, Commented Jan 3, 2025 at 20:53

Jatsui · Accepted Answer · 2015-11-04 01:22:32Z

0

Simple and straightforward!

<command> | head -n 1; <command> | sed 1d | sort <....>

sed nd ---> 'n' specifies line no., and 'd' stands for delete.

answered Nov 3, 2015 at 18:05 by Jatsui; edited Nov 4, 2015 at 1:22

Just as jofel commented a year and a half ago on Sarva's answer, this starts command twice. So not really suitable for use in a pipeline.
– Wildcard, Commented Nov 6, 2015 at 2:36

Robert · Accepted Answer · 2017-01-19 08:59:08Z

I came here looking for a solution for the command w. This command shows details of who is logged in and what they are doing.

To show the results sorted, but with the headers kept at the top (there are 2 lines of headers), I settled on:

w | head -n 2; w | tail -n +3 | sort

Obviously this runs the command w twice and therefore may not be suitable for all situations. However, to its advantage it is substantially easier to remember.

Note that the tail -n +3 means 'show all lines from the 3rd onwards' (see man tail for details).

Aphoid · Accepted Answer · 2023-05-30 14:26:03Z

Expanding on @Mikel's answer, here is a version of the body() function that adds a few features:

It detects if there is input coming in on a pipe, and if not prints out usage information to STDERR.
If no command is given, it uses sort as the default.
If the first parameter is a number, it uses that number as the number of header lines (default 1)

In testing, it works on Linux bash and macOS zsh

I made a gist at github: https://gist.github.com/alanhoyle/7ec6bd445a790b62567d8b1ff6941c66

Thus:

    body() {
        local HEADER_LINES=1
        local COMMAND="sort"
        if [ -t 0 ]; then
            >&2 echo "ERROR: body requires piped input!"
            >&2 echo "body: prints the header from a STDIN and sends the 'body' to another command for"
            >&2 echo " additional processing. Useful for sort/grep when you want to keep headers"
            >&2 echo "USAGE: COMMAND | body [ N ] [ COMMAND_TO_PROCESS_OUTPUT ]"
            >&2 echo " if the first parameter N is a whole number, it prints that number of lines"
            >&2 echo " before proceeding [ default: skip $HEADER_LINES ]"
            >&2 echo " if the [ COMMAND_TO PROCESS_OUTPUT ] is omitted, '$COMMAND' is used"
            return 1
        fi
        local re='^[0-9]+$'
        if [[ $1 =~ $re ]] ; then
            HEADER_LINES=$1
            shift
            >&2 echo "body: skipping $HEADER_LINES"
        fi
        local THIS_COMMAND=$@
        if [ -z "$THIS_COMMAND" ] ; then
            >&2 echo "body: running default $COMMAND"
        fi
        for line in $(eval echo "{1..$HEADER_LINES}")
        do
            IFS= read -r header
            printf '%s\n' "$header"
        done
        if [ -z "$THIS_COMMAND" ] ; then
            ( $COMMAND )
        else
            "$@"
        fi
    }

Example:

    $ body
    ERROR: body requires piped input!
    body: prints the header from a STDIN and sends the 'body' to another command for
     additional processing. Useful for sort/grep when you want to keep headers
    USAGE: COMMAND | body [ N ] [ COMMAND_TO_PROCESS_OUTPUT ]
     if the first parameter N is a whole number, it prints that number of lines
     before proceeding [ default: skip 1 ]
     if the [ COMMAND_TO PROCESS_OUTPUT ] is omitted, 'sort' is used
    $ echo -e "header\n30\n33\n20"
    header
    30
    33
    20
    $ echo -e "header\n30\n33\n20" | body
    body: running sort by default
    header
    20
    30
    33
    $ echo -e "header\n30\n33\n20" | body grep 0
    header
    30
    20
    $ echo -e "header\n30\n33\n20" | body 2
    body: skipping 2
    body: running sort by default
    header
    30
    20
    33

Stéphane Chazelas · Accepted Answer · 2023-07-07 20:20:42Z

Basically, you need something that reads one line and only one line from the input, outputs it, and then leaves the rest of the input to sort.

There are quite a few utilities that can read one line and print it:

head -n 1
sed q
awk '{print; exit}'

But most implementations of those read their input in chunks, and will generally end up reading more than one line. On seekable input, they're able to rewind upon exit to just after the first line, but they can't do that on pipes or other non-seekable input.

You need a utility that guarantees it won't read past the end of the first line. The options are:

line: it used to be a standard utility but was obsoleted by POSIX on the grounds that it was redundant with the read builtin of sh. It reads a line one byte at a time and outputs it. It always outputs a line, even when the input is empty or contains only a non-delimited line.
sed -u q: some sed implementations support a -u option for unbuffered operation, and some of those that support it also read their input one byte at a time when it is given. You also need a sed implementation that doesn't read one line in advance when the $ address is not used, which probably doesn't leave many implementations besides GNU sed. GNU sed also outputs a full line if the input only had a non-delimited line.
IFS= read -r line: that reads up to one line and is guaranteed not to read past the end of the line. Except for zsh's read builtin, it can't cope with NUL bytes. It doesn't print the line it has read, but you can use printf for that. With zsh, read -re reads the line and echoes it; it adds a newline character if missing on input.

So your best bet in sh-like shells would be:

    sort_body() (
        if IFS= read -r line; then
            printf '%s\n' "$line" && exec sort "$@"
        else
            # no input or only a non-delimited header line
            printf %s "$line"
            # no point in running sort as there's no input left
        fi
    )

Then:

cmd | sort_body -nk1,1 .. <file sort_body -u

(not ~~sort_body -u file~~, the thing to sort has to be passed on sort_body's stdin).
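
The same idea extends to several header lines, such as the two-line header that w prints. The following is only a sketch: sort_body_n is a made-up name, its first argument is assumed to be a non-negative count of header lines, and any remaining arguments are passed to sort.

```shell
# Hypothetical variant of sort_body: copy the first $1 lines through
# unchanged, then hand whatever is left on stdin to sort.
sort_body_n() (
    n=$1
    shift
    while [ "$n" -gt 0 ]; do
        if IFS= read -r line; then
            printf '%s\n' "$line"
        else
            # Input ended inside the header: emit any unterminated text and stop.
            printf %s "$line"
            exit 0
        fi
        n=$((n - 1))
    done
    exec sort "$@"
)

printf '%s\n' 'h1' 'h2' '3' '1' '2' | sort_body_n 2 -n
```

As with sort_body, the data has to arrive on standard input, because only the shell's read is relied on to stop exactly at each newline.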

Stéphane Chazelas · Accepted Answer · 2023-07-07 20:46:33Z

If those are CSVs or TSVs (or more see manual), that sounds like a job for mlr (miller).

Like with a file looking like:

    $ cat /usr/share/distro-info/debian.csv
    version,codename,series,created,release,eol,eol-lts,eol-elts
    1.1,Buzz,buzz,1993-08-16,1996-06-17,1997-06-05
    1.2,Rex,rex,1996-06-17,1996-12-12,1998-06-05
    1.3,Bo,bo,1996-12-12,1997-06-05,1999-03-09
    2.0,Hamm,hamm,1997-06-05,1998-07-24,2000-03-09
    2.1,Slink,slink,1998-07-24,1999-03-09,2000-10-30
    2.2,Potato,potato,1999-03-09,2000-08-15,2003-07-30
    [...]

    $ mlr --ragged --csv cut -f codename,created then sort -f codename /usr/share/distro-info/debian.csv
    codename,created
    Bo,1996-12-12
    Bookworm,2021-08-14
    Bullseye,2019-07-06
    Buster,2017-06-17
    Buzz,1993-08-16
    Etch,2005-06-06
    [...]

That is, the order is not only preserved, but the field names in there can also be used in the cut or sort specifications.

jubilatious1 · Accepted Answer · 2023-07-07 21:26:26Z

Using Raku (formerly known as Perl_6)

    ~$ raku -e '.put for "\x0061".."\x07A";' | raku -e 'put get; .put for lines.sort.reverse.head(10);'
    #OR
    ~$ raku -e '.put for "\x0061".."\x07A";' | raku -e 'put lines[0]; .put for lines[1..*].sort.reverse.head(10);'

Sample Input: English alphabet, one letter per line

Sample Output (truncated to first 10 lines via .head(10)):

    a
    z
    y
    x
    w
    v
    u
    t
    s
    r
    q

Answering this to complement Perl answers already posted. The put get call 1. 'gets' a single line and out-'puts' it, then 2. advances the read cursor so the first line isn't read again (e.g. by lines). If you need to read a 2-line header (for example), use (put get) xx 2.

When sorting a file, sometimes you want to filter a little first--an example is removing blank lines. That's easy with Raku, simply interpose a call to .map({$_ if .chars}) after the call to lines (and before the call to sort).

A nice advantage of Raku is built-in, high-level support for Unicode. A Cyrillic alphabet equivalent of the Raku code at top is as follows:

~$ raku -e '.put for "\x0430".."\x044F";' | raku -e 'put get; .put for lines.sort.reverse.head(10);'

OR, taking input off the command line:

    ~$ raku -e '.put for "\x0430".."\x044F";' > Cyrillic.txt
    ~$ raku -e 'put lines[0]; .put for lines[1..*].sort.reverse.head(10);' Cyrillic.txt

Sample Output (either Cyrillic example above):

    а
    я
    ю
    э
    ь
    ы
    ъ
    щ
    ш
    ч
    ц

See URLs below for further discussion on the Raku/Perl6 mailing list regarding how to translate Perl(5) file-input idioms into Raku.

https://www.nntp.perl.org/group/perl.perl6.users/2018/11/msg6295.html
https://www.nntp.perl.org/group/perl.perl6.users/2019/07/msg6825.html

https://raku.org

Sarva · Accepted Answer · 2014-05-20 11:48:51Z

-1

command | head -1; command | tail -n +2 | sort

answered May 20, 2014 at 11:48 by Sarva

This starts command two times. Therefore it is limited to some specific commands. However, for the requested ps command in the example, it would work.
– jofel, Commented May 20, 2014 at 12:00

Kevdog777 · Accepted Answer · 2018-01-12 08:30:05Z

-3

Try doing:

wc -l file_name | tail -n $(awk '{print $1-1}') file_name | sort

answered Jan 11, 2018 at 21:50 by Barry; edited Jan 12, 2018 at 8:30 by Kevdog777

i do not get it
– Pierre.Vriens, Commented Jan 11, 2018 at 22:17
