I am getting output from a program that first produces one line that is a bunch of column headers, and then a bunch of lines of data. I want to cut various columns of this output and view it sorted according to various columns. Without the headers, the cutting and sorting is easily accomplished via the -k option to sort along with cut or awk to view a subset of the columns. However, this method of sorting mixes the column headers in with the rest of the lines of output. Is there an easy way to keep the headers at the top?
16 Answers
Stealing Andy's idea and making it a function so it's easier to use:
    # print the header (the first line of input)
    # and then run the specified command on the body (the rest of the input)
    # use it in a pipeline, e.g. ps | body grep somepattern
    body() {
        IFS= read -r header
        printf '%s\n' "$header"
        "$@"
    }

Now I can do:

    $ ps -o pid,comm | body sort -k2
      PID COMMAND
    24759 bash
    31276 bash
    31032 less
    31177 less
    31020 man
    31167 man
    ...

    $ ps -o pid,comm | body grep less
      PID COMMAND
    31032 less
    31177 less

- ps -C COMMAND may be more appropriate than grep COMMAND, but it's just an example. Also, you can't use -C if you also used another selection option such as -U. – Mikel, Apr 23, 2011
- Renamed from header to body, because you're doing the action on the body. Hopefully that makes more sense. – Mikel, Apr 23, 2011
- Remember to call body on all subsequent pipeline participants: ps -o pid,comm | body grep less | body sort -k1nr – bishop, Nov 7, 2016
- @Tim You can just write <foo body sort -k2 or body sort -k2 <foo. Just one extra character from what you wanted. – Mikel, Sep 4, 2017
- Slight side note: I know this is a generic solution, but I just wanted to point out that the ps command has the ability to sort (at least in some versions). You can do ps -o pid,comm --sort comm and it'll sort by that column. Also --sort -comm will sort in reverse order. – HerbCSO, Aug 23, 2022
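bishop's point about calling body at every stage can be tried on fixed input instead of live ps output. The following is only a sketch: the sample rows are made up for illustration, and the function is the same body as in the answer.

```shell
# Same helper as in the answer: pass the first line through, run "$@" on the rest.
body() {
    IFS= read -r header
    printf '%s\n' "$header"
    "$@"
}

# Each stage that should leave the header alone is wrapped in body.
# Without the first body, grep would drop the header (it does not contain "less").
printf 'PID COMMAND\n31 less\n24 bash\n32 less\n' |
    body grep less |
    body sort -k1,1nr
```

With this input the header stays first and the two less rows follow in descending PID order.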
You can keep the header at the top like this with bash:
    command | (read -r; printf "%s\n" "$REPLY"; sort)

Or do it with perl:

    command | perl -e 'print scalar (<>); print sort { ... } <>'

- (read;...) seems to lose the spacing between the fields of the header for me. Any suggestions? – jonderry, Apr 23, 2011
- @jonderry: Change read to IFS= read. – Mikel, Apr 23, 2011
- IFS= disables word splitting when reading the input. I don't think it's necessary when reading to $REPLY. echo will expand backslash escapes if xpg_echo is set (not the default); printf is safer in that case. echo $REPLY without quotes will condense whitespace; I think echo "$REPLY" should be okay. read -r is needed if the input may contain backslash escapes. Some of this might depend on bash version. – Andy, Apr 23, 2011
- @Andy: Wow, you're right, different rules for read REPLY; echo $REPLY (strips leading spaces) and read; echo $REPLY (doesn't). – Mikel, Apr 23, 2011
- @Andy: IIRC, the default value of xpg_echo depends on your system, e.g. on Solaris I think it defaults to true. This is why Gilles likes printf so much: it's the only thing with predictable behavior. – Mikel, Apr 23, 2011
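The difference the comments describe between a plain read into a named variable and IFS= read -r can be seen with a header that has leading spaces and a literal backslash. This is a small sketch; read_plain and read_strict are names invented here, not standard commands.

```shell
# Plain read into a named variable: leading/trailing blanks are stripped
# and a backslash is treated as an escape character (so it disappears).
read_plain() {
    read line
    printf '[%s]\n' "$line"
}

# IFS= read -r: the line is stored byte for byte, blanks and backslashes included.
read_strict() {
    IFS= read -r line
    printf '[%s]\n' "$line"
}

# The single quotes keep the backslash literal: the input is "  PID  a\tb".
printf '%s\n' '  PID  a\tb' | read_plain
printf '%s\n' '  PID  a\tb' | read_strict
```

The brackets make the lost leading blanks visible in the first result, which is why the header printing in these answers uses IFS= read -r together with printf.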
I found a nice awk version that works nicely in scripts:
    awk 'NR == 1; NR > 1 {print $0 | "sort -n"}'

- I like this, but it requires a bit of explanation - the pipe is inside the awk script. How does that work? Is it calling the sort command externally? Does anyone know of at least a link to a page explaining pipe use within awk? – Wildcard, Nov 7, 2015
- @Wildcard you can check the official manual page or this primer. – lapo, Nov 2, 2016
- For fixed-width output, use the -b option, as it will make sort ignore leading blanks in the sort key. The default field separator is non-blank-to-blank transitions, so fields will start with leading blanks. For example, this command lists installed Python packages first by location, then by package name: pip list -v | awk 'NR <= 2; NR > 2 { print $0 | "sort -b -k 3,3 -k 1,1" };' – aparkerlue, May 13, 2021
- Note, pipes inside awk may need to be followed by close("sort --exact-args...") to prevent buffering from printing this after later prints. – Excalibur, Dec 29, 2021
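To Wildcard's question: in awk, print | "command" starts the external command the first time that exact string is used and keeps writing to the same pipe afterwards; close() with the same string ends it. A hedged sketch of that, wrapped in a function named awk_sort_body (a name made up here), with the command string kept in one variable so that print and close are guaranteed to use the same text:

```shell
# Keep the first line, send everything else through an external sort run by awk.
# Simplification: the sort options are joined into one string, so options
# containing spaces or backslashes are not handled.
awk_sort_body() {
    awk -v cmd="sort $*" '
        NR == 1 { print; fflush(); next }  # header: write and flush before sort produces anything
        { print | cmd }                     # first use starts cmd, later lines go to the same pipe
        END { close(cmd) }                  # close the pipe so sort finishes and writes its output
    '
}

printf 'size\n30\n4\n200\n' | awk_sort_body -n
```

Keeping the command in a variable is one way to follow Excalibur's note, since close() only matches a pipe opened with an identical string.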
The pee command from moreutils is designed for tasks like this.
Example:
To keep one header line, and sort the second (numeric) column in stdin:

    <your command> | pee 'head -n 1' 'tail -n +2 | sort -k 2,2 -n'

Explanation:
pee : pipe stdin to one or more commands and concatenate the results.
head -n 1 : Print the first line of stdin.
tail -n +2 : Print the second and following lines from stdin.
sort -k 2,2 -n : Numerically sort by the second column.
Test:

    printf "header\na 1\nc 3\nb 2\n" | pee 'head -n 1' 'tail -n +2 | sort -k 2,2 -n'

gives

    header
    a 1
    b 2
    c 3

- This is a great solution because it's easily memorizable: I just have to remember pee and then use regular commands I already know like head or sort. That also makes it easily adaptable to other use cases. Thanks a lot! – Jens Bannmann, Jun 3, 2023
Hackish but effective: prepend 0 to all header lines and 1 to all other lines before sorting. Strip the prefix after sorting.
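    … | awk '{print (NR <= 2 ? "0 " : "1 ") $0}' | sort -k 1,1 -k… | cut -b 3-

Note that the first key is limited to the prefix field (-k 1,1); a bare -k 1 would extend the key to the end of the line and the later keys would never be consulted.

- aka the Decorate-Sort-Undecorate idiom. Not hackish at all IMO. – Ed Morton, Mar 20, 2024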
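A concrete run of the prefix trick may help, here with a single header line and a numeric sort on what was originally the second column. The function name and the sample data are invented for this sketch. After decorating, every original field number shifts by one, so the original column 2 becomes sort field 3.

```shell
# Decorate: prefix the header with "0 " and every other line with "1 ".
# Sort: first on the prefix alone (-k1,1), then numerically on the shifted column.
# Undecorate: cut off the two prefix bytes again.
sort_by_qty_keep_header() {
    awk '{ print (NR == 1 ? "0 " : "1 ") $0 }' |
        sort -k1,1 -k3,3n |
        cut -b 3-
}

printf 'name qty\npear 3\napple 1\nfig 12\n' | sort_by_qty_keep_header
```

With this input the header stays on top and the rows come out in the order apple, pear, fig.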
Here's some magic perl line noise that you can pipe your output through to sort everything but keep the first line at the top:

    perl -e 'print scalar <>, sort <>;'

- could you pls explain why this works? – törzsmókus, Sep 7, 2020
I tried the command | { head -1; sort; } solution and can confirm that it really screws things up--head reads in multiple lines from the pipe, then outputs just the first one. So the rest of the output, that head did not read, is passed to sort--NOT the rest of the output starting from line 2!
The result is that you are missing lines (and one partial line!) that were in the beginning of your command output (except you still have the first line) - a fact that is easy to confirm by adding a pipe to wc at the end of the above pipeline - but that is extraordinarily difficult to trace down if you don't know this! I spent at least 20 minutes trying to work out why I had a partial line (first 100 bytes or so cut off) in my output before solving it.
What I ended up doing, which worked beautifully and didn't require running the command twice, was:
    myfile=$(mktemp)
    whatever command you want to run > "$myfile"
    head -n 1 "$myfile"
    sed 1d "$myfile" | sort
    rm "$myfile"

If you need to put the output into a file, you can modify this to:

    myfile=$(mktemp)
    whatever command you want to run > "$myfile"
    head -n 1 "$myfile" > outputfile
    sed 1d "$myfile" | sort >> outputfile
    rm "$myfile"

- You can use ksh93's head builtin or the line utility (on systems that still have one) or gnu-sed -u q or IFS= read -r line; printf '%s\n' "$line", which read the input one byte at a time to avoid that. – Stéphane Chazelas, Jan 11, 2018
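The lost lines can be made visible by counting what is left on the pipe after the header has been consumed. This is a sketch assuming seq, wc and tr are available; the two function names are made up. The count after read is exact, while the count after head depends on how much that particular head implementation buffered, so no fixed number is claimed for it.

```shell
# Consume the header with the shell's read, then count the remaining lines.
# read takes exactly one line from the pipe, so N-1 lines are left.
count_after_read() {
    seq 1 "$1" | { IFS= read -r header; wc -l; } | tr -d ' '
}

# Consume the header with head instead. On a pipe, head usually reads a whole
# buffer and cannot give the surplus back, so fewer than N-1 lines tend to remain.
count_after_head() {
    seq 1 "$1" | { head -n 1 >/dev/null; wc -l; } | tr -d ' '
}

echo "lines left after read: $(count_after_read 10000)"
echo "lines left after head: $(count_after_head 10000)"
```

Running both side by side is a quicker check than the 20 minutes of tracing described in this answer.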
I think this is easiest.
    ps -ef | ( head -n 1 ; sort )

or this, which is possibly faster as it does not create a subshell:

    ps -ef | { head -n 1 ; sort ; }

Other cool uses

Shuffle lines after header row:

    cat file.txt | ( head -n 1 ; shuf )

Reverse lines after header row:

    cat file.txt | ( head -n 1 ; tac )

- See unix.stackexchange.com/questions/11856/…. This is not actually a good solution. – Wildcard, Nov 6, 2015
- Not working, cat file | { head -n 1 ; sort ; } > file2 only show head – Peter Krauss, Jul 6, 2018
- Does not work, only shows header line. – Dave, Jan 3, 2025
Simple and straightforward!

    <command> | head -n 1; <command> | sed 1d | sort <....>

sed nd: 'n' specifies the line number, and 'd' stands for delete.

- Just as jofel commented a year and a half ago on Sarva's answer, this starts command twice. So not really suitable for use in a pipeline. – Wildcard, Nov 6, 2015
I came here looking for a solution for the command w. This command shows details of who is logged in and what they are doing.
To show the results sorted, but with the headers kept at the top (there are 2 lines of headers), I settled on:
    w | head -n 2; w | tail -n +3 | sort

Obviously this runs the command w twice and therefore may not be suitable for all situations. However, to its advantage it is substantially easier to remember.
Note that the tail -n +3 means 'show all lines from the 3rd onwards' (see man tail for details).
Expanding on @Mikel's answer, here is a version of the body() function that adds a few features:
- It detects if there is input coming in on a pipe, and if not prints out usage information to STDERR.
- If no command is given, it uses sort as the default.
- If the first parameter is a number, it uses that number as the number of header lines (default 1).

In testing, it works on Linux bash and macOS zsh.
I made a gist at github: https://gist.github.com/alanhoyle/7ec6bd445a790b62567d8b1ff6941c66
Thus:
    body() {
        local HEADER_LINES=1
        local COMMAND="sort"
        # refuse to run when stdin is a terminal, i.e. nothing was piped in
        if [ -t 0 ]; then
            >&2 echo "ERROR: body requires piped input!"
            >&2 echo "body: prints the header from a STDIN and sends the 'body' to another command for"
            >&2 echo " additional processing. Useful for sort/grep when you want to keep headers"
            >&2 echo "USAGE: COMMAND | body [ N ] [ COMMAND_TO_PROCESS_OUTPUT ]"
            >&2 echo " if the first parameter N is a whole number, it prints that number of lines"
            >&2 echo " before proceeding [ default: skip $HEADER_LINES ]"
            >&2 echo " if the [ COMMAND_TO_PROCESS_OUTPUT ] is omitted, '$COMMAND' is used"
            return 1
        fi
        # an all-digit first argument is the number of header lines
        local re='^[0-9]+$'
        if [[ $1 =~ $re ]] ; then
            HEADER_LINES=$1
            shift
            >&2 echo "body: skipping $HEADER_LINES"
        fi
        local THIS_COMMAND="$*"
        if [ -z "$THIS_COMMAND" ] ; then
            >&2 echo "body: running default $COMMAND"
        fi
        # pass the header lines through unchanged
        for line in $(eval echo "{1..$HEADER_LINES}")
        do
            IFS= read -r header
            printf '%s\n' "$header"
        done
        if [ -z "$THIS_COMMAND" ] ; then
            ( $COMMAND )
        else
            "$@"
        fi
    }

Example:

    $ body
    ERROR: body requires piped input!
    body: prints the header from a STDIN and sends the 'body' to another command for
     additional processing. Useful for sort/grep when you want to keep headers
    USAGE: COMMAND | body [ N ] [ COMMAND_TO_PROCESS_OUTPUT ]
     if the first parameter N is a whole number, it prints that number of lines
     before proceeding [ default: skip 1 ]
     if the [ COMMAND_TO_PROCESS_OUTPUT ] is omitted, 'sort' is used
    $ echo -e "header\n30\n33\n20"
    header
    30
    33
    20
    $ echo -e "header\n30\n33\n20" | body
    body: running default sort
    header
    20
    30
    33
    $ echo -e "header\n30\n33\n20" | body grep 0
    header
    30
    20
    $ echo -e "header\n30\n33\n20" | body 2
    body: skipping 2
    body: running default sort
    header
    30
    20
    33

Basically, you need something that reads one line and only one line from the input and outputs it, and then leaves the rest of the input to sort.
There are quite a few utilities that can read one line and print it:
- head -n 1
- sed q
- awk '{print; exit}'
But most implementations of those read their input in chunks, and will generally end up reading more than one line. On seekable input, they're able to rewind upon exit to just after the first line, but they can't do that on pipes or other non-seekable input.
You need a utility that guarantees it doesn't read past the end of the first line. The options are:

- line: that used to be a standard utility but was obsoleted by POSIX on the grounds that it was redundant with the read builtin of sh. It read lines one byte at a time and output them. It always output a line, even when there was none or a non-delimited one on input.
- sed -u q: some sed implementations support a -u option for unbuffered operation, and some of those that support it also read their input one byte at a time when it is given. You also need a sed implementation that doesn't read one line in advance when the $ address is not used, which probably doesn't leave many implementations besides GNU sed. GNU sed also outputs a full line if the input only had a non-delimited line.
- IFS= read -r line: that reads up to one line and is guaranteed not to read past the end of the line. Except for zsh's read builtin, it can't cope with NUL bytes. It doesn't print the line it has read, but you can use printf for that. With zsh, read -re reads the line and echoes it; it adds a newline character if missing on input.
So your best bet in sh-like shells would be:
    sort_body() (
      if IFS= read -r line; then
        printf '%s\n' "$line" && exec sort "$@"
      else
        # no input or only a non-delimited header line
        printf %s "$line"
        # no point in running sort as there's no input left
      fi
    )

Then:

    cmd | sort_body -nk1,1
    ..
    <file sort_body -u

(not sort_body -u file, the thing to sort has to be passed on sort_body's stdin).
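The same read-based approach extends to multi-line headers, such as the two-line header of w mentioned in another answer. The following is a sketch under that assumption, not part of the answer above; body_n is a hypothetical name, and it takes the header line count followed by the command to run on the remaining input.

```shell
# body_n N CMD [ARGS...]: print the first N lines unchanged, then run CMD on the rest.
# The function body is a subshell, so n and line do not leak into the caller.
body_n() (
    n=$1
    shift
    while [ "$n" -gt 0 ]; do
        if IFS= read -r line; then
            printf '%s\n' "$line"
        else
            # input ended early: emit any partial last line and stop
            printf %s "$line"
            exit 0
        fi
        n=$((n - 1))
    done
    "$@"
)

printf 'title\nUSER TTY\nzoe pts/1\nadam pts/0\n' | body_n 2 sort
```

Because every header line is taken with read, nothing beyond line N is consumed before sort starts.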
If those are CSVs or TSVs (or other formats, see the manual), that sounds like a job for mlr (miller).

Like with a file looking like:

    $ cat /usr/share/distro-info/debian.csv
    version,codename,series,created,release,eol,eol-lts,eol-elts
    1.1,Buzz,buzz,1993-08-16,1996-06-17,1997-06-05
    1.2,Rex,rex,1996-06-17,1996-12-12,1998-06-05
    1.3,Bo,bo,1996-12-12,1997-06-05,1999-03-09
    2.0,Hamm,hamm,1997-06-05,1998-07-24,2000-03-09
    2.1,Slink,slink,1998-07-24,1999-03-09,2000-10-30
    2.2,Potato,potato,1999-03-09,2000-08-15,2003-07-30
    [...]
    $ mlr --ragged --csv cut -f codename,created then sort -f codename /usr/share/distro-info/debian.csv
    codename,created
    Bo,1996-12-12
    Bookworm,2021-08-14
    Bullseye,2019-07-06
    Buster,2017-06-17
    Buzz,1993-08-16
    Etch,2005-06-06
    [...]

That is, the header is not only preserved, but the field names in there can also be used in the cut or sort specifications.
Using Raku (formerly known as Perl_6)
    ~$ raku -e '.put for "\x0061".."\x07A";' | raku -e 'put get; .put for lines.sort.reverse.head(10);'
    #OR
    ~$ raku -e '.put for "\x0061".."\x07A";' | raku -e 'put lines[0]; .put for lines[1..*].sort.reverse.head(10);'

Sample Input: English alphabet, one letter per line

Sample Output (truncated to first 10 lines via .head(10)):

    a
    z
    y
    x
    w
    v
    u
    t
    s
    r
    q

Answering this to complement Perl answers already posted. The put get call (1) 'gets' a single line and out-'puts' it, then (2) advances the read cursor so the first line isn't read again (e.g. by lines). If you need to read a 2-line header (for example), use (put get) xx 2.
When sorting a file, sometimes you want to filter a little first--an example is removing blank lines. That's easy with Raku, simply interpose a call to .map({$_ if .chars}) after the call to lines (and before the call to sort).
A nice advantage of Raku is built-in, high-level support for Unicode. A Cyrillic alphabet equivalent of the Raku code at top is as follows:
    ~$ raku -e '.put for "\x0430".."\x044F";' | raku -e 'put get; .put for lines.sort.reverse.head(10);'

OR, taking input off the command line:

    ~$ raku -e '.put for "\x0430".."\x044F";' > Cyrillic.txt
    ~$ raku -e 'put lines[0]; .put for lines[1..*].sort.reverse.head(10);' Cyrillic.txt

Sample Output (either Cyrillic example above):

    а
    я
    ю
    э
    ь
    ы
    ъ
    щ
    ш
    ч
    ц

See URLs below for further discussion on the Raku/Perl6 mailing list regarding how to translate Perl(5) file-input idioms into Raku.
https://www.nntp.perl.org/group/perl.perl6.users/2018/11/msg6295.html
https://www.nntp.perl.org/group/perl.perl6.users/2019/07/msg6825.html
    command | head -1; command | tail -n +2 | sort

- This starts command two times. Therefore it is limited to some specific commands. However, for the requested ps command in the example, it would work. – jofel, May 20, 2014
Try doing:
    wc -l file_name | tail -n $(awk '{print $1-1}') file_name | sort

- i do not get it – Pierre.Vriens, Jan 11, 2018
- [...] { head -1; sort; } to work. It always deletes a bunch of the text after the first line. Does anyone know why this happens?
- head is reading more than one line into a buffer and throwing most of it away. My sed idea had the same problem.
- [...] lseekable input so it won't work when reading from a pipe. It will work if you redirect to a file >outfile and then run { head -n 1; sort; } <outfile