The following shell script takes an optional -d option to set the delimiter (tab by default) and a mandatory -c option with a column specification.
The column specification is similar to that of cut but also allows for rearranging and duplicating the output columns, as well as specifying ranges backwards. Open ranges are also supported.
The file to parse is given on the command line as the last operand, or passed on standard input.
```sh
#!/bin/sh

delim='\t'      # tab is default delimiter

# parse command line option
while getopts 'd:c:' opt; do
    case $opt in
        d)
            delim=$OPTARG
            ;;
        c)
            cols=$OPTARG
            ;;
        *)
            echo 'Error in command line parsing' >&2
            exit 1
    esac
done
shift "$(( OPTIND - 1 ))"

if [ -z "$cols" ]; then
    echo 'Missing column specification (the -c option)' >&2
    exit 1
fi

# ${1:--} will expand to the filename or to "-" if $1 is empty or unset
cat "${1:--}" |
awk -F "$delim" -v cols="$cols" '
    BEGIN {
        # output delim will be same as input delim
        OFS = FS

        # get array of column specs
        ncolspec = split(cols, colspec, ",")
    }

    {
        # get fields of current line
        # (need this as we are rewriting $0 below)
        split($0, fields, FS)
        nf = NF     # save NF in case we have an open-ended range

        $0 = "";    # empty $0

        # go through given column specification and
        # create a record from it
        for (i = 1; i <= ncolspec; ++i)
            if (split(colspec[i], r, "-") == 1)
                # single column spec
                $(NF+1) = fields[colspec[i]]
            else {
                # column range spec

                if (r[1] == "") r[1] = 1    # open start range
                if (r[2] == "") r[2] = nf   # open end range

                if (r[1] < r[2])
                    # forward range
                    for (j = r[1]; j <= r[2]; ++j)
                        $(NF + 1) = fields[j]
                else
                    # backward range
                    for (j = r[1]; j >= r[2]; --j)
                        $(NF + 1) = fields[j]
            }

        print
    }'
```
There's a slight inefficiency in this, as the code re-parses the column specification for each new line. If support for open-ended ranges is not needed, or if all lines can be assumed to have exactly the same number of columns, the specification could instead be parsed once, in the BEGIN block (or in a separate NR==1 block), to create an array of the fields that should be output.
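A minimal sketch of that single-pass idea, assuming that only closed ranges are used (no `3-` or `-3`), so the expansion does not depend on the number of fields in any line. The function name `reorder_fixed` and the array name `outcol` are made up for this illustration; they are not part of the script above.

```sh
#!/bin/sh

# Sketch only: expand the column specification once, in BEGIN.
# Assumption: every range is closed (both ends given).
# Usage: reorder_fixed DELIM COLSPEC <input
reorder_fixed () {
    awk -F "$1" -v cols="$2" '
        BEGIN {
            OFS = FS

            # expand the specification into a flat list of column numbers
            ncolspec = split(cols, colspec, ",")
            nout = 0
            for (i = 1; i <= ncolspec; ++i) {
                if (split(colspec[i], r, "-") == 1) {
                    # single column
                    outcol[++nout] = colspec[i] + 0
                } else {
                    # closed range, forward or backward
                    lo = r[1] + 0
                    hi = r[2] + 0
                    step = (lo <= hi) ? 1 : -1
                    for (j = lo; j != hi + step; j += step)
                        outcol[++nout] = j
                }
            }
        }

        {
            # per line, only the pre-computed list is walked
            split($0, fields, FS)
            $0 = ""
            for (i = 1; i <= nout; ++i)
                $(NF + 1) = fields[outcol[i]]
            print
        }'
}
```

Called as `reorder_fixed : 3-1,1,1-3 <file`, this should give the same result as the full script for that specification. Open-ended ranges would still need the per-line handling, or an NR==1 block that reads NF from the first line.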
Missing: A sanity check for the column specification. A malformed specification string is not detected and may well produce unexpected output.
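One possible shape for such a check, as a sketch rather than a complete validator. It assumes that a valid specification is a comma-separated list of items of the form `N`, `N-M`, `N-` or `-M`, with column numbers starting at 1, and it rejects a bare `-` (which the script itself would treat as "all columns"). The function name `valid_colspec` is hypothetical.

```sh
#!/bin/sh

# Sketch only: return 0 if the column specification looks well-formed.
# Assumption: items are N, N-M, N- or -M, with N and M >= 1.
valid_colspec () {
    # reject anything that is not a digit, a comma or a dash
    # (this also rules out embedded newlines before grep sees the string)
    case $1 in
        *[!0-9,-]*) return 1 ;;
    esac

    # one item: "-M", or "N" optionally followed by "-" or "-M"
    item='(-[1-9][0-9]*|[1-9][0-9]*(-([1-9][0-9]*)?)?)'

    printf '%s\n' "$1" | grep -Eq "^${item}(,${item})*\$"
}
```

In the script, this could be called right after the test for an empty `$cols`, for example `valid_colspec "$cols" || { echo 'Bad column specification' >&2; exit 1; }`.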
Testing:
```
$ cat file
1:2:3
a:b:c
@:(:)
```

```
$ sh script.sh -d : -c 1,3 <file
1:3
a:c
@:)
```

```
$ sh script.sh -d : -c 3,1 <file
3:1
c:a
):@
```

```
$ sh script.sh -d : -c 3-1,1,1-3 <file
3:2:1:1:1:2:3
c:b:a:a:a:b:c
):(:@:@:@:(:)
```

```
$ sh script.sh -d : -c 1-,3 <file
1:2:3:3
a:b:c:c
@:(:):)
```
cut? How would you want the command line to look? Whatever it looks like, it's going to be a wrapper around cut, not awk.