483

This line worked until I had whitespace in the second field:

svn status | grep '\!' | gawk '{print $2;}' > removedProjs 

Is there a way to have awk print everything in $2 or greater? ($3, $4.. until we don't have any more columns?)

I'm doing this in a Windows environment with Cygwin.

7
  • 20
    As an aside, the grep | awk is an antipattern -- you want awk '/!/ { print $2 }' Commented Sep 18, 2015 at 9:34
  • 6
    Unix "cut" is easier... svn status | grep '\!' | cut -d' ' -f2- > removedProjs Commented Sep 2, 2016 at 3:23
  • Possible duplicate of print rest of the fields in awk Commented Mar 15, 2017 at 8:25
  • @tripleee: I'm so happy that you mentioned this - I'm frustrated at seeing it everywhere! Commented Nov 5, 2018 at 16:07
  • The duplicate stackoverflow.com/questions/2626274/… has some more discussions if you want to geek out. Commented Jul 5, 2024 at 7:02

31 Answers

724

Print all columns:

awk '{print $0}' somefile 

Print all but the first column:

awk '{$1=""; print $0}' somefile 

Print all but the first two columns:

awk '{$1=$2=""; print $0}' somefile 
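As the comments below note, blanking a field leaves its output separator behind. A quick sketch of the gotcha and one workaround (assuming the default space separator):

```shell
# Blanking $1 rebuilds $0 with OFS, so a leading space remains:
echo 'alpha beta gamma' | awk '{$1=""; print $0}'
# -> " beta gamma"

# Printing from the second character drops it:
echo 'alpha beta gamma' | awk '{$1=""; print substr($0,2)}'
# -> "beta gamma"
```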

21 Comments

gotcha: leaves a leading space dangling about :(
@raphinesse you can fix that with awk '{$1=""; print substr($0,2)}' input_filename > output_filename
This doesn't work with non-whitespace delimiters, replaces them with a space.
For non-whitespace delimiters, you can specify the Output Field Separator (OFS), e.g. to a comma: awk -F, -vOFS=, '{$1=""; print $0}' You will end up with an initial delimiter ($1 is still included, just as an empty string). You can strip that with sed though: awk -F, -vOFS=, '{$1=""; print $0}' | sed 's/^,//'
AWK is like the overly literal genie who grants three wishes
129

You could use a for-loop to loop through printing fields $2 through $NF (built-in variable that represents the number of fields on the line).

Edit: Since "print" appends a newline, you'll want to buffer the results:

awk '{out = ""; for (i = 2; i <= NF; i++) {out = out " " $i}; print out}' 

Alternatively, use printf:

awk '{for (i = 2; i <= NF; i++) {printf "%s ", $i}; printf "\n"}' 
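For example (a quick sketch): the buffered loop keeps a leading space, while the printf loop leaves a trailing one:

```shell
# Buffered loop: output starts with a space.
echo 'a b c' | awk '{out = ""; for (i = 2; i <= NF; i++) {out = out " " $i}; print out}'
# -> " b c"

# printf loop: output ends with a space.
echo 'a b c' | awk '{for (i = 2; i <= NF; i++) {printf "%s ", $i}; printf "\n"}'
# -> "b c "
```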

12 Comments

So I tried this, but think I'm missing something.. here is what I did svn status | grep '\!' | gawk '{for (i=1; i<=$NF; i++)print $i " ";}' > removedProjs
Since print appends a newline, you'll want to buffer the results. See my edit.
I like this answer better because it shows how to loop through fields.
If you want print to use a space, change the output record separator: awk '{ORS=" "; for(i=2;i<NF;i++) print $i}' somefile
There will always be some spaces too much. This works better: '{for(i=11;i<=NF-1;i++){printf "%s ", $i}; print $NF;}' No leading or trailing spaces.
124

There's a duplicate question with a simpler answer using cut:

 svn status | grep '\!' | cut -d' ' -f2- 

-d specifies the delimiter (a space), -f specifies the list of fields (everything from the 2nd onward)
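One caveat worth sketching (echoed in the comments below): unlike awk, cut treats every occurrence of the delimiter as significant, so runs of spaces produce empty fields:

```shell
# Two spaces after "a" create an empty field 2 for cut:
printf 'a  b c\n' | cut -d' ' -f2-
# -> " b c"

# awk collapses the run of spaces, so field 2 is "b":
printf 'a  b c\n' | awk '{print $2}'
# -> "b"
```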

6 Comments

You can also use "-b" to specify the position (from the Nth character onwards).
As a note, although this performs the same task as the awk version, there are line buffering issues with cut, which awk doesn't have: stackoverflow.com/questions/14360640/…
Nice and simple, but comes with a caveat: awk treats multiple adjacent space chars. as a single separator, while cut does not; also - although this is not a problem in the case at hand - cut only accepts a single, literal char. as the delimiter, whereas awk allows a regex.
Based on this: stackoverflow.com/a/39217130/8852408, is probable that this solution isn't very efficient.
@mklement0 I found this SO answer should help. That is, echo "fo o ba r" | tr -s ' ' | cut -d' ' -f3-
32
awk '{out=$2; for(i=3;i<=NF;i++){out=out" "$i}; print out}' 

My answer is based on VeeArr's, but I noticed it started with a white space before it would print the second column (and the rest). As I only have 1 reputation point, I can't comment on it, so here it goes as a new answer:

start with "out" as the second column and then add all the other columns (if they exist). This goes well as long as there is a second column.

2 Comments

Excellent, also you removed the $ in front of the out variable which is important too.
I like this answer better than VeeArr's because if the delimiter is something other than a space (for example - a multi-char string), its easy to get this script to put back the delimiter and in VeeArr's answer, if I try to do that, the delimiter is also added at the end, annoyingly.
22

Most solutions with awk leave a space. The options here avoid that problem.

Option 1

A simple cut solution (works only with single delimiters):

command | cut -d' ' -f3- 

Option 2

Forcing an awk re-calculation sometimes removes the leading space (OFS) left by blanking the first fields (works with some versions of awk):

command | awk '{ $1=$2="";$0=$0;} NF=NF' 
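A quick check of the re-calc trick (a sketch; as noted, POSIX leaves assignment-to-NF semantics loose, so this depends on the awk version, but it works in GNU awk):

```shell
# Blank the first two fields, force a re-split ($0=$0), then a rebuild (NF=NF):
echo 'x y a b' | awk '{ $1=$2="";$0=$0;} NF=NF'
# -> "a b" (no leading spaces)
```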

Option 3

Printing each field formatted with printf will give more control:

$ in=' 1 2 3 4 5 6 7 8 '
$ echo "$in"|awk -v n=2 '{ for(i=n+1;i<=NF;i++) printf("%s%s",$i,i==NF?RS:OFS);}'
3 4 5 6 7 8

However, all previous answers change all repeated FS between fields to OFS. Let's build a couple of options that do not do that.

Option 4 (recommended)

A loop with sub to remove fields and delimiters at the front.

And using the value of FS instead of space (which could be changed).

This is more portable, and doesn't trigger a change of FS to OFS: NOTE: The ^[FS]* is to accept an input with leading spaces.

$ in=' 1 2 3 4 5 6 7 8 '
$ echo "$in" | awk '{ n=2; a="^["FS"]*[^"FS"]+["FS"]+"; for(i=1;i<=n;i++) sub( a , "" , $0 ) } 1 '
3 4 5 6 7 8

Option 5

It is quite possible to build a solution that does not add extra (leading or trailing) whitespace, and preserves existing whitespace(s) using the function gensub from GNU awk, as this:

$ echo ' 1 2 3 4 5 6 7 8 ' | awk -v n=2 'BEGIN{ a="^["FS"]*"; b="([^"FS"]+["FS"]+)"; c="{"n"}"; } { print(gensub(a""b""c,"",1)); }'
3 4 5 6 7 8

It also may be used to swap a group of fields given a count n:

$ echo ' 1 2 3 4 5 6 7 8 ' | awk -v n=2 'BEGIN{ a="^["FS"]*"; b="([^"FS"]+["FS"]+)"; c="{"n"}"; } { d=gensub(a""b""c,"",1); e=gensub("^(.*)"d,"\\1",1,$0); print("|"d"|","!"e"!"); }'
|3 4 5 6 7 8 | ! 1 2 !

Of course, in such case, the OFS is used to separate both parts of the line, and the trailing whitespace of the fields is still printed.

NOTE: [FS]* is used to allow leading spaces in the input line.

1 Comment

While options 4 and 5 are on the right track, they only work if FS is the default value of " ", since the regexps are designed to skip leading occurrences of the FS. That would be a bug if FS were any other single character, e.g. ,, and you can't negate a multi-char FS in a bracket expression (e.g. trying to do "^["FS"]" when FS="foo"), so using FS in the construction of the regexp isn't useful and is misleading.
16

I personally tried all the answers mentioned above, but most of them were a bit complex or just not right. The easiest way to do it from my point of view is:

awk -F" " '{ for (i=4; i<=NF; i++) print $i }' 
  1. Where -F" " defines the delimiter for awk to use. In my case it is whitespace, which is also the default delimiter for awk. This means that -F" " can be omitted.

  2. Where NF defines the total number of fields/columns. Therefore the loop will begin from the 4th field up to the last field/column.

  3. Where $N retrieves the value of the Nth field. Therefore print $i will print the current field/column based on the loop count.
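As the comments point out, print emits one field per line; a printf variant of the same loop (a sketch) keeps the fields on a single line:

```shell
# print: one field per output line
echo 'f1 f2 f3 f4 f5' | awk '{ for (i=4; i<=NF; i++) print $i }'
# -> "f4" and "f5" on separate lines

# printf: fields on one line, with a trailing space
echo 'f1 f2 f3 f4 f5' | awk '{ for (i=4; i<=NF; i++) printf "%s ", $i; print "" }'
# -> "f4 f5 "
```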

3 Comments

Problem, that prints each field on a different line.
nothing stops you appending this at the end :-) ` | tr '\n' ' ' `
A bit late but awk '{ for (i = 5; i <= NF; i++) { printf "%s ", $i } }'
11
awk '{ for(i=3; i<=NF; ++i) printf $i""FS; print "" }' 

lauhub proposed this correct, simple and fast solution here

Comments

9

This was irritating me so much, I sat down and wrote a cut-like field specification parser, tested with GNU Awk 3.1.7.

First, create a new Awk library script called pfcut, with e.g.

sudo nano /usr/share/awk/pfcut 

Then, paste in the script below, and save. After that, this is what usage looks like:

$ echo "t1 t2 t3 t4 t5 t6 t7" | awk -f pfcut --source '/^/ { pfcut("-4"); }'
t1 t2 t3 t4
$ echo "t1 t2 t3 t4 t5 t6 t7" | awk -f pfcut --source '/^/ { pfcut("2-"); }'
t2 t3 t4 t5 t6 t7
$ echo "t1 t2 t3 t4 t5 t6 t7" | awk -f pfcut --source '/^/ { pfcut("-2,4,6-"); }'
t1 t2 t4 t6 t7

To avoid typing all that, I guess the best one can do (see otherwise Automatically load a user function at startup with awk? - Unix & Linux Stack Exchange) is add an alias to ~/.bashrc; e.g. with:

$ echo "alias awk-pfcut='awk -f pfcut --source'" >> ~/.bashrc
$ source ~/.bashrc # refresh bash aliases

... then you can just call:

$ echo "t1 t2 t3 t4 t5 t6 t7" | awk-pfcut '/^/ { pfcut("-2,4,6-"); }'
t1 t2 t4 t6 t7

Here is the source of the pfcut script:

# pfcut - print fields like cut
#
# sdaau, GNU GPL
# Nov, 2013

function spfcut(formatstring)
{
  # parse format string
  numsplitscomma = split(formatstring, fsa, ",");
  numspecparts = 0;
  split("", parts); # clear/initialize array (for e.g. `tail` piping into `awk`)
  for(i=1;i<=numsplitscomma;i++) {
    commapart=fsa[i];
    numsplitsminus = split(fsa[i], cpa, "-");
    # assume here a range is always just two parts: "a-b"
    # also assume user has already sorted the ranges
    #print numsplitsminus, cpa[1], cpa[2]; # debug
    if(numsplitsminus==2) {
      if ((cpa[1]) == "") cpa[1] = 1;
      if ((cpa[2]) == "") cpa[2] = NF;
      for(j=cpa[1];j<=cpa[2];j++) {
        parts[numspecparts++] = j;
      }
    } else parts[numspecparts++] = commapart;
  }
  n=asort(parts); outs="";
  for(i=1;i<=n;i++) {
    outs = outs sprintf("%s%s", $parts[i], (i==n)?"":OFS);
    #print(i, parts[i]); # debug
  }
  return outs;
}

function pfcut(formatstring) { print spfcut(formatstring); }

2 Comments

Seems like you want to use cut, not awk
@roblogic : unix cut is fine for small tasks like a few megs. Low hundreds of MBs is probably the crossover point where cut becomes too slow for the volumes, and where awk truly shines.
8

Would this work?

awk '{print substr($0,length($1)+1);}' < file 

It leaves some whitespace in front though.

Comments

5

Printing out columns starting from #2 (the output will have no leading space):

ls -l | awk '{sub(/[^ ]+ /, ""); print $0}' 

1 Comment

Nice, though you should add + after the space, since the fields may be separated by more than 1 space (awk treats multiple adjacent spaces as a single separator). Also, awk will ignore leading spaces, so you should start the regex with ^[ ]*. With space as the separator you could even generalize the solution; e.g., the following returns everything from the 3rd field: awk '{sub(/^[ ]*([^ ]+ +){2}/, ""); print $0}' It gets trickier with arbitrary field separators, though.
4
echo "1 2 3 4 5 6" | awk '{ $NF = ""; print $0}' 

this one uses awk to print all except the last field

1 Comment

@Birei : you can shorten that to barely just : awk -- --NF
3

This is what I preferred from all the recommendations:

Printing from the 6th to last column.

ls -lthr | awk '{out=$6; for(i=7;i<=NF;i++){out=out" "$i}; print out}' 

or

ls -lthr | awk '{ORS=" "; for(i=6;i<=NF;i++) print $i;print "\n"}' 

Comments

3

All of the other answers given here and in linked questions fail in various ways given various possible FS values. Some leave leading and/or trailing white space, some convert every FS to the OFS, some rely on semantics that only apply when FS is the default value, some rely on negating FS in a bracket expression which will fail given a multi-char FS, etc.

To do this robustly for any FS, use GNU awk for the 4th arg to split():

$ cat tst.awk
{
    split($0,flds,FS,seps)
    for ( i=n; i<=NF; i++ ) {
        printf "%s%s", flds[i], seps[i]
    }
    print ""
}

$ printf 'a b c d\n' | awk -v n=3 -f tst.awk
c d
$ printf ' a b c d\n' | awk -v n=3 -f tst.awk
c d
$ printf ' a b c d\n' | awk -v n=3 -F'[ ]' -f tst.awk
b c d
$ printf ' a b c d\n' | awk -v n=3 -F'[ ]+' -f tst.awk
b c d
$ printf 'a###b###c###d\n' | awk -v n=3 -F'###' -f tst.awk
c###d
$ printf '###a###b###c###d\n' | awk -v n=3 -F'###' -f tst.awk
b###c###d

Note that I'm using split() above because its 3rd arg is a field separator, not just a regexp like the 2nd arg to match(). The difference is that field separators have additional semantics beyond regexps, such as skipping leading and/or trailing blanks when the separator is a single blank char - if you wanted to use a while(match()) loop or any form of *sub() to emulate the above then you'd need to write code to implement those semantics, whereas split() already implements them for you.

Comments

2

If you need specific columns printed with an arbitrary delimiter:

awk '{print $3 " " $4}' 

col#3 col#4

awk '{print $3 "anything" $4}' 

col#3anythingcol#4

So if you have whitespace in a column it will be two columns, but you can connect it with any delimiter or without it.

Comments

2

Perl solution:

perl -lane 'splice @F,0,1; print join " ",@F' file 

These command-line options are used:

  • -n loop around every line of the input file, do not automatically print every line

  • -l removes newlines before processing, and adds them back in afterwards

  • -a autosplit mode – split input lines into the @F array. Defaults to splitting on whitespace

  • -e execute the perl code

splice @F,0,1 cleanly removes column 0 from the @F array

join " ",@F joins the elements of the @F array, using a space in-between each element


Python solution:

python -c "import sys;[sys.stdout.write(' '.join(line.split()[1:]) + '\n') for line in sys.stdin]" < file

Comments

2

I want to extend the proposed answers to the situation where fields are delimited by a variable number of whitespace characters, which is, I suppose, why the OP is not simply using cut.

I know the OP asked about awk, but a sed approach would work here (example with printing columns from the 5th to the last):

  • pure sed approach

     sed -r 's/^\s*(\S+\s+){4}//' somefile 

    Explanation:

    • s/// is the standard command to perform substitution
    • ^\s* matches any consecutive whitespace at the beginning of the line
    • \S+\s+ means a column of data (non-whitespace chars followed by whitespace chars)
    • (){4} means the pattern is repeated 4 times.
  • sed and cut

     sed -r 's/^\s+//; s/\s+/\t/g' somefile | cut -f5- 

    by just replacing consecutive whitespaces by a single tab;

  • tr and cut: tr can also be used to squeeze consecutive characters with the -s option.

     tr -s '[:blank:]' < somefile | cut -d' ' -f5- 
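A quick check of the pure sed form above (note that -r, \s, and \S are GNU extensions):

```shell
# Strip leading blanks plus the first four columns:
echo '  a b  c d e f' | sed -r 's/^\s*(\S+\s+){4}//'
# -> "e f"
```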

2 Comments

I agree sed works best for this problem. Note: The cut examples you give won't preserve consecutive spaces in the part that you're trying to extract. Consider this input: a b c d The rest. Your answer would be better if you kept only the pure sed approach. Also use -E instead of -r for portability. Also, since \s is a GNU extension, replace \s with [ \t] and replace \S with [^ \t].
sed -r is not properly portable. If you don't have -r, maybe try -E (though that's non-standard, too).
1

If you don't want to reformat the part of the line that you don't chop off, the best solution I can think of is written in my answer in:

How to print all the columns after a particular number using awk?

It chops what is before the given field number N and prints all the rest of the line, including field number N, maintaining the original spacing (it does not reformat). It doesn't matter if the string of the field also appears somewhere else in the line.

Define a function:

fromField () {
    awk -v m="\x01" -v N="$1" '{$N=m$N; print substr($0,index($0,m)+1)}'
}

And use it like this:

$ echo " bat bi iru lau bost " | fromField 3
iru lau bost 
$ echo " bat bi iru lau bost " | fromField 2
bi iru lau bost 

Output maintains everything, including trailing spaces

In your particular case:

svn status | grep '\!' | fromField 2 > removedProjs 

If your file/stream does not contain new-line characters in the middle of the lines (you could be using a different Record Separator), you can use:

awk -v m="\x0a" -v N="3" '{$N=m$N ;print substr($0, index($0,m)+1)}' 

The first case will fail only in files/streams that contain the rare hexadecimal char number 1

Comments

1

This awk function returns substring of $0 that includes fields from begin to end:

function fields(begin, end,    b, e, p, i) {
    b = 0; e = 0; p = 0;
    for (i = 1; i <= NF; ++i) {
        if (begin == i) { b = p; }
        p += length($i);
        e = p;
        if (end == i) { break; }
        p += length(FS);
    }
    return substr($0, b + 1, e - b);
}

To get everything starting from field 3:

tail = fields(3); 

To get section of $0 that covers fields 3 to 5:

middle = fields(3, 5); 

The b, e, p, i nonsense in the function parameter list is just the awk way of declaring local variables.

2 Comments

This is a nice general-purpose function, but it breaks if you have multiple separators between fields, since awk collapses multiple separators into one, but you're only adding one FS when accounting for the position.
@cincodenada : the "multiple collapse" part is only active by default when FS is the default " " ( 0x20 \40 ), which is a special case. The collapsing would only happen if you made it "[ ]+", plus the fact that any custom FS turns off pre-trimming of the leading/trailing edges, so an input of " ABC " (2 spaces, 3 ASCII letters, and 1 more space) against an FS = "[ ]+" results in NF == 3 : only $2 is populated with ABC while $1 and $3 are empty fields right off the bat. With FS = "[ ]" that input becomes NF==4 / $3 == "ABC"
0

Awk examples looks complex here, here is simple Bash shell syntax:

command | while read -a cols; do echo ${cols[@]:1}; done 

Where 1 is your nth column counting from 0.


Example

Given this content of file (in.txt):

c1
c1 c2
c1 c2 c3
c1 c2 c3 c4
c1 c2 c3 c4 c5

here is the output:

$ while read -a cols; do echo ${cols[@]:1}; done < in.txt

c2
c2 c3
c2 c3 c4
c2 c3 c4 c5

2 Comments

I also came to conclusion that that's not a good use case for awk. Your solution might come in handy in some cases, but it doesn't preserve consecutive spaces. A pure sed solution looks optimal to me. Additionally it doesn't need bash.
0

This works if you are using Bash: use as many dummy variables ('x') as there are leading fields you wish to discard, and it ignores multiple spaces if they are not escaped.

while read x b; do echo "$b"; done < filename 
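For instance, read splits on $IFS and the last variable absorbs the remainder of the line, so a single dummy variable discards exactly one leading field (a sketch):

```shell
# "one" lands in the dummy variable x, the rest of the line in b:
echo 'one two three' | while read x b; do echo "$b"; done
# -> "two three"
```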

1 Comment

0

Perl:

@m=`ls -ltr dir | grep ^d | awk '{print \$6,\$7,\$8,\$9}'`;
foreach $i (@m) { print "$i\n"; }

1 Comment

This doesn't answer the question, which generalises the requirement to printing from the Nth column to the end.
0

UPDATE :

if you want to use no function calls at all while preserving the spaces and tabs between the remaining fields, then do:

echo " 1 2 33 4444 555555 \t6666666 " | {m,g}awk ++NF FS='^[ \t]*[^ \t]*[ \t]+|[ \t]+$' OFS= 

which prints:

2 33 4444 555555 6666666 

===================

You can make it a lot more straightforward:

 svn status | [m/g]awk '/!/*sub("^[^ \t]*[ \t]+",_)'

 svn status | [n]awk '(/!/)*sub("^[^ \t]*[ \t]+",_)'

This automatically takes care of the grep earlier in the pipe, as well as trimming the extra FS after blanking out $1, with the added bonus of leaving the rest of the original input untouched instead of having tabs overwritten with spaces (unless that's the desired effect).

If you're very certain $1 does not contain special characters that need regex escaping, then it's even easier :

mawk '/!/*sub($!_"[ \t]+",_)'

gawk -c/P/e '/!/*sub($!_"""[ \t]+",_)'

Or if you prefer customizing FS+OFS to handle it all :

mawk 'NF*=/!/' FS='^[^ \t]*[ \t]+' OFS='' # this version uses OFS 

Comments

0

This should be a reasonably comprehensive awk-field-sub-string-extraction function that

  • returns substring of $0 based on input ranges, inclusive
  • clamp in out of range values,
  • handle variable length field SEPs
  • has speedup treatments for:
  • completely no inputs, returning $0 directly
  • input values resulting in guaranteed empty string ("")
  • FROM-field == 1
  • FS = "" that has split $0 out by individual chars (so the FROM <(_)> and TO <(__)> fields behave like cut -c rather than cut -f)
  • original $0 restored, w/o overwriting FS seps with OFS


 {m,g}awk '{
    print "\n|---BEFORE-------------------------\n" \
          ($0) "\n|----------------------------\n\n [" \
          fld2(2, 5) "]\n [" fld2(3) "]\n [" fld2(4, 2) \
          "]<----------------------------------------------should be " \
          "empty\n [" fld2(3, 11) "]<------------------------should be " \
          "capped by NF\n [" fld2() "]\n [" fld2((OFS=FS="")*($0=$0)+11, \
          23) "]<-------------------FS=\"\", split by chars" \
          "\n\n|---AFTER-------------------------\n" ($0) \
          "\n|----------------------------"
 }
 function fld2(_,__,___,____,_____)
 {
     if (+__==(_=-_<+_ ?+_:_<_) || (___=____="")==__ || !NF) {
         return $_
     } else if (NF<_ || (__=NF<+__?NF:+__)<(_=+_?_:!_)) {
         return ___
     } else if (___==FS || _==!___) {
         return ___<FS \
             ? substr("",$!_=$!_ substr("",__=$!(NF=__)))__ \
             : substr($(_<_),_,__)
     }
     _____=$+(____=___="\37\36\35\32\31\30\27\26\25"\
                       "\24\23\21\20\17\16\6\5\4\3\2\1")
     NF=__
     if ($(!_)~("["(___)"]")) {
         gsub("..","\\&&",___) + gsub(".",___,____)
         ___=____
     }
     __=(_) substr("",_+=_^=_<_)
     while(___!="") {
         if ($(!_)!~(____=substr(___,--_,++_))) {
             ___=____
             break
         }
         ___=substr(___,_+_^(!_))
     }
     return \
         substr("",($__=___ $__)==(__=substr($!_, _+index($!_,___))),_*($!_=_____))(__)
 }'

those <TAB> are actual \t \011 but relabeled for display clarity

|---BEFORE-------------------------
 1 2 33 4444 555555 <TAB>6666666 
|----------------------------

 [2 33 4444 555555]
 [33]
 []<---------------------------------------------- should be empty
 [33 4444 555555 6666666]<------------------------ should be capped by NF
 [ 1 2 33 4444 555555 <TAB>6666666 ]
 [ 2 33 4444 555555 <TAB>66]<------------------- FS="", split by chars

|---AFTER-------------------------
 1 2 33 4444 555555 <TAB>6666666 
|----------------------------

Comments

-1

Assuming that you have comma as a field separator

Remove first 2 fields by looping over the first NF-2 fields for (i=1; i<=NF-2; i++)

echo "a,b,c,d\ne,f,g,h,i" | awk -F, '{for (i=1; i<=NF-2; i++) {$i=$(i+2);printf "%s,", $i}print ""}' | sed 's/,$//' 

Input

a,b,c,d
e,f,g,h,i

becomes

c,d
g,h,i

In general, to skip the first $N fields

N=3
echo "a,b,c,d\ne,f,g,h,i" | awk -F, -vn=$N '{for (i=1; i<=NF-n; i++) {$i=$(i+n);printf "%s,", $i}print ""}' | sed 's/,$//'

Comments

-1

This is a new method I made, a function created entirely with Perl Compatible Regular Expressions v. 2:

reinstall-readonlyx () {
    sudo apt-get update -y
    sudo apt-get purge -y libconst-fast-perl libreadonly-tiny-perl libreadonly-perl libreadonlyx-perl &&
        sudo apt-get install -y libreadonlyx-perl
    return $?
}
reinstall-readonlyx

pcre2 () {
    local readonlyx=''
    local p='p'
    local s='--'
    readonlyx='use ReadonlyX; Readonly::Array '
    if [[ $1 = -i && $2 = -n ]]; then set -- "$2" "$1" "${@:3}"; fi
    if [[ $1 = -n ]]; then
        p='n'
        set -- "${@:2}"
    fi
    if [[ $1 = -i ]]; then
        s="$1 $s"
        set -- "${@:2}"
    fi
    perl -0777s${p}E "${readonlyx}"'my @ARGS; BEGIN{@ARGS = @ARGV} END{foreach(@ARGS){if($_ ne "-" && ! -f $_){exit 2}}} '"$1" "$s" "${@:2}"
    return $?
}

In the created function, @ARGS is an array reserved for checking that the files passed as arguments exist. Making the array read-only with the ReadonlyX library prevents the end user from altering it during the function call, which would otherwise break the feature of returning an error code for non-existent files.

Comment out the line like this: #readonlyx='use ReadonlyX; Readonly::Array ' if you don't want to use ReadonlyX anyway.

pcre2-cut () {
    local function_name="${FUNCNAME[0]}"
    local i=0
    local int='^[+-]*[0-9]+$'
    local start=1
    local num=-1
    local delims='\h'
    local max=false
    local count=false
    local expr='[^$d\v]+(?:[$d]+|$)(?=[^$d\v]|$)'
    local options='say'
    local param=''
    local newline=''
    for (( i=1; i<=$#; i++ )); do
        param="${!i}"
        case $param in
        -- | -s | -n | -d | -m | -c)
            set -- "${@:1:i-1}" "${@:i+1}"
            case $param in
            -s | -n)
                if [[ ! ${!i} =~ $int ]]; then
                    echo "$function_name: Invalid argument for $param." >&2
                    return 1
                fi
                ;;
            esac
            case $param in
            --) break;;
            -s) start="${!i}";;
            -n) num="${!i}";;
            -d) delims="${!i}";;
            -m) max=true;;
            -c) count=true;;
            esac
            case $param in
            -s | -n | -d) set -- "${@:1:i-1}" "${@:i+1}";;
            esac
            ((i--))
            ;;
        esac
    done
    ((start+=0))
    ((num+=0))
    if [[ $start -gt 0 ]]; then
        ((start--))
    else
        ((start+="$(pcre2-cut -d "$delims" -m -- "$@")"))
        if [[ $start -lt 0 ]]; then start=0; fi
    fi
    if $max || $count; then
        options='my @c = (); do{push(@c, 0+(() = /'"$expr"'/gm))} for split(/\v/, $_, -1)'
        if $max; then
            options="${options}"'; use List::Util qw(max); say max @c'
            newline='; say ""'
        fi
        if $count; then
            options="${options}${newline}"'; say join("\n", @c)'
        fi
    fi
    pcre2 's/\V\K\z/\n/' -- "$@" |
    pcre2 's/\v\z//' -- - |
    if [[ $num -lt 0 ]]; then
        pcre2 -n 's/^[$d]*((?:'"${expr}"'){$s})?(?(1)(.*?)|.*?)[$d]*$/$2/gm; '"$options" -s="${start}" -d="${delims}" -- -
    else
        pcre2 -n 's/^[$d]*((?:'"${expr}"'){$s})?(?(1)((?:[$d]*[^$d\v]+(?=[$d]|$)){$n})?(?(2).*?|(.*?))|.*?)[$d]*$/$2$3/gm; '"$options" -s="${start}" -n="${num}" -d="${delims}" -- -
    fi
    return $?
}

Usage:

pcre2-cut -s <starting column> -n <number of columns> -d <delimiter(s)>

This function only works with data passed via standard input.

1st parameter: Only integers strictly greater than 0 are allowed for the starting column; if the starting column exceeds the number of columns present, no columns will be taken.

2nd parameter: If the number of columns is less than 0, all columns starting from the starting column will be taken; if the number of columns is equal to 0, no columns will be taken; if the number of columns is greater than 0, columns from the starting column will be taken, plus the indicated number of columns. If omitted, the default number of columns is -1.

3rd parameter: One or more delimiters are allowed, including all escape characters allowed by Perl regular expressions. This is a very powerful parameter.
If omitted, the default delimiter is \h.

\v Matches unicode vertical whitespace, considered a character class by the PCRE engine:

[\x{2028}\n\r\x{000B}\f\x{2029}\x{0085}] 

\h Matches spaces, tabs, non-breaking/mathematical/ideographic spaces, and so on. Works with Unicode. Equivalent to:

[ \t\x{00A0}\x{1680}\x{180E}\x{2000}\x{2001}\x{2002}\x{2003}\x{2004}\x{2005}\x{2006}\x{2007}\x{2008}\x{2009}\x{200A}\x{202F}\x{205F}\x{3000}] 

The second and third parameters are interchangeable if the second parameter is not an integer (positive or negative) and thanks to this feature, it is also possible to indicate only the third parameter (the delimiter) omitting the second one.

When obtaining columns, all consecutive delimiter characters will be skipped (unlike the cut command), and delimiter characters (even if consecutive) at the beginning and end of a line will be truncated, regardless of the parameters entered.

Comments

-1

You'll have to delete the starting fields (you could do it in a loop as well) and rebuild the record:

$ echo 1 2 3 4 5 6 | gawk '{$1=$2="";$0=$0;$1=$1;print}'
3 4 5 6

Repeat the $1=$2=...=""; pattern for as many fields as you want to skip.

$0=$0 renumbers the fields, I believe: as $1 and $2 no longer exist, $3 becomes $1;

$1=$1 is the documented way of rebuilding the record (so that the whitespace at the beginning goes away): https://www.gnu.org/software/gawk/manual/html_node/Changing-Fields.html

print prints the new record.

It does mangle your whitespace, but works as requested.

Only tested with GNU awk.

Comments

-2

I wasn't happy with any of the awk solutions presented here because I wanted to extract the first few columns and then print the rest, so I turned to perl instead. The following code extracts the first two columns, and displays the rest as is:

echo -e "a b c d\te\t\tf g" | \
    perl -ne 'my @f = split /\s+/, $_, 3; printf "first: %s second: %s rest: %s", @f;'

The advantage compared to the perl solution from Chris Koknat is that really only the first n elements are split off from the input string; the rest of the string isn't split at all and therefore stays completely intact. My example demonstrates this with a mix of spaces and tabs.

To change the amount of columns that should be extracted, replace the 3 in the example with n+1.

Comments

-2
ls -la | awk '{o=$1" "$3; for (i=5; i<=NF; i++) o=o" "$i; print o }' 

from this answer is not bad, but the natural spacing is gone.
Compare it to this one:

ls -la | cut -d' ' -f4- 

Then you'd see the difference.

Even ls -la | awk '{$1=$2=""; print}', which is based on the answer voted best thus far, does not preserve the formatting.

Thus I would use the following, and it also allows explicit selective columns in the beginning:

ls -la | cut -d' ' -f1,4- 

Note that every space counts as a column separator too; for instance, in the example below, columns 1 and 3 are empty, column 2 is INFO, and column 4 is 2014-10-11:

$ echo " INFO  2014-10-11 10:16:19 main " | cut -d' ' -f1,3
 
$ echo " INFO  2014-10-11 10:16:19 main " | cut -d' ' -f2,4
INFO 2014-10-11
$

Comments

-2

If you want formatted text, chain your commands with echo and use $0 to print the last field.

Example:

for i in {8..11}; do
    s1="$i"
    s2="str$i"
    s3="str with spaces $i"
    echo -n "$s1 $s2" | awk '{printf "|%3d|%6s",$1,$2}'
    echo -en "$s3" | awk '{printf "|%-19s|\n", $0}'
done

Prints:

|  8|  str8|str with spaces 8  |
|  9|  str9|str with spaces 9  |
| 10| str10|str with spaces 10 |
| 11| str11|str with spaces 11 |

Comments

-2
__=' 1 2 3 4 5 6 7 8 ' 

printf '%s' "$__" | od 
0000000  538976288  538976305  538980896  538976307
                 1                    2          3
       040 040 040 040 061 040 040 040 040 062 040 040 063 040 040 040
        sp  sp  sp  sp   1  sp  sp  sp  sp   2  sp  sp   3  sp  sp  sp
        32  32  32  32  49  32  32  32  32  50  32  32  51  32  32  32
        20  20  20  20  31  20  20  20  20  32  20  20  33  20  20  20
0000020  540287008  540352544  540418080  538976311
             4          5          6      7
       040 040 064 040 040 040 065 040 040 040 066 040 067 040 040 040
        sp  sp   4  sp  sp  sp   5  sp  sp  sp   6  sp   7  sp  sp  sp
        32  32  52  32  32  32  53  32  32  32  54  32  55  32  32  32
        20  20  34  20  20  20  35  20  20  20  36  20  37  20  20  20
0000040  540549152         32
             8
       040 040 070 040 040
        sp  sp   8  sp  sp
        32  32  56  32  32
        20  20  38  20  20
printf '\42%s\42' "$__" 

" 1 2 3 4 5 6 7 8 "


mawk ++NF FS='^[ \t]*[^ \t]+[ \t]+' OFS='"' 

"2 3 4 5 6 7 8 "

This approach preserves all multi-blank seps between fields by specifically targeting just the head.

Comments
