59

Suppose we have this data file.

john 32 maketing executive
jack 41 chief technical officer
jim 27 developer
dela 33 assistant risk management officer

I want to print this using awk:

john maketing executive
jack chief technical officer
jim developer
dela assistant risk management officer

I know it can be done using a for loop:

awk '{printf $1; for(i=3;i<=NF;i++){printf " %s", $i} printf "\n"}' < file 

The problem is that it's long and looks complex.

Is there any other short way to print the rest of the fields?

3 Comments
  • A simple hack is to set $2 to "", then print $0 (all fields) -- though that would give you an extra delimiter for the empty field. Commented Aug 27, 2013 at 5:24
  • 3 years later, you helped me. But you should change "<NF" to "<=NF"; if not, you'll skip the very last field ;) Commented Sep 22, 2017 at 9:19
  • 3 years after that, I edited the question to change <NF to <=NF, to fix the bug @Koreth pointed out. Commented Jun 5, 2020 at 15:55

7 Answers

80

Set the field(s) you want to skip to blank:

awk '{$2 = ""; print $0;}' < file_name 

Source: Using awk to print all columns from the nth to the last
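
For reference, a run on the sample data (my own run-through, not part of the original answer) shows the doubled space left where $2 used to be, which the comments below point out:

$ awk '{$2 = ""; print $0;}' < file
john  maketing executive
jack  chief technical officer
jim  developer
dela  assistant risk management officer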


3 Comments

Does not clean up the extra space, and uses an unneeded print $0 that could be replaced by a simple 1
@Jotne When I use 1 in-place of print $0, I don't get any output from awk. You sure they're equivalent?
@Alex Remove print $0 and put 1 after the closing }.
9

Reliably with GNU awk for gensub() when using the default FS:

$ gawk -v delNr=2 '{$0=gensub("^([[:space:]]*([^[:space:]]+[[:space:]]+){"delNr-1"})[^[:space:]]+[[:space:]]*","\\1","")}1' file
john maketing executive
jack chief technical officer
jim developer
dela assistant risk management officer

With other awks, you need to use match() and substr() instead of gensub(). Note that the variable delNr above tells awk which field you want to delete:

$ gawk -v delNr=3 '{$0=gensub("^([[:space:]]*([^[:space:]]+[[:space:]]+){"delNr-1"})[^[:space:]]+[[:space:]]*","\\1","")}1' file
john 32 executive
jack 41 technical officer
jim 27
dela 33 risk management officer
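
For awks without gensub(), here is a minimal match()/substr() sketch of the same idea (my own illustration, not from the original answer; it assumes your awk supports ERE intervals like {n} in dynamic regexps):

awk -v delNr=2 '{
    # regexp for everything up to and including field delNr-1 plus its trailing spaces
    re = "^[[:space:]]*([^[:space:]]+[[:space:]]+){" delNr-1 "}"
    if (match($0, re)) {
        head = substr($0, 1, RLENGTH)
        tail = substr($0, RLENGTH + 1)
        sub(/^[^[:space:]]+[[:space:]]*/, "", tail)   # drop field delNr and the spaces after it
        $0 = head tail
    }
}1' file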

Do not do this:

awk '{sub($2 OFS, "")}1' 

as the same text that's in $2 might be at the end of $1, and/or $2 might contain RE metacharacters so there's a very good chance that you'll remove the wrong string that way.
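
For example, with an RE metacharacter in $2 the sub() can remove the wrong text entirely (my own illustration):

$ echo 'abc a.c foo' | awk '{sub($2 OFS, "")}1'
a.c foo

Here the dynamic regexp "a.c " matches "abc " first, so $1 is removed instead of $2 (the expected output was "abc foo").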

Do not do this:

awk '{$2=""}1' file 

as it adds an extra OFS where $2 was and will compress all other contiguous white space between fields into a single blank char each.

Do not do this:

awk '{$2="";sub("  "," ")}1' file 

as it has the space-compression issue mentioned above and relies on a hard-coded FS of a single blank (the default, though, so maybe not so bad); but more importantly, if there were spaces before $1, it would remove one of those instead of the space it's adding between $1 and $2.

One last thing worth mentioning is that in recent versions of gawk there is a new function named patsplit() which works like split(), BUT in addition to creating an array of the fields, it also creates an array of the spaces between the fields. That means you can manipulate the fields and the spaces between them within the arrays, so you don't have to worry about awk recompiling the record using OFS when you manipulate a field. Then you just print the fields you want from the arrays. See patsplit() in http://www.gnu.org/software/gawk/manual/gawk.html#String-Functions for more info.
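
A minimal sketch of that patsplit() approach (my own illustration; the field pattern below assumes the default whitespace-separated input):

gawk -v delNr=2 '{
    # flds[i] holds field i, seps[i] the spaces after it, seps[0] any leading spaces
    n = patsplit($0, flds, /[^[:space:]]+/, seps)
    out = seps[0]
    for (i = 1; i <= n; i++)
        if (i != delNr)
            out = out flds[i] seps[i]
    print out
}' file

Dropping seps[delNr] along with the field keeps the original spacing around the remaining fields.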

4 Comments

Looking at these complications, one wonders whether awk is indeed the best tool for this job; e.g. if fields are delimited by a pipe or comma, then the whole awk code needs to be rewritten.
Depends on your input. If you have single chars between fields then cut is better. If you have anything else then gawk+gensub() or sed (very similar syntactically) might be the best options. Both of those can run into problems when trying to describe the negation of multi-char REs so then you need to look at gawk+patsplit() or gawk+FPAT. No silver bullet unfortunately.
Great answer, I wish I could +2 you. One problem is the code is much longer than the for loop solution.
@shiplu.mokadd.im - correct but it preserves the original white space whereas the for loop you posted will not produce the output you specified. By the way, wrt that for loop you posted - never use printf with input data, e.g. printf $1 as that will fail spectacularly if your input data contains printf formatting characters such as %. Always use printf "%s",$1 for printing input data instead. Also to print a newline is just print "", no need for printf "\n".
8

You can use simple awk like this:

awk '{$2=""}1' file 

However, this will leave an extra OFS in your output, which can be avoided with this awk:

awk '{sub($2 OFS, "")}1' file 

Or else by using this tr and cut combo:

On Linux:

tr -s ' ' < file | cut -d ' ' -f1,3- 

On OSX:

tr -s ' ' < file | cut -d ' ' -f1 -f3- 
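
For reference, the Linux variant gives this on the sample data (my own run-through):

$ tr -s ' ' < file | cut -d ' ' -f1,3-
john maketing executive
jack chief technical officer
jim developer
dela assistant risk management officer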

12 Comments

This should be cut -d' ' -f1,3-.
@AdrianFrühwirth: Thanks but cut -f1,3- is not portable and isn't supported on my OSX.
You shouldn't use awk '{sub($2 OFS, "")}1' since the same text that's in $2 might be at the end of $1, and/or $2 might contain RE metacharacters so there's a very good chance that you'll remove the wrong string that way.
@anubhava - no, the only awk function that looks for strings rather than REs in another string is index().
@anubhava - correct there's no simple way but see my answer for a robust way.
4

This removes field #2 and cleans up the extra space.

awk '{$2="";sub("  "," ")}1' file 

3 Comments

what does that extra 1 do here?
@shiplu.mokadd.im The 1 evaluates to true which kicks in the default block ({ print $0 }).
Does not clean anything; instead, like all rewrites of existing fields, it replaces each run of FS (one or more in a row) with a single OFS. E.g. that is one way to implement a 'normalize spaces' filter: awk '{$1=$1}1'
3

Another way is to just use sed to remove the first run of digits and the spaces that follow:

sed 's|[0-9]\+\s\+||' file
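
Note that \+ and \s are GNU sed extensions; a POSIX-compatible spelling of the same idea (my own variant) would be:

sed 's|[0-9]\{1,\}[[:space:]]\{1,\}||' file

Like the original, this relies on the second field being numeric.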


0

An approach using awk that does not require gawk or any state mutation:

awk '{print $1 " " substr($0, index($0, $3));}' datafile 

UPD

A solution that is a bit longer, but will stand up to the case where $1 or $2 contains $3:

awk '{print $1 " " substr($0, length($1 " " $2 " ") + 1);}' data 

Or, even more robust if you have a custom field separator:

awk '{print $1 " " substr($0, length($1 FS $2 FS) + 1);}' data 
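
For reference, the FS-aware variant gives this on the sample data (my own run-through):

$ awk '{print $1 " " substr($0, length($1 FS $2 FS) + 1);}' data
john maketing executive
jack chief technical officer
jim developer
dela assistant risk management officer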


-1

Do not alter $n. If you have multiple spaces in some part you want to keep, they will be reduced to one.
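
A quick illustration of that point (my own example, with extra spaces in the input):

$ echo 'john 32 maketing   executive' | awk '{$2=""}1'
john  maketing executive

Note the doubled space left where $2 was, and that the run of spaces before "executive" is squeezed to a single OFS once the record is rebuilt.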

