Is there a way to stop processing a single line in awk? Is there something like break or continue that works on pattern-action pairs rather than control structures within an action?

Suppose I have the following input.txt file and I'm trying to replace each of the names with x0, x1, x2, .... However, I want to leave lines beginning with a space or - alone.

    -- data
    bob 4
    joe 5
    bob 6
    joe 7

becomes:

    -- data
    x0 4
    x1 5
    x0 6
    x1 7

And I have the following script that does it. (As a side note, there's probably a better way of structuring this using a heredoc rather than a massive string literal).

    #!/bin/sh

    awk '
    BEGIN { c = 0; }

    # do not process lines beginning with - or space
    /^[- ]/ {
        print;
    }

    # update
    /^[^- ]/ {
        if (! ($1 in name) ) {
            new_name = "x" c;
            c += 1;
            name[$1] = new_name;
        }
        $1 = name[$1];
        print;
    }
    ' input.txt

This script leaves a bit to be desired. First of all, we know that /^[- ]/ and /^[^- ]/ are mutually exclusive, but that property isn't enforced anywhere. I'd like to be able to use something like break to abandon processing the line after the first match.

    /^[- ]/ { print; break; }

I'd like to be able to add another clause to alert the user to a problem if there is a non-empty line that doesn't match either of the first two patterns.

    /./ {
        print "non-empty line!" > "/dev/stderr"
        # or print "non-empty line!" > "/dev/tty" if portability is a concern
    }

However, if I add this pattern-action pair to the script as is, it fires for every non-empty line, including the ones the first two clauses have already handled.

Is there something I can add after the first two test cases to stop processing the line since it has been "successfully" handled? If that isn't possible, is there a common awk idiom for a catch-all case?

  • The next Statement? Commented Mar 20, 2017 at 0:51
  • @steeldriver ...yes. I completely missed that. Commented Mar 20, 2017 at 0:53

1 Answer

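You may use the awk statement next. It abandons the current record immediately, skipping every remaining pattern-action pair, and continues with the first pattern-action pair on the next input record.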

Here's an alternative implementation of your awk script:

    awk '
        /^[- ]/    { print; next }
        !($1 in n) { n[$1] = sprintf("x%d", c++) }
                   { $1 = n[$1]; print }
    ' data.in

The awk code is

    /^[- ]/    { print; next }
    !($1 in n) { n[$1] = sprintf("x%d", c++) }
               { $1 = n[$1]; print }

c is the counter. awk treats an uninitialized variable as zero in a numeric context, so it starts at zero without needing a BEGIN block.

n is the associative array holding the new labels/names. It is indexed with the data from the file's first field/column.

!($1 in n) will be true if the data in the first field has not already been assigned a new label/name.
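For the catch-all part of the question, here is a minimal sketch of the same idea with a third clause added. It rests on assumptions that are not in the original post: the function name rename_names is made up for the example, names are assumed to start with a letter (so that something is left over for the catch-all), and the input line "42 oops" is invented to trigger the warning. Because each of the first two clauses ends with next, the final clause only ever sees records that nothing above it claimed.

```shell
#!/bin/sh
# rename_names is a hypothetical helper name used only for this sketch.
rename_names() {
    awk '
    # Pass through lines starting with "-" or a space; next ends all
    # processing of this record, so no later rule sees it.
    /^[- ]/ { print; next }

    # Assumption for this sketch: a name starts with a letter.
    /^[A-Za-z]/ {
        # c is unset on first use, so it behaves as 0 here.
        if (!($1 in n)) n[$1] = sprintf("x%d", c++)
        $1 = n[$1]
        print
        next
    }

    # Catch-all: only non-empty records that no rule above claimed get here.
    /./ { print "unhandled line " NR ": " $0 > "/dev/stderr" }
    ' "$@"
}

# Example run; the line "42 oops" is made up to trigger the catch-all.
printf '%s\n' '-- data' 'bob 4' 'joe 5' '42 oops' 'bob 6' 'joe 7' | rename_names
```

On this input the renamed lines go to standard output and a single warning about line 4 goes to standard error. Empty lines match none of the three clauses and are dropped, as in the script from the question. In the shorter script above, an empty line would reach the last two clauses, be given a label of its own, and be printed as that label, so a guard such as the middle pattern here may be worth keeping if the input can contain blank lines.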
