    { nl -bpH -w1 |
    sed 's/^\([0-9]*\)[ \t]*\([^H]*.\)/\2\1/'
    } <<\DATA
    ...
    1562 first part
    1563 H col3 H col4
    1564 H col3 H col4
    ...
    3241 H col3 H col4
    3242 third part
    DATA
    ###OUTPUT
    ...
    1562 first part
    1563 H1 col3 H col4
    1564 H2 col3 H col4
    ...
    3241 H3 col3 H col4
    3242 third part

That is the fastest way I can think of to do it, especially with a very large file. With `-bpH`, `nl` numbers only the lines that match the pattern *H* and inserts that number at the head of the line, followed by a `<tab>` character. `-w1` sets the number width to 1, so the number is not padded and its digits sit at the very start of the line. All other lines are indented with a few spaces.
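To see what `sed` has to work with, it can help to look at `nl`'s output by itself. This is a small sketch on two made-up sample lines (the variable name `nl_out` is only for illustration). `tr` turns the tab into a `|` so that it is visible. The exact number of leading spaces on unnumbered lines may differ between `nl` implementations.

```shell
# Two sample lines: the first has no H, the second does.
nl_out=$(printf '%s\n' '1562 first part' '1563 H col3 H col4' | nl -bpH -w1)

# Translate the tab separator to '|' to make it visible.
printf '%s\n' "$nl_out" | tr '\t' '|'
# The numbered line prints as: 1|1563 H col3 H col4
# The other line is only indented with spaces.
```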
`sed` receives `nl`'s output through the `|` pipe and replaces the following sequence:
- 0 or more digits occurring at the beginning of the line *(referenced as `\1`)*
- 0 or more `<tab>` or `<space>` characters
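- 0 or more characters that are not H, then any one character *(together referenced as `\2`)*; on a numbered line that character is the first H, on any other line it is the last character of the line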
...with `\2\1`.
So lines not containing an *H* get this treatment:

    ^'' .*.$ = ^.*.''$

And those that do get this one:

    ^(digit)*<tab>(not H)*H.*$ = ^(not H)*H(digit)*.*$

...where `''` is an empty string.
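One way to check which text ends up in which group is to keep the same pattern but change the replacement so that each group is wrapped in a label. This is only a sketch: the input lines are hand-written stand-ins for `nl`'s output, the number 7 is arbitrary, and the names `tab`, `show`, `groups_numbered` and `groups_plain` are made up for the illustration. A literal tab from `printf` is used inside the bracket expression instead of `\t`.

```shell
tab=$(printf '\t')   # a literal tab character

# Same pattern as in the answer; the replacement labels each group.
show='s/^\([0-9]*\)[ '"$tab"']*\([^H]*.\)/{2=\2}{1=\1}/'

groups_numbered=$(printf '7\t1563 H col3 H col4\n' | sed "$show")
groups_plain=$(printf '  1562 first part\n' | sed "$show")

printf '%s\n' "$groups_numbered" "$groups_plain"
# {2=1563 H}{1=7} col3 H col4
# {2=1562 first part}{1=}
```

On the line without an H, `\1` is empty and `\2` swallows the whole line, so swapping the two groups changes nothing apart from dropping the indentation.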
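For maximum portability, replace the `\t` in `[ \t]` with a literal `<tab>` character: POSIX does not define `\t` inside a bracket expression, so only some `sed` implementations (GNU `sed` among them) read it as a tab.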
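If typing a literal tab into the script is awkward, one option is to let `printf` produce it and splice it into the `sed` expression. The following is a minimal end-to-end sketch of that idea on four made-up sample lines; the variable names `tab` and `result` are assumptions for the example.

```shell
tab=$(printf '\t')   # literal tab, spliced into the bracket expression below

result=$(
  printf '%s\n' '1562 first part' \
                '1563 H col3 H col4' \
                '1564 H col3 H col4' \
                '3242 third part' |
  nl -bpH -w1 |
  sed 's/^\([0-9]*\)[ '"$tab"']*\([^H]*.\)/\2\1/'
)

printf '%s\n' "$result"
# 1562 first part
# 1563 H1 col3 H col4
# 1564 H2 col3 H col4
# 3242 third part
```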