Extract parts of a string using head and tail only

Question

Hello, I would like to know whether there is a way to use only head, tail, and pipes (and possibly redirection) to extract and output the start, middle, and end characters of a string.

Example: Given this string: SHOWpijfirefjTHISezpijSTRING, the command should output 'SHOWTHISSTRING'

I tried something like:

(head -c 4 mdp > /dev/tty) | (tail -c +13 mdp | head -c 4 > /dev/tty) | (tail -c 6 mdp > /dev/tty) 2>&1

but it doesn't always return the same result, and the pieces can come out of order.

The arbitrary restriction to just head and tail makes this sound like homework. Read man head tail.
– waltinator, Commented Oct 22, 2023 at 21:25
Piping processes together makes them all run simultaneously, and is pointless here, as none of the subshells uses its stdin or stdout anyway. That is why you get data out of order. Writing to /dev/tty from 3 different places is not going to help. The 13 is wrong: byte counts are from zero, but start positions start from 1. But generally, the head and tail are OK; the redirections, subshells and pipes are the issue.
– Paul_Pedant, Commented Oct 22, 2023 at 22:29
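
To illustrate the comment above, here is a minimal sketch of the same three extractions run one after another instead of in a pipeline. It assumes a head and tail that support -c, uses a temporary file as a stand-in for the asker's mdp file, and starts the middle piece at byte 14 (tail -c +N counts from 1, and T is the 14th byte). The variable name result is only illustrative.

```shell
# Assumption: a temporary file stands in for the asker's "mdp" file.
mdp=$(mktemp)
printf '%s' 'SHOWpijfirefjTHISezpijSTRING' > "$mdp"

# The three commands run sequentially (no pipes between them),
# so their output always arrives in the same order.
result=$(
  head -c 4 "$mdp"                  # first 4 bytes
  tail -c +14 "$mdp" | head -c 4    # 4 bytes starting at byte 14
  tail -c 6 "$mdp"                  # last 6 bytes
)
rm -f "$mdp"

printf '%s\n' "$result"
```

Because each command reads the file itself, this version does not depend on how much input head consumes.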

Stéphane Chazelas · Accepted Answer · 2023-10-23 07:31:21Z

You can, with head implementations that support the non-standard -c option and are smart enough not to read more than needed from their input when asked to output a fixed number of bytes:

string='SHOWpijfirefjTHISezpijSTRING'
printf %s "$string" | {
  head -c 4
  head -c 9 > /dev/null
  head -c 4
  tail -c 6
}

That works with GNU head or the head builtin of ksh93, but not with the head of busybox. If you run strace -fe read busybox sh ./that-script, you'll see:

[pid 7739] read(0, "SHOWpijfirefjTHISezpijSTRING", 4096) = 28

There, the first head consumes all the input by reading a full block, outputs 4 bytes of it, and leaves nothing for the following head and tail commands to read.

While with GNU or ksh93 head:

[pid 10293] read(0, "SHOW", 4) = 4
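
If strace is not at hand, a rough probe along the following lines may show which behaviour a given head has. It is only a sketch: the test string and the variable names rest and verdict are made up for illustration, and tail -c +1 is used simply to copy whatever is left on the pipe.

```shell
# Probe: does head -c leave the unread bytes on the pipe for the next command?
# tail -c +1 copies whatever is still available on stdin.
rest=$(printf '%s' 'abcdef' | { head -c 2 > /dev/null; tail -c +1; })

if [ "$rest" = 'cdef' ]; then
  verdict='head read only what it needed'
else
  verdict='head read ahead; chaining head calls on a pipe will not work here'
fi
printf '%s (left over: "%s")\n' "$verdict" "$rest"
```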

Also beware that head and tail with -c work with bytes, not characters¹, so they can only be used to report fixed numbers of characters when the text is encoded with one byte per character.
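
A small sketch of what goes wrong with multi-byte text, assuming UTF-8: the string héllo is written with octal escapes so the example does not depend on the terminal's encoding, and od shows the bytes that head -c 2 actually returns. The variable names are illustrative.

```shell
# "héllo" in UTF-8: é is the two bytes 0xc3 0xa9, so 5 characters take 6 bytes.
s=$(printf 'h\303\251llo')

# Arithmetic expansion strips any padding that wc may add around the number.
nbytes=$(( $(printf '%s' "$s" | wc -c) ))

# head -c 2 returns "h" plus only the first half of "é".
cut_hex=$(printf '%s' "$s" | head -c 2 | od -An -tx1 | tr -d ' \n')

printf 'bytes in string: %s\n' "$nbytes"
printf 'head -c 2 gives bytes: %s\n' "$cut_hex"
```

The second byte printed is an incomplete character, which is why the byte-based approach is only safe for single-byte encodings.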

Most modern shells will have builtin operators to slice strings based on character position.

For instance, in zsh or yash:

slice=${string[1,4]}${string[14,17]}${string[-6,-1]}

In zsh, that can be shortened to:

slice=$string[1,4]$string[14,17]$string[-6,-1]

And you can cut off bits in place:

$ string[5,13]= string[9,-7]=
$ print -r -- $string
SHOWTHISSTRING

Or with recent versions of ksh93, bash, zsh or mksh:

slice=${string:0:4}${string:13:4}${string: -6}

POSIXly:

tmp=${string#?????????????}
slice=${string%"${string#????}"}${tmp%"${tmp#????}"}${string#"${string%??????}"}
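
The nested expansions can be easier to follow when unrolled into named steps. The following is a sketch of the same idea, not a different method; the intermediate variable names (rest, first, mid, front, last) are made up for illustration.

```shell
string='SHOWpijfirefjTHISezpijSTRING'

# First 4 characters: compute what follows them, then strip that as a suffix.
rest=${string#????}          # pijfirefjTHISezpijSTRING
first=${string%"$rest"}      # SHOW

# Middle 4 characters: skip 9 more characters, then take the first 4 as above.
tmp=${rest#?????????}        # THISezpijSTRING
mid=${tmp%"${tmp#????}"}     # THIS

# Last 6 characters: compute what precedes them, then strip that as a prefix.
front=${string%??????}       # SHOWpijfirefjTHISezpij
last=${string#"$front"}      # STRING

slice=$first$mid$last
printf '%s\n' "$slice"
```

The inner expansions are quoted so that their values are matched literally rather than as patterns.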

¹ Despite what the c may suggest. That -c option was added to some head implementations to align with the -c of tail, and the -c was added to tail long before the notion that a character could be made of more than one byte.

terdon · Accepted Answer · 2023-10-23 09:49:24Z

Assuming your head and tail support the -c option, that your string is nothing but simple ASCII with no multi-byte characters, and that your shell supports the <<< here string construct (bash and zsh do), you can do:

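$ string=SHOWpijfirefjTHISezpijSTRING
$ printf '%s%s%s\n' "$(head -c 4 <<<"$string")" \
      "$(head -c 17 <<<"$string" | tail -c 4)" \
      "$(tail -c 7 <<<"$string")"
SHOWTHISSTRING

Note the 7 in tail -c 7: the here string appends a newline to the string, and that newline counts as one byte (the command substitution then strips it).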

Or, if you don't want printf:

$ string=SHOWpijfirefjTHISezpijSTRING
$ head -c 4 <<<"$string"; \
  head -c 17 <<<"$string" | tail -c 4; \
  tail -c 7 <<<"$string";
SHOWTHISSTRING

(Here the newline that <<< appends is what ends the output line.)
