3

I was under the impression from the POSIX specs for sed that it is necessary to left-align the text on the line following the i\ command, unless you want leading whitespace in the output.

A quick test on my Mac (using BSD sed) shows that perhaps this is not necessary:

    $ cat test.sed
    #!/bin/sed -f
    i\
        This line starts with spaces.
    $ echo some text | sed -f test.sed
    This line starts with spaces.
    some text
    $

However, I can't seem to find this documented anywhere. It's not in the POSIX specs, and it's not even in sed's man page on my system.

Can I rely on this behavior in sed scripts which I want to be portable? How portable is it?

(Is it documented anywhere?)


(Bonus question: Is it even possible to force sed to insert whitespace at the beginning of a fixed line passed to i\?)

1
  • @don_crissti, you are correct, that works. (Only the first blank needs to be escaped.) And I see that this agrees with POSIX specs: "The argument text shall consist of one or more lines. Each embedded <newline> in the text shall be preceded by a <backslash>. Other <backslash> characters in text shall be removed, and the following character shall be treated literally." Commented Oct 29, 2016 at 0:11

3 Answers

4

No, but your script will be portable as long as you escape any leading blank. Why? Because some seds strip blank characters from text lines, and the only way to avoid that is to escape the leading blank, as these manual pages dating back to the last century explain: 1, 2, 3
The same goes for BSD sed (OSX just copied the code; it's not their extension). If you check the archives and read the man page from BSD 2.11, it's pretty clear:

(1)i\
text
.......
An argument denoted text consists of one or more lines, all but the last of which end with '\' to hide the newline. Backslashes in text are treated like backslashes in the replacement string of an 's' command, and may be used to protect initial blanks and tabs against the stripping that is done on every script line.

Now, where is this documented in the POSIX spec? It only says

The argument text shall consist of one or more lines. Each embedded <newline> in the text shall be preceded by a <backslash>. Other <backslash> characters in text shall be removed, and the following character shall be treated literally.

and if you scroll down under RATIONALE it says

The requirements for acceptance of <blank> and <space> characters in command lines has been made more explicit than in early proposals to describe clearly the historical practice and to remove confusion about the phrase "protect initial blanks [sic] and tabs from the stripping that is done on every script line" that appears in much of the historical documentation of the sed utility description of text. (Not all implementations are known to have stripped <blank> characters from text lines, although they all have allowed leading <blank> characters preceding the address on a command line.)

Since the part with "backslashes may be used to" was not included in that quote, the remaining phrase "protect initial blanks..." doesn't make any sense...1


Anyway, to sum up: some implementations did (and some still do) strip blanks from text lines. However, since the POSIX spec, with which all implementations should comply, says

Other <backslash> characters in text shall be removed, and the following character shall be treated literally.

we can conclude that the portable way to indent the lines in the text-to-be-inserted is by escaping the leading blank on each of those lines.
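As a minimal sketch of that conclusion (the input text and the inserted lines are made up for illustration), each inserted line starts with a backslash in front of its first blank, and every line but the last ends with a backslash to hide the newline. This should produce the same indentation on seds that strip leading blanks and on those that don't, though I have not tried it on every implementation:

```shell
# Insert two indented lines before the input line.
# The leading "\ " protects the first blank on each text line;
# the trailing "\" on the first text line hides the embedded newline.
out=$(echo 'some text' | sed 'i\
\    first indented line\
\    second indented line')

# Show the result; both inserted lines should keep their four leading spaces.
printf '%s\n' "$out"
```

Only the first blank on each line needs the backslash; the blanks after it are no longer "leading" once the stripping has stopped.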


1: I also don't understand why OSX/BSD people have changed the entire paragraph in the man page without altering the source code - you get the same behavior as before but the man section that documents this stuff is no longer there.

1
  • 1
    The Other <backslash> characters in text shall be removed, and the following character shall be treated literally is not true of GNU sed (even with POSIXLY_CORRECT) where sequences like \n are expanded. Commented Sep 7, 2017 at 7:48
3

It's an OSX sed extension, not standard behavior. You can see it at this link, in the function compile_text:

    /*
     * Compile the text following an a or i command.
     */
    static char *
    compile_text()
    {
        int asize, size;
        char *text, *p, *op, *s;
        char lbuf[_POSIX2_LINE_MAX + 1];

        asize = 2 * _POSIX2_LINE_MAX + 1;
        text = xmalloc(asize);
        size = 0;
        while (cu_fgets(lbuf, sizeof(lbuf))) {
            op = s = text + size;
            p = lbuf;
            EATSPACE();
            for (; *p; p++) {
                if (*p == '\\')
                    p++;
                *s++ = *p;
            }
            size

It eats the leading spaces of each text line using the EATSPACE macro.
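If you want to know which way a particular sed behaves, a rough probe like the following can tell you. It is only a sketch: the variable names and the probe text are arbitrary, and it checks nothing beyond whether the first output line still begins with a space.

```shell
# Insert a text line with unescaped leading blanks and look at what comes back.
probe=$(echo x | sed 'i\
   probe')

# If the first character of the output is a space, the blanks survived.
case $probe in
  ' '*) strips=no ;;
  *)    strips=yes ;;
esac

echo "this sed strips unescaped leading blanks: $strips"
```

On a sed that runs the equivalent of EATSPACE over text lines this should report yes; on one that keeps them, no.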

In FreeBSD sed, which can incorrectly treat \ as a line-continuation character when using a, i, or c, the behavior is weirder. On my FreeBSD 9.3:

    $ echo 1 | sed -e 'i\ 1'
    ": extra characters after \ at the end of i command

but:

    $ echo 1 | sed -e 'i\ 2'
    2
    1

works, and it also eats spaces.

GNU sed and the heirloom sed don't have this problem.

1
  • Wow, so it looks like if you want to be certain of what you are getting in a portable sed script, just never ever use leading spaces after an i or an a command. Commented Sep 27, 2016 at 3:59
2

Cuonglm gave the best answer, but for the record here's what GNU sed does:

    echo foo | sed 'i\
        This line starts with spaces.'

Output:

        This line starts with spaces.
    foo
