mikeserv
  • 59.4k
  • 10
  • 122
  • 242

As far as I know, there are only two cases in which it is necessary to double-quote expansions, and those cases involve the two special shell parameters "$@" and "$*" - which are specified to expand differently when enclosed in double-quotes. In all other cases (excluding, perhaps, shell-specific array implementations) the behavior of an expansion is a configurable thing - there are options for that.
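The difference between those two quoted parameters can be seen in a minimal sketch. The helper name count_args is hypothetical and exists only for illustration:

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

set -- 'one two' three              # two positional parameters, the first contains a space

at=$(count_args "$@")               # "$@" expands to one field per positional parameter
star=$(count_args "$*")             # "$*" joins all parameters into a single field
joined=$(IFS=-; printf '%s' "$*")   # the join uses the first character of IFS

echo "at=$at star=$star joined=$joined"
```

Here "$@" should yield two fields and "$*" one, with the joined form reading one two-three.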

This is not to say, of course, that double-quoting should be avoided - on the contrary, it is probably the most convenient and robust method of delimiting an expansion which the shell has to offer. But, I think, as alternatives have already been expertly expounded, this is an excellent place to discuss what happens when the shell expands a value.

In the first place, there is $IFS. $IFS is an always present - and always settable - shell parameter that defines how the shell should handle field-splitting. The shell, in its heart and soul (for those that have such), is a command-interpreter - it is a parser, like a big, interactive, sed. Its job is to accept input, interpret and split it correctly into actionable input words, and then - in most cases - pass on the results of its interpretation. $IFS defines how this is done for the shell's expansions that undergo field-splitting.

People often say the shell is a glue, and, if this is true, then what it is sticking together is lists of arguments - or fields - which it hands to one process or another when it execs them. These fields are the results of its interpretation - and any subsequent expansions - of its input tokens: before expansion they are words, and after expansion they are fields. A shell input word - such as $word - always evaluates to a single field, except where it is expanded into multiple fields by "$@", by field-splitting on $IFS, or by pathname expansion - the latter two of which are discussed below.

Most shells do not handle the NUL byte well - if at all - and this is because they're already splitting on it. The shell has to exec a lot, and when it does, it does so with a NUL-delimited array of arguments that it hands to the system kernel at exec time. If you were to intermingle the shell's delimiter with its delimited data then the shell would probably screw it up. Its internal data structures - like those of most programs - rely on that delimiter. zsh, notably, does not screw this up.

And that's where $IFS comes in. $IFS splits shell expansions on delimiters other than NUL - or, in other words, it substitutes certain bytes in an expansion with a NUL in its internal data-arrays. When you look at it like that you might begin to see that every field-split shell expansion is an $IFS-delimited data array. $IFS delimits expansions that are not already otherwise delimited - which you can do with double-quotes. When you quote an expansion you delimit it at the head and at least to the tail of its value, and so $IFS does not apply as there are no fields to separate. In fact, a double-quoted expansion of a non-empty value exhibits identical field-splitting behavior to an unquoted expansion when IFS= is set to an empty value.
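A minimal sketch of that equivalence, assuming a POSIX-style shell (zsh does not split unquoted expansions by default). The helper count_args and the variable names are hypothetical:

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

value='a b  c'

# Each case runs in a command substitution, so IFS changes do not leak out.
default_unquoted=$(unset IFS; count_args $value)   # default IFS: split into a, b, c
quoted=$(unset IFS; count_args "$value")           # double-quoted: one field
empty_ifs=$(IFS=; count_args $value)               # unquoted, but IFS is empty: one field

echo "$default_unquoted $quoted $empty_ifs"
```

The quoted case and the empty-IFS case should both report a single field, while the default case reports three.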

Unless quoted, $IFS is itself an $IFS-delimited shell expansion. It defaults to a specified value of <space><tab><newline> - all three of which exhibit special properties when contained within $IFS. Whereas any other character in $IFS is specified to delimit a single field per occurrence, $IFS whitespace - any of those three - is specified to collapse to a single delimiter per sequence, and leading/trailing sequences are elided entirely. This is probably easiest to understand via example.

slashes=///// spaces=' '
IFS=/; printf '<%s>' $slashes$spaces
<><><><><>< >
IFS=' '; printf '<%s>' $slashes$spaces
</////>
IFS=; printf '<%s>' $slashes$spaces
<///// >
unset IFS; printf '<%s>' "$slashes$spaces"
<///// >
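The elision of leading and trailing $IFS whitespace is easier to see when the same value is split both ways. This is only a sketch, assuming a POSIX-style shell; show_fields and the sample data are hypothetical:

```shell
# show_fields is a hypothetical helper: it wraps each argument it receives in <>.
show_fields() { printf '<%s>' "$@"; echo; }

data='  a  b::c  '

# IFS whitespace: runs of spaces collapse, leading/trailing runs disappear.
ws=$(IFS=' '; show_fields $data)

# Non-whitespace IFS: every ':' delimits a field, so '::' yields an empty one,
# and the spaces are now ordinary data that stays inside the fields.
colon=$(IFS=:; show_fields $data)

echo "$ws"
echo "$colon"
```

Splitting on the space should give two fields with no padding left over, while splitting on the colon gives three fields, the middle one empty.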

But that's just $IFS - just the word-splitting on whitespace as asked - so what of the special characters?

The shell - by default - expands certain unquoted tokens (such as ?*[ as noted elsewhere here) into multiple fields when they occur in a list. This is called pathname expansion, or globbing. It is an incredibly useful tool, and, as it occurs after field-splitting in the shell's parse-order it is not affected by $IFS - fields generated by a pathname expansion are delimited on the head/tail of the filenames themselves regardless of whether their contents contain any characters currently in $IFS. This behavior is set to on by default - but it is very easily configured otherwise.

set -f 

That instructs the shell not to glob. Pathname expansion will not occur at least until that setting is somehow undone - such as if the current shell is replaced with another new shell process or....

set +f 

...is issued to the shell. Double-quotes - as they also do for field-splitting - render this global setting unnecessary per expansion. So:

echo "*" * 

...if pathname expansion is currently enabled will likely produce very different results per argument - as the first will expand only to its literal value (the single asterisk character, which is to say, not at all) and the second only to the same if the current working directory contains no filenames which might match (and it matches nearly all of them). However if you do:

set -f; echo "*" * 

...the results for both arguments are identical - the * does not expand in that case.
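The two points about globbing can be sketched together, assuming a POSIX-style shell. The helpers count_args and show_fields, the file name, and the sample list are all hypothetical, and mktemp -d is assumed to be available (it is common, but not specified by POSIX):

```shell
# count_args and show_fields are hypothetical helpers for illustration.
count_args() { echo "$#"; }
show_fields() { printf '<%s>' "$@"; echo; }

# Scratch directory holding one file whose name contains a space.
dir=$(mktemp -d) || exit 1
: > "$dir/two words.txt"

# Globbing happens after field-splitting, so the matched name stays one field.
from_glob=$(unset IFS; count_args "$dir"/*.txt)

# The same name coming from an unquoted variable is split on the default IFS.
name='two words.txt'
from_var=$(unset IFS; count_args $name)

# With globbing off, an unquoted expansion can be split on ':' without its
# '*' being expanded; the subshell keeps both settings from leaking out.
list='*:b c:d'
split=$(set -f; IFS=:; show_fields $list)

rm -rf "$dir"
echo "from_glob=$from_glob from_var=$from_var split=$split"
```

Running the set -f and IFS=: pair inside a command substitution is one way to avoid having to restore either setting afterwards.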
