I have a script equijoin2:

    #! /bin/bash

    # default args
    delim=","  # CSV by default
    outer=""
    outerfile=""

    # Parse flagged arguments:
    while getopts "o:td:" flag
    do
      case $flag in
        d) delim=$OPTARG;;
        t) delim="\t";;
        o) outer="-a $OPTARG";;
        ?) exit;;
      esac
    done

    # Delete the flagged arguments:
    shift $(($OPTIND -1))

    # two input files
    f1="$1"
    f2="$2"

    # cols from the input files
    col1="$3"
    col2="$4"

    join "$outer" -t "$delim" -1 "$col1" -2 "$col2" <(sort "$f1") <(sort "$f2")

and two files

    $ cat file1
    c c1
    b b1
    $ cat file2
    a a2
    c c2
    b b2

Why does the last command fail? Thanks.

    $ equijoin2 -o 2 -d " " file1 file2 1 1
    a a2
    b b1 b2
    c c1 c2
    $ equijoin2 -o 1 -d " " file1 file2 1 1
    b b1 b2
    c c1 c2
    $ equijoin2 -d " " file1 file2 1 1
    join: extra operand '/dev/fd/62'

1 Answer

"$outer" is a quoted scalar variable so it always expands to one argument. If empty or unset, that still expands to one empty argument to join (and when you call your script with -o2, that's one -a 2 argument instead of the two arguments -a and 2).

Your join is probably GNU join, in that it accepts options after non-option arguments. When empty, "$outer" is a non-option argument (it doesn't start with -), so join treats it as a file name and then complains about the third file name, which it doesn't expect.
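
To make the argument counting concrete, here is a minimal sketch, assuming bash with the default $IFS. count_args is a hypothetical helper (not part of the original script) that only reports how many arguments it received:

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { printf '%s\n' "$#"; }

outer=""
quoted_empty=$(count_args "$outer" file1 file2)   # "" still counts as an argument
unquoted_empty=$(count_args $outer file1 file2)   # the empty expansion is dropped

outer="-a 2"
quoted_set=$(count_args "$outer")                 # a single argument: "-a 2"

printf '%s %s %s\n' "$quoted_empty" "$unquoted_empty" "$quoted_set"
```

The first count includes the empty string, which is the extra operand that join ends up rejecting.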

If you want a variable with a variable number of arguments, use an array:

    outer=()
    ...
    (o) outer=(-a "$OPTARG");;
    ...
    join "${outer[@]}"
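
As a rough illustration of why the array form behaves well, the sketch below (assuming bash, and again using a hypothetical count_args helper) shows that an empty array contributes no arguments while a two-element array contributes exactly two:

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { printf '%s\n' "$#"; }

outer=()                                       # no -o option was given
without_opt=$(count_args "${outer[@]}" file1 file2)

outer=(-a 2)                                   # what the o) branch would store for -o 2
with_opt=$(count_args "${outer[@]}" file1 file2)

printf '%s %s\n' "$without_opt" "$with_opt"
```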

Though here you could also do:

    outer=
    ...
    (o) outer="-a$OPTARG";;
    ...
    join ${outer:+"$outer"} -t "$delim" -1 "$col1" -2 "$col2" -- \
      <(sort -t "$delim" -k"$col1,$col1" < "$f1") \
      <(sort -t "$delim" -k"$col2,$col2" < "$f2")

Or:

    unset -v outer
    ...
    (o) outer="$OPTARG";;
    ...
    join ${outer+-a "$outer"} ...

(that one doesn't work in zsh except in sh/ksh emulation).
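
The two parameter-expansion variants can be checked the same way. This is only a sketch, assuming bash with the default $IFS; count_args is a hypothetical helper and the values are made up for the demonstration:

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { printf '%s\n' "$#"; }

# Variant 1: ${outer:+"$outer"} expands to nothing when outer is empty or unset.
outer=
v1_empty=$(count_args ${outer:+"$outer"} file1 file2)
outer="-a2"
v1_set=$(count_args ${outer:+"$outer"} file1 file2)

# Variant 2: ${outer+-a "$outer"} expands to nothing only when outer is unset;
# when set, the unquoted "-a " part is split off as its own argument.
unset -v outer
v2_unset=$(count_args ${outer+-a "$outer"} file1 file2)
outer=2
v2_set=$(count_args ${outer+-a "$outer"} file1 file2)

printf '%s %s %s %s\n' "$v1_empty" "$v1_set" "$v2_unset" "$v2_set"
```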

Some other notes:

  • join -t '\t' doesn't work. You'd need delim=$'\t' to store a literal TAB in $delim
  • Remember to use -- when passing arbitrary arguments to commands (or use redirections where possible). So sort -- "$f1" or better sort < "$f1" instead of sort "$f1".
  • Also bear in mind that for join, the inputs must be sorted on the join key, not the whole line.
  • Arithmetic expansions are also subject to split+glob, so they should be quoted as well: shift "$((OPTIND - 1))". It is not a problem here, as you're using bash, which doesn't inherit $IFS from the environment, and you're not modifying IFS earlier in the script, but it's still good practice.
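
The notes about the TAB delimiter and sorting on the join key can be tried out with a small sketch like the following. It assumes bash (for $'\t' and process substitution) and a join that accepts --; the sample data and the temporary directory are made up for the demonstration:

```shell
#!/usr/bin/env bash
# Hypothetical TAB-separated sample data, written to a temporary directory.
tmp=$(mktemp -d)
printf 'c\tc1\nb\tb1\n' > "$tmp/file1"
printf 'a\ta2\nc\tc2\nb\tb2\n' > "$tmp/file2"

delim=$'\t'   # ANSI-C quoting stores a literal TAB, not backslash followed by t

# Sort each input on the join field only, read the files via redirection,
# and mark the end of options with -- before the file operands.
result=$(join -t "$delim" -1 1 -2 1 -- \
  <(sort -t "$delim" -k1,1 < "$tmp/file1") \
  <(sort -t "$delim" -k1,1 < "$tmp/file2"))

printf '%s\n' "$result"
rm -rf -- "$tmp"
```

With this data, only the keys b and c appear in both files, so those are the two lines joined.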
  • Thanks. For sort -t '\t', (1) does it also apply to join -t? (2) The coreutils manual doesn't mention that, or I missed it. The manual says "To specify ASCII NUL as the field separator, use the two-character string '\0', e.g., 'sort -t '\0''." Is \t '\0' an exception? Commented Jul 24, 2018 at 20:41
  • @Tim, my bad, I meant join -t, not sort -t. join -t '\0' is a GNU extension. Generally, other implementations of text utilities can't cope with NUL bytes as that's not text. NUL is the one byte that can't be passed as argument to an executed command, so it has to be represented by some form of encoding. Commented Jul 24, 2018 at 20:46
  • @Tim, that's not bash, that's GNU join which chooses to understand \0 as the NUL byte. bash's $'\0' actually expands to the empty string, not a NUL byte. zsh's $'\0' expands to a NUL byte but only works for builtins or functions. A NUL byte can't be passed as an argument to a command that is executed, because the list of arguments passed to the execve() system call is a list of NUL-delimited strings. Commented Jul 24, 2018 at 21:08
  • Sorry deleted the comment. But please keep your reply. The reason I asked if it is bash's ANSI C quoting is "sort won’t accept ‘\t’, since it treats it as a multi-byte character. The solution is to place a $ before it. The dollar sign tells bash to use ANSI-C quoting" robfelty.com/2008/07/14/… Is it wrong? Commented Jul 24, 2018 at 21:08
  • "join -t '\0' is a GNU extension". Do you mean the GNU extension applies only to join -t '\0', or also to others such as join -t '\t'? Commented Jul 24, 2018 at 21:09