38

I have 10 text files and I want to paste each file with its pair, such that I have 5 total files.

I tried the following:

    for i in 4_1 5_1 6_1 7_1 8_1
    do
        for j in 4_2 5_2 6_2 7_2 8_2
        do
            paste ${i}.txt ${j}.txt > ${i}.${j}.txt
        done
    done

However, this code combines every possible combination instead of just combining the matching pairs.

So I would like file 4_1.txt to be paired with 4_2.txt, 5_1.txt with 5_2.txt, etc.

4
  • You need to loop over prefixes and suffixes not full file names. Commented Feb 25, 2015 at 17:14
  • The behavior you're getting is the same as what a nested for loop would do in any language; there's nothing bash-specific about it. Commented Feb 25, 2015 at 17:19
  • Every time I want to do something relatively simple like iterating over pairs of values, Bash makes the workload look like an insurmountable epic. So I don't use bash when needing anything more complex than a list of shell calls. Commented Sep 29, 2015 at 23:42
  • See also stackoverflow.com/questions/17403498/…. But that question is not necessarily about arrays. It's only about pairs. (Same is true for some of the answers here). Commented Oct 15, 2021 at 20:18

7 Answers

44

I agree with the answer currently proposed by fedorqui in the context of the question currently asked. The below is given only to provide some more general answers.

One more general approach (for bash 4.0 or newer) is to store your pairs in an associative array:

    declare -A pairs=( [4_1]=4_2 [5_1]=5_2 [6_1]=6_2 [7_1]=7_2 [8_1]=8_2 )
    for i in "${!pairs[@]}"; do
        j=${pairs[$i]}
        paste "$i.txt" "$j.txt" >"${i}.${j}.txt"
    done

Another (compatible with older releases of bash) is to use more than one conventional array:

    is=( 4_1 5_1 6_1 7_1 8_1 )
    js=( 4_2 5_2 6_2 7_2 8_2 )
    for idx in "${!is[@]}"; do
        i=${is[$idx]}
        j=${js[$idx]}
        paste "$i.txt" "$j.txt" >"$i.$j.txt"
    done
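
One caveat with the associative-array version: bash returns the keys of an associative array in an unspecified order, so the pairs may not be processed in the order they were written. If order matters, a minimal sketch is to sort the keys first. The helper name `list_pairs_sorted` is made up for illustration, and it prints each pair instead of calling paste so that it runs without any input files.

```shell
#!/usr/bin/env bash
# Requires bash 4.0 or newer for associative arrays.
declare -A pairs=( [4_1]=4_2 [5_1]=5_2 [6_1]=6_2 [7_1]=7_2 [8_1]=8_2 )

# list_pairs_sorted is an illustrative helper, not part of the answer above.
list_pairs_sorted() {
    local i
    # "${!pairs[@]}" yields the keys in unspecified order, so sort them first.
    while IFS= read -r i; do
        printf '%s %s\n' "$i" "${pairs[$i]}"
    done < <(printf '%s\n' "${!pairs[@]}" | sort)
}

list_pairs_sorted
```

Replacing the printf line with the paste command from the answer gives the same result as before, just in a predictable order. The two-array version does not have this issue, since indexed arrays iterate in index order.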

6 Comments

I am disappointed that there is no leaner version of this, something like looping over a list and then splitting each element into two parts.
This question is specified with two different lists provided. It's easy to do the other thing; it's just not what the OP here asked for.
@CharlesDuffy I'm here because I typed "iterate in pairs bash" into Google, not because I care about OP's problem.
@Boris, ...which is evidence that this question should be edited to have a title that disambiguates from the other, related question -- if there are two distinct problems that can be described with the same words, they should be titled in a way to make it clear which is which.
@CharlesDuffy , could you point to a reference for handling pairs, or better yet, an array of arrays ?
35

Simplest so far:

    for i in "1 a" "2 b" "3 c"; do a=( $i ); echo "${a[1]}"; echo "${a[0]}"; done

Output:

    a
    1
    b
    2
    c
    3

4 Comments

It works on bash ver.3 and above, and simple enough to not require the use of heredoc.
Why would you not put the values in an array right at the start then?
How would this work if you wanted to save the array of pairs in a variable beforehand? E.g.: DATA=("1 a" "2 b" "3 c"); for i in ${DATA[@]}; do echo $i; done Because this iterates over each item separately: 1, a, 2, b, 3, c ... so 6 times instead of the desired 3 times. (desired output would be: 1 a, 2 b, 3 c)
I found the answer in another post. The solution is to wrap the array in quotes when iterating: for i in "${DATA[@]}" ... . That results in the desired 3 loops over "1 a", "2 b", "3 c".
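
The quoting point from the last two comments can be sketched like this. The array name `pairs` and the helper `print_pairs` are made up for illustration. The sketch also swaps the unquoted `a=( $i )` split for `read -r` with a here-string, since the unquoted form would additionally glob-expand characters such as `*`.

```shell
#!/usr/bin/env bash
# Hypothetical data: each element holds one whitespace-separated pair.
pairs=("1 a" "2 b" "3 c")

# print_pairs is an illustrative helper, not part of the original answer.
print_pairs() {
    local item first second
    # Quoting "${pairs[@]}" keeps each element intact: 3 iterations, not 6.
    for item in "${pairs[@]}"; do
        # Split the element on whitespace into two variables.
        read -r first second <<< "$item"
        printf '%s -> %s\n' "$first" "$second"
    done
}

print_pairs
```

Dropping the double quotes around `${pairs[@]}` brings back the six-iteration behaviour described in the comment above.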
14

You can use an associative array:

    animals=(dog cat mouse)
    declare -A size=( [dog]=big [cat]=medium [mouse]=small )
    declare -A sound=( [dog]=barks [cat]=purrs [mouse]=cheeps )
    for animal in "${animals[@]}"; do
        echo "$animal is ${size[$animal]} and it ${sound[$animal]}"
    done

This lets you traverse pairs, triples, etc. Credit: the original idea is taken from @CharlesDuffy's answer.

3 Comments

Didn't I suggest (and demonstrate) this technique -- in context appropriate to the OP's question, revolving around invocation of paste -- back in 2015?
Valid question. Originally I started with a somewhat different answer, with your answer in "credits". But then in September I for some reason decided to refactor everything and practically got rid of some of the stuff. I've restored the original answer now.
This has the benefit of working in both bash and zsh as well, which is helpful when used in READMEs where people will often be copying code.
11

If you want to use one variable and perform an action with it, you just need to use one loop:

    for file in 4 5 6 7 8
    do
        paste "${file}_1" "${file}_2"
    done

This will do

    paste 4_1 4_2
    paste 5_1 5_2
    ...

4 Comments

Or even paste "${file}"_{1,2}
Look, Ma! No loops: printf '%s %s\n' {4..8}_{1,2} | xargs -I cmd paste cmd
But running xargs on unquoted values quickly crumbles if you have nontrivial file names which require quoting or escaping.
This uses no Bash-only features, and is thus portable to any Bourne-compatible shell.
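
For reference, the brace-expansion shorthand from the first comment can be tried safely by echoing the command instead of running it. This is only a sketch: `show_pair_commands` is a made-up helper name, and unlike the loop in the answer, brace expansion is not POSIX (bash, zsh and ksh support it).

```shell
#!/usr/bin/env bash
# show_pair_commands is an illustrative helper: it prints the paste
# command that would run for each prefix instead of executing it.
show_pair_commands() {
    local prefix
    for prefix in "$@"; do
        # "${prefix}"_{1,2} expands to two words: <prefix>_1 and <prefix>_2
        echo paste "${prefix}"_{1,2}
    done
}

show_pair_commands 4 5 6 7 8
```

Removing the echo turns the dry run into the real thing. The quotes around `${prefix}` still protect prefixes that contain spaces.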
7

The above did not work for me, but the following reads values in pairs from an ordered list

(it can handle more than pairs by adding extra read lines :-)

    while read x; do
        read y
        echo "$x $y"
    done << '___HERE'
    X1
    Y1
    X2
    Y2
    X3
    Y3
    ___HERE

produces

    X1 Y1
    X2 Y2
    X3 Y3

5 Comments

You can also while read x && read y; do, which I find makes it a little easier for readers to follow that you're consuming two items per iteration.
This is a way simpler and better answer than anything above.
This is buggy. read x and read y corrupt data with literal backslashes when the -r argument is not used; echo $x $y corrupts data in a different way, as described in BashPitfalls #14.
This uses no Bash-only features (though you'll want to use read -r as pointed out above), and is thus portable to any Bourne-compatible shell.
The blind guidance to always use -r is flawed. Very often (indeed usually), the person/process generating the input specifically wants to use backslash escape sequences, and using -r prevents the producer from doing that. Blindly using -r at all times is .... icky. Sometimes it's appropriate; usually it's not. read corrupts the data when -r is used, not when it is omitted.
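
Putting the first comment's suggestion together with the answer gives roughly the following. It is a sketch, with `pair_lines` as a made-up function name. It uses `read -r` so that backslashes pass through literally. Whether that is what you want depends on your input, as the comments above debate, so drop the `-r` if the input is meant to carry backslash escapes.

```shell
# pair_lines is an illustrative helper: it reads stdin two lines at a time.
pair_lines() {
    # The loop stops as soon as either read fails, so a trailing
    # unpaired line is silently dropped.
    while IFS= read -r x && IFS= read -r y; do
        printf '%s %s\n' "$x" "$y"
    done
}

pair_lines <<'EOF'
X1
Y1
X2
Y2
X3
Y3
EOF
```

Using printf with quoted arguments instead of an unquoted echo also avoids the word-splitting problem mentioned in the comment about BashPitfalls #14.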
4

There is a common pattern where you have pairs of files, where one name of the pair can be easily derived from the other. If the file you know the name of is X and the other file is Y, you have the following common use cases.

  • For renaming, Y is X with an extension removed and/or a date stamp added.
  • For transcoding, Y is X with a different extension and perhaps a different directory.
  • For many data analysis tasks, X and Y share some parts of the file name, but have different parameters or extensions.

All of these lend themselves to the same rough code skeleton.

    for x in path/to/base*.ext; do
        dir=${x%/*}   # Trim trailing file name, keep dir
        base=${x##*/} # Trim any leading directory

        # In this case, $y has a different subdirectory and a different extension
        y=${dir%/to}/from/${base%.ext}.newext

        # Maybe check if y exists? Or doesn't exist?
        if [ -e "$y" ]; then
            echo "$0: $y already exists -- skipping" >&2
            continue
        fi

        mv or ffmpeg or awk or whatever "$x" and "$y"
    done

The key here is the observation that y can be derived from x with some simple variable substitutions. So you loop over the x values, and figure out the corresponding y value inside the loop.

Here, we have used the shell's built-in ${variable#prefix} and ${variable%suffix} operators to return the variable's value with any leading prefix or trailing suffix, respectively, trimmed off. (There are also ## and %% to match the longest, instead of the shortest, possible match. The expression after # or % is a regular shell glob pattern.) These should usually be all you need, although you frequently see sed or awk scripts even for this trivial job (where really you should usually try to avoid an external process), as well as of course for more demanding transformations.
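
To make the trimming operators concrete, here is a small sketch on made-up values. The path `path/to/base01.ext` and the name `archive.tar.gz` are hypothetical, chosen only to show what each expansion returns.

```shell
x=path/to/base01.ext          # hypothetical input path

dir=${x%/*}                   # path/to     (shortest "/*" suffix removed)
base=${x##*/}                 # base01.ext  (longest "*/" prefix removed)
y=${dir%/to}/from/${base%.ext}.newext   # path/from/base01.newext

f=archive.tar.gz              # hypothetical name with two dots
short=${f%.*}                 # archive.tar (% trims the shortest match)
long=${f%%.*}                 # archive     (%% trims the longest match)

printf '%s\n' "$dir" "$base" "$y" "$short" "$long"
```

The difference between the single and doubled operator only shows up when the pattern can match in more than one place, as with the two dots in the second example.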

If you need to loop over x files scattered across different directories, maybe the loop should start with something like

    find dir1 dir2 etc/and/so/forth -type f -name 'x-files*.ext' -print |
    while IFS='' read -r x; do
        :

A commonly seen problem in similar questions is answers which fail to quote $x and $y correctly. Generally, any variable containing a file name should always be in double quotes.
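
A quick way to see why the quotes matter is to count the arguments a command actually receives. This is a minimal sketch with a made-up helper `count_args` and a hypothetical file name containing a space. It assumes the default IFS.

```shell
# count_args is an illustrative helper: it prints how many arguments it got.
count_args() { echo "$#"; }

x='my file.txt'     # hypothetical file name containing a space

count_args $x       # unquoted: word splitting turns it into 2 arguments
count_args "$x"     # quoted: the name stays a single argument
```

With a real command such as mv or paste, the unquoted form would look for two files named `my` and `file.txt` instead of the one you meant.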

Where X and Y are unrelated, a common solution is to loop over a here document containing the mapping:

    while read -r x y; do
        : stuff with "$x" and "$y"
    done <<'____HERE'
    first_x_value first_y_value
    another_x corresponding_y
    random surprise
    ____HERE

2 Comments

I can't really write "X Files" with a completely straight face. Sorry.
This doesn't use any strictly Bash-only functionality, but the "longest match" parameter expansions ## and %% are not available in all shells. In the worst case, maybe use sed for the substitutions, or something.
1

It is often convenient to use a while/read loop to do this sort of thing:

    #!/bin/sh
    while read i j; do
        echo i=${i} j=${j}
    done << EOF
    4_1 4_2
    5_1 5_2
    6_1 6_2
    7_1 7_2
    8_1 8_2
    EOF
