
How do I use GNU parallel with the aws sync command?

I have a file with the following commands:

aws s3 cp ./test s3://test --recursive --content-encoding "gzip" --content-type "text/html" --cache-control "max-age=$MAXAGE" --exclude "*" --include "*.html" --profile $PROFILE
aws s3 cp ./test s3://test $S3BUCKET --recursive --content-encoding "gzip" --content-type "text/html" --cache-control "max-age=$MAXAGE" --exclude "*" --include "*.css" --profile $PROFILE

How can I use GNU parallel to run these commands in parallel?

What I did was add the commands to a file called test.sh

and then run the following command:

parallel < test.sh 

How do I pass in arguments to the test.sh file? For example, I want to pass in the AWS bucket name.

  • If you just want to run a few commands in parallel, use & instead of GNU parallel. Commented Nov 16, 2015 at 23:39
  • that runs it in the background. I need GNU parallel so that if one command fails, the whole thing fails Commented Nov 17, 2015 at 0:44
  • You mean you want the behavior of --halt 2? Commented Nov 17, 2015 at 1:27
  • What does --halt 2 do? Commented Nov 17, 2015 at 2:57
  • --halt 2 is the GNU parallel option that kills all jobs if one fails. --halt 1 just doesn't execute any more jobs, which is what & would accomplish. Commented Nov 17, 2015 at 4:44
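
For reference, the difference is easy to see with a toy pair of jobs. A minimal sketch, assuming a GNU parallel version that accepts the numeric forms (in newer releases they are shorthand for --halt now,fail=1 and --halt soon,fail=1):

# The failing job makes parallel kill the still-sleeping job and exit non-zero
parallel --halt 2 ::: 'sleep 5 && echo survived' 'exit 1'
echo "parallel exited with status $?"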

3 Answers


If your goal is to trigger a script failure if any member of a set of hand-written commands fails, GNU parallel isn't the best tool for the job: The shell itself already provides everything needed with the wait command, which is specified by POSIX and present out-of-the-box on all standards-compliant shells (see also the specification requiring it to be implemented as a builtin).

#!/bin/bash
#       ^^^^- Important! /bin/sh doesn't have arrays; bash, ksh, or zsh will work.

# For readability, put common arguments in an array
common_args=( --recursive --content-encoding "gzip" --content-type "text/html"
              --cache-control "max-age=$MAXAGE" --exclude "*" --profile "$PROFILE" )

# Record PIDs of the various jobs in an array
pids=( )

aws s3 cp ./test s3://test --include='*.html' "${common_args[@]}" & pids+=( $! )
aws s3 cp ./test s3://test "$S3BUCKET" --include='*.css' "${common_args[@]}" & pids+=( $! )

# If either background job failed, exit the script with the same exit status
for pid in "${pids[@]}"; do
  wait "$pid" || exit
done

Note that arrays are used above for convenience, not necessity; you could provide the common arguments with a function, and/or build up the array of PIDs in a scalar variable or by overriding "$@" inside a shell function if your goal were to write code that would work on any POSIX baseline shell.
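
For illustration, here is a rough, untested sketch of that portable variant: the shared arguments come from a helper function and the PIDs accumulate in a space-separated scalar instead of an array (the function name s3copy is made up for this example):

#!/bin/sh
# POSIX-sh sketch: no arrays needed
s3copy() {  # usage: s3copy <include-pattern> [extra destination args...]
  pattern=$1; shift
  aws s3 cp ./test s3://test "$@" --recursive --content-encoding "gzip" \
      --content-type "text/html" --cache-control "max-age=$MAXAGE" \
      --exclude "*" --include "$pattern" --profile "$PROFILE"
}

pids=
s3copy '*.html' & pids="$pids $!"
s3copy '*.css' "$S3BUCKET" & pids="$pids $!"

for pid in $pids; do   # deliberately unquoted so the list splits on spaces
  wait "$pid" || exit
done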


Comments

I have about 10+ of these aws commands to run. I need them to run at the same time
The above does run the aws s3 cp commands at the same time. If you want an actual proper loop to start these jobs, that'd be easy too, but your question doesn't give much to go on re: tuning the answer beyond what I've already done.
(Are you, for instance, trying to iterate over pairs of S3BUCKET and PROFILE values?)
No, I have 18 aws sync commands. They don't all have the same S3 bucket; 6 of them share the same bucket
...so, if you wanted to use the answer here, you could copy the pattern, with aws s3 ... & pids+=( $! ) for each command you want to run, and the wait loop to check exit status after and exit with an error if any failed.
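
For instance, a sketch of that pattern driven by a loop, with hypothetical bucket names standing in for the real 18 and common_args reused from the answer above:

pids=( )
for bucket in bucket-a bucket-b bucket-c; do   # hypothetical names; list all 18 targets here
  aws s3 cp ./test "s3://$bucket" "${common_args[@]}" & pids+=( $! )
done
for pid in "${pids[@]}"; do
  wait "$pid" || exit
done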

Untested:

export MAXAGE
export PROFILE
export S3BUCKET

part1() {
  aws s3 cp ./"$1" s3://"$1" --recursive --content-encoding "gzip" --content-type "text/html" --cache-control "max-age=$MAXAGE" --exclude "*" --include "*.html" --profile $PROFILE
}
export -f part1

part2() {
  aws s3 cp ./"$1" s3://"$1" $S3BUCKET --recursive --content-encoding "gzip" --content-type "text/html" --cache-control "max-age=$MAXAGE" --exclude "*" --include "*.css" --profile $PROFILE
}
export -f part2

parallel --halt 2 ::: part1 part2 ::: test testfile2
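
Note that export -f is a bash feature, so this script must be run with bash for the functions to be visible in the shells GNU parallel spawns; newer GNU parallel releases also ship env_parallel, which copies functions and variables into the job environment automatically.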

To use different $S3BUCKET values:

doit() {
  file="$1"
  bucket=$2
  aws s3 cp ./"$file" s3://"$file" $bucket --recursive --content-encoding "gzip" --content-type "text/html" --cache-control "max-age=$MAXAGE" --exclude "*" --include "*.css" --profile $PROFILE
}
export -f doit

# '' = no bucket
parallel --halt 2 doit ::: test testfile2 ::: '' bucket1 bucket2 bucket3
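
Because parallel is given two ::: input sources, it runs doit once for every combination: each of the two files against each of the four bucket values (the empty string meaning no bucket), eight jobs in total, and --halt 2 still stops everything on the first failure.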



You can do this by exporting all the required variables and then piping the commands to parallel with a quoted here-document:

#!/bin/bash

export S3BUCKET="$1"
export MAXAGE=42
export PROFILE=foobar

parallel --gnu << 'EOF'
aws s3 cp ./test s3://test --recursive --content-encoding "gzip" --content-type "text/html" --cache-control "max-age=$MAXAGE" --exclude "*" --include "*.html" --profile $PROFILE
aws s3 cp ./test s3://test $S3BUCKET --recursive --content-encoding "gzip" --content-type "text/html" --cache-control "max-age=$MAXAGE" --exclude "*" --include "*.css" --profile $PROFILE
EOF
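
The quoting of 'EOF' matters here: it stops the outer script from expanding $MAXAGE, $PROFILE, and $S3BUCKET when the here-document is read, so each line reaches GNU parallel verbatim and the variables are instead expanded by the shell that parallel starts for each job, which is why they have to be exported.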

