I have a directory with several hundred sub-directories. I want to tar and compress each sub-directory and name the resulting file <currentDirName>.tar.bz2.

I have done:

find ./dir -type f -print0 | xargs -0 -n1 -P100 bunzip2

to decompress 100 files at a time. (I do ocean modeling, so my machine is quite powerful, with many quad-core CPUs.) What I do not know is how to do something like the above while also naming each compressed archive on the fly. I don't want to run tar cfj dir.tar.bz2 dir by hand for each directory.

Can each directory name be used automatically as the tar file name, via either find or parallel, so that 100 or more archives are created at a time, as in the bunzip2 command above?

Thank you for any input.

....Peter

Comments:

  • The tar process will mostly be waiting on the disk, but when invoked with the -j option it spawns a bzip2 process that will probably consume 100% of one core. It is possible to split the input to bzip2 and concatenate its outputs, and that can be used to parallelize compression with tar: instead of calling tar with -j, pipe its output to bzip2, or to GNU parallel invoking bzip2 (see the sketch after this list).
  • unix.stackexchange.com/questions/217249/tarring-in-parallel
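
To make the first comment's idea concrete, here is a hedged sketch, assuming GNU parallel is installed and dir is a placeholder for your directory: --pipe splits the tar stream into blocks, each block is compressed by a separate bzip2 process, and -k emits the outputs in order. Concatenated bzip2 streams are still a valid .bz2 file, so the result decompresses normally with bunzip2.

tar -cf - dir | parallel --pipe --recend '' -k bzip2 > dir.tar.bz2

Tools such as pbzip2 and lbzip2 implement the same block-parallel bzip2 approach natively.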

1 Answer

The following bash one-liner will do approximately what you describe, putting each directory into its own tarball.

for d in dir/*/; do tar -cjf "${d%/}.tar.bz2" "$d" & done; wait; echo done
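
Note that this starts one background job per directory, all at once. If you would rather cap concurrency at 100 jobs, as in your bunzip2 command, here is a sketch in the same xargs style (the -P100 value and the ./dir path are yours to adjust; -mindepth 1 -maxdepth 1 restrict find to immediate subdirectories):

find ./dir -mindepth 1 -maxdepth 1 -type d -print0 | xargs -0 -P100 -I{} tar -cjf {}.tar.bz2 {}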
