I am trying to copy multiple files named "F3.bam" two directory levels up, and then rename each copy so that it is prefixed with the name of the directory it was copied into.

For example:

    /samples/mydata1/RUN1/ID_date/PCR2/TIME1/F3.bam
    /samples/mydata2/RUN1/ID2_date4/PCR2/TIME7/F3.bam
    /samples/mydataxxx/RUN1/IDxxx_datexxx/PCR2/TIMExxx/F3.bam

Expected results:

1. The files are first copied two directory levels up:

    /samples/mydata1/RUN1/ID_date/F3.bam
    /samples/mydata2/RUN1/ID2_date4/F3.bam
    /samples/mydataxxx/RUN1/IDxxx_datexxx/F3.bam

2. The files are renamed according to the name of the current sub-directory:

    /samples/mydata1/RUN1/ID_date/ID_date_F3.bam
    /samples/mydata2/RUN1/ID2_date4/ID2_date4_F3.bam
    /samples/mydataxxx/RUN1/IDxxx_datexxx/IDxxx_datexxx_F3.bam

Ideally a bash loop would be great (working on a Mac).

4 Answers

Here's the TLDR version of my solution: you can use the dirname and basename commands along with command substitution to construct the target path for your copy command.
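
As a minimal sketch of that idea, the target path can be built from nested command substitutions. The function name `prefixed_target` is hypothetical, and the sketch assumes the layout from the question, where the destination is two directories above the one that holds the file:

```bash
#!/bin/bash
# Hypothetical helper: print the destination path for one source file.
# Assumes the question's layout: .../ID_date/PCR2/TIME1/F3.bam
prefixed_target() {
    local src=$1
    local dest_dir
    # Strip the file name, then two more directory levels
    dest_dir=$(dirname "$(dirname "$(dirname "$src")")")
    # Prefix the file name with the destination directory's own name
    printf '%s/%s_%s\n' "$dest_dir" "$(basename "$dest_dir")" "$(basename "$src")"
}

prefixed_target /samples/mydata1/RUN1/ID_date/PCR2/TIME1/F3.bam
# prints /samples/mydata1/RUN1/ID_date/ID_date_F3.bam
```

The function only prints the path, so it can be checked before any file is touched.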

A longer explanation follows.


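Here is a (deliberately verbose) script that does roughly what you want using a Bash loop. Note that it counts the file's own directory as the first level, so the copy lands in the PCR2 directory, prefixed with the name of the directory above it: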

    #!/bin/bash
    # copy_and_rename.bash
    #
    # Copy multiple files 2 folders up and rename these files
    # to contain their parent directory as a prefix.
    #

    # Set internal field separator to handle spaces in file names
    IFS=$'\n'

    # Iterate over the list of file paths
    for _file_path in "$@"; do

        # Get the file name
        _file_name="$(basename "${_file_path}")"
        echo "${_file_name}"

        # Get the path to the target directory (two levels above the file)
        _target_directory_path="$(dirname "$(dirname "${_file_path}")")"
        echo "${_target_directory_path}"

        # Get the parent directory of the target directory
        _parent_directory_path="$(dirname "${_target_directory_path}")"
        echo "${_parent_directory_path}"

        # Get the name of the parent directory
        _parent_directory_name="$(basename "${_parent_directory_path}")"
        echo "${_parent_directory_name}"

        # Construct the new file path
        _new_file_path="${_target_directory_path}/${_parent_directory_name}_${_file_name}"
        echo "${_new_file_path}"

        # Copy and rename the file (-i prompts before overwriting)
        echo "cp -i \"${_file_path}\" \"${_new_file_path}\""
        cp -i "${_file_path}" "${_new_file_path}"
        echo
    done

You can obviously compress this a lot, but I kept it this way for explanatory value.

Here is what the preceding script looks like without any comments or superfluous variables or echo statements:

    for _file_path in "$@"; do
        cp -i "${_file_path}" \
            "$(dirname "$(dirname "${_file_path}")")/$(basename "$(dirname "$(dirname "$(dirname "${_file_path}")")")")_$(basename "${_file_path}")"
    done

Both versions are fragile and do little in the way of error handling. The verbose script keeps its echo statements for debugging, so you can see what it's doing and sanity-check it when you run it for the first time.
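
Since the scripts above skip error handling, here is a minimal sketch of what a few basic checks might look like. The function name `safe_copy_up` is hypothetical, and the destination logic assumes the same layout as the verbose script (the copy is placed one directory above the file's own directory, prefixed with the name of the directory above that):

```bash
#!/bin/bash
# Hypothetical wrapper adding basic checks around the copy.
# Assumption: same destination logic as the verbose script above.
safe_copy_up() {
    local src=$1 dest_dir prefix dest
    # Refuse anything that is not an existing regular file
    if [ ! -f "$src" ]; then
        echo "skipping '$src': not a regular file" >&2
        return 1
    fi
    dest_dir=$(dirname "$(dirname "$src")")
    prefix=$(basename "$(dirname "$dest_dir")")
    dest="$dest_dir/${prefix}_$(basename "$src")"
    # Refuse to overwrite an existing destination
    if [ -e "$dest" ]; then
        echo "skipping '$src': '$dest' already exists" >&2
        return 1
    fi
    cp -- "$src" "$dest"
}

# Process every path given on the command line
for f in "$@"; do
    safe_copy_up "$f"
done
```

Skipping instead of prompting (as cp -i does) keeps the loop usable in unattended runs; whether that is preferable depends on your workflow.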

To test it I created your files by using the following script, which I include here in case you find it useful for further testing:

    #!/bin/bash
    # create_test_files.bash

    # Set internal field separator to handle spaces in file names
    IFS=$'\n'

    # Choose a prefix for the file paths
    _prefix="/tmp"

    # Create array of sample files
    _sample_files=(
        "/samples/mydata1/RUN1/ID_date/PCR2/TIME1/F3.bam"
        "/samples/mydata2/RUN1/ID2_date4/PCR2/TIME7/F3.bam"
        "/samples/mydataxxx/RUN1/IDxxx_datexxx/PCR2/TIMExxx/F3.bam"
    )

    # Create directories and files
    for _file in "${_sample_files[@]}"; do

        # Add the prefix to the path
        _path="${_prefix}${_file}"

        # Create parent directory
        mkdir -p "$(dirname "${_path}")"

        # Create file
        touch "${_path}"
    done

I check that the files were created properly by using the find command:

    $ find /tmp/samples -type f
    /tmp/samples/mydata1/RUN1/ID_date/PCR2/TIME1/F3.bam
    /tmp/samples/mydata2/RUN1/ID2_date4/PCR2/TIME7/F3.bam
    /tmp/samples/mydataxxx/RUN1/IDxxx_datexxx/PCR2/TIMExxx/F3.bam

Then I invoke the script like this:

    bash copy_and_rename.bash \
        /tmp/samples/mydata1/RUN1/ID_date/PCR2/TIME1/F3.bam \
        /tmp/samples/mydata2/RUN1/ID2_date4/PCR2/TIME7/F3.bam \
        /tmp/samples/mydataxxx/RUN1/IDxxx_datexxx/PCR2/TIMExxx/F3.bam

And then I check that the script worked by using find again:

    $ find /tmp/samples -type f
    /tmp/samples/mydata1/RUN1/ID_date/PCR2/ID_date_F3.bam
    /tmp/samples/mydata1/RUN1/ID_date/PCR2/TIME1/F3.bam
    /tmp/samples/mydata2/RUN1/ID2_date4/PCR2/ID2_date4_F3.bam
    /tmp/samples/mydata2/RUN1/ID2_date4/PCR2/TIME7/F3.bam
    /tmp/samples/mydataxxx/RUN1/IDxxx_datexxx/PCR2/IDxxx_datexxx_F3.bam
    /tmp/samples/mydataxxx/RUN1/IDxxx_datexxx/PCR2/TIMExxx/F3.bam

Finally, I delete all of the test files, also using find:

find /tmp/samples -type f -exec rm {} \; 

  • That looks very promising! I will give it a try asap and keep you posted. (Commented Oct 16, 2017 at 1:56)

This version uses only Bash parameter expansion to slice and dice paths. Pass it one or more absolute file paths:

    #!/usr/bin/env bash
    for path; do
        dir="${path%/*}"      # directory containing the file
        dest="${dir%/*/*}"    # two directory levels above it
        cp "$path" "${dest}/${dest##*/}_${path##*/}"
    done
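
To make the expansions concrete, here is a small worked example on one sample path from the question. It only manipulates strings and copies nothing; the variable names `src`, `prefix`, and `file` are chosen here for illustration:

```bash
#!/bin/bash
# Worked example of the expansions on one sample path from the question.
src=/samples/mydata1/RUN1/ID_date/PCR2/TIME1/F3.bam

dir=${src%/*}         # shortest "/*" suffix removed: drops the file name
dest=${dir%/*/*}      # shortest "/*/*" suffix removed: drops two directories
prefix=${dest##*/}    # longest "*/" prefix removed: keeps the last component
file=${src##*/}       # same operator on the full path: keeps the file name

echo "dir:    $dir"
echo "dest:   $dest"
echo "target: $dest/${prefix}_$file"
# dir:    /samples/mydata1/RUN1/ID_date/PCR2/TIME1
# dest:   /samples/mydata1/RUN1/ID_date
# target: /samples/mydata1/RUN1/ID_date/ID_date_F3.bam
```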

Here is an expanded version. This one also accepts relative paths and the number of parent dirs to traverse is tunable:

    #!/usr/bin/env bash
    # Each param for this script is the path of a file. It
    # accepts relative paths if you have an appropriate tool to
    # robustly determine absolute paths (not trivial). Here
    # we're using the GNU 'realpath' tool.
    #
    # Usage: copy2up filepath1 [filepath2...]

    # for converting relative paths to absolute
    # if it's missing replace realpath with available tool
    # (or just always use absolute path arguments)
    pathtool=realpath

    # directory levels upwards to copy files
    levels=2

    # iterate over each parameter
    for path; do
        if [[ ! $path =~ ^/ ]]; then
            # convert relative to absolute
            path="$($pathtool "$path")"
        fi

        file="${path##*/}"
        dir="${path%/*}"
        dest=$dir

        # chdir upwards 'levels' times to destination
        for (( i=0; i<levels; i++ )); do
            dest="${dest%/*}"
        done

        # to be prepended to original filename
        destpfx="${dest##*/}"
        newpath="${dest}/${destpfx}_${file}"
        cp "$path" "$newpath"
    done

As for your specific use case, you could run this with find if that's how you are locating your 'F3.bam' files. For example, assuming the script is saved as copy2up.sh, is executable, and is on your PATH:

find /some/path -name F3.bam -exec copy2up.sh {} + 
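
If you would rather keep everything in one shell session instead of a separate script file, a loop over find's output is one option. This is only a sketch: the function name `copy_all_up` is hypothetical, it hard-codes two levels, and it assumes the root directory is given as an absolute path:

```bash
#!/bin/bash
# Hypothetical in-shell alternative to a separate script file.
# copy_all_up ROOT: copy every F3.bam under ROOT two directories up,
# prefixed with the destination directory's name.
# Assumption: ROOT is an absolute path.
copy_all_up() {
    local root=$1 src dir dest
    # -print0 with read -d '' keeps paths containing spaces or newlines intact
    find "$root" -type f -name F3.bam -print0 |
    while IFS= read -r -d '' src; do
        dir=${src%/*}        # directory containing the file
        dest=${dir%/*/*}     # two directory levels above it
        cp "$src" "$dest/${dest##*/}_${src##*/}"
    done
}
```

It would be called as, for example, `copy_all_up /samples`. The copies are named with a prefix, so they do not match `-name F3.bam` and are not picked up again by the same find run.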

Use find together with shell (POSIX sh/bash/Korn/zsh) parameter expansion, as follows:

    find . -type f -name "F3.bam" -execdir sh -c '
        trgt="${PWD%/*/*}"
        echo cp -v "$1" "${trgt}/${trgt##*/}_${1#./}"
    ' _ '{}' \;

Explanation:

We look only for files named F3.bam. With -execdir, find changes into the directory containing each F3.bam and runs sh -c ' ... ' from there, so $PWD is that directory. The leading echo makes this a dry run that only prints the cp commands; remove it to perform the copy.

trgt="${PWD%/*/*}" removes the shortest suffix matching /*/*, that is, the last two directory levels. For /samples/mydata1/RUN1/ID_date/PCR2/TIME1 the trailing /PCR2/TIME1 is removed, so trgt is set to /samples/mydata1/RUN1/ID_date for the first file.

"$1" is the file path relative to the current $PWD, in the form ./filename.

${trgt##*/}_ removes the longest prefix matching */, that is, everything up to and including the last slash. This leaves only the directory name to prepend to the file name (ID_date, ID2_date4, IDxxx_datexxx, and so on), followed by an underscore.

${1#./} removes the leading ./ from the relative file path.

You can nest dirname as often as you like:

    set -- /samples/mydata1/RUN1/ID_date/PCR2/TIME1/F3.bam \
        /samples/mydata2/RUN1/ID2_date4/PCR2/TIME7/F3.bam \
        /samples/mydataxxx/RUN1/IDxxx_datexxx/PCR2/TIMExxx/F3.bam

    for bam; do
        dir="$(dirname "$(dirname "$(dirname "$bam")")")"
        # mv moves the file; use cp instead to keep the original in place
        mv "$bam" "$dir"/"$(basename "$dir")"_"$(basename "$bam")"
    done
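
If the number of levels may change, the nesting can be replaced by a loop. A minimal sketch, where the helper name `up_n` is hypothetical:

```bash
#!/bin/bash
# Hypothetical helper: apply dirname N times to a path.
up_n() {
    local path=$1 n=$2 i
    for (( i = 0; i < n; i++ )); do
        path=$(dirname "$path")
    done
    printf '%s\n' "$path"
}

# Three applications: one for the file name, two for the directory levels
up_n /samples/mydata1/RUN1/ID_date/PCR2/TIME1/F3.bam 3
# prints /samples/mydata1/RUN1/ID_date
```

In the loop above, `dir="$(up_n "$bam" 3)"` would then stand in for the three nested dirname calls.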
