Rename file by moving middle string to end of the filename

Question

I have some data files and I wish to rename them for my pipeline.

The files look like this:

{unique_ids}_{experiment_condition}_L{3_digit_number}.txt

I need to rename them so the experiment condition flag will appear at the end of the filename, before the extension as follows:

{unique_ids}_L{3_digit_number}_{experiment_condition}.txt

Length of unique_ids and experiment_condition is not fixed.

Example:

ghad312fd2_Mb_L002.txt becomes ghad312fd2_L002_Mb.txt.

Thank You!

Do unique_ids or experiment_condition contain underscores?
– choroba, Commented Feb 9, 2022 at 14:21
What operating system are you using? Do you have the perl rename command? What is the output of file $(readlink -f $(which rename))?
– terdon ♦, Commented Feb 9, 2022 at 15:23
Also, you say you have {unique_ids}_L{3_digit_number}_{experiment_condition}.txt, but your example file name (ghad312fd2_Mb_L002.txt) is {unique_ids}_{experiment_condition}_L{3_digit_number}.txt. Can you give us a clearer example?
– terdon ♦, Commented Feb 9, 2022 at 15:24

Kusalananda · Accepted Answer · 2022-02-09 16:15:08Z

9

Using the Perl-based rename utility to rename all the files in the current directory matching the pattern ./*_*_*.txt (i.e. any file whose name contains at least two underscores and ends with .txt):

rename -n 's/([^_]+)_([^_]+)\.txt$/$2_$1.txt/' ./*_*_*.txt

This swaps the last two underscore-delimited parts of the filename, excluding the filename suffix .txt. Remove -n to run this for real after ensuring that it seems to be doing the correct thing.
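Where the Perl-based rename is not installed, the effect of the substitution can still be previewed on plain strings. The following is only a minimal sketch, not part of the answer above: the function name swap_last_two is made up for illustration, and it assumes bash (for [[ =~ ]] and BASH_REMATCH). It prints the new name and renames nothing.

```bash
#!/bin/bash
# swap_last_two NAME
# Hypothetical helper: prints NAME with its last two underscore-delimited
# parts swapped, keeping the .txt suffix at the end. It mirrors the
# expression ([^_]+)_([^_]+)\.txt$ used with rename above.
# A NAME that does not match is printed unchanged.
swap_last_two() {
    local name=$1
    # Group 1: optional leading text up to and including an underscore.
    # Groups 2 and 3: the last two underscore-free parts before .txt.
    local re='^(.*_)?([^_]+)_([^_]+)\.txt$'
    if [[ $name =~ $re ]]; then
        printf '%s%s_%s.txt\n' \
            "${BASH_REMATCH[1]}" "${BASH_REMATCH[3]}" "${BASH_REMATCH[2]}"
    else
        printf '%s\n' "$name"
    fi
}

swap_last_two ghad312fd2_Mb_L002.txt   # prints ghad312fd2_L002_Mb.txt
```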

answered Feb 9, 2022 at 16:15


1

@Cbhihe You will have to try installing both. On my (OpenBSD) system, the utility is actually called prename. I would be somewhat surprised if a name collision between the two utilities wasn't appropriately handled somehow.

– Kusalananda ♦, Commented Feb 9, 2022 at 19:32
3

I did think you were on BSD :-). I just found out that on Archlinux, installing perl-rename does preserve the default utility GNU rename. Tx.

– Cbhihe, Commented Feb 9, 2022 at 21:06
@Cbhihe, AFAIK, there's no GNU rename. There's (a quite dumb) one in util-linux though, which might be the one installed on your system.

– Stéphane Chazelas, Commented Feb 11, 2022 at 15:46
@StéphaneChazelas: You are (depressingly) correct on both counts: there is no specific GNU-flavored rename and the util-linux version installed by default is really "pared down" to bare bones compared to perl-rename. Instead the nifty regex based in-place substitution capability of perl-rename blew me away. perl rules ! (at least sometimes).

– Cbhihe, Commented Feb 11, 2022 at 16:22
@Cbhihe, yes even the original rename from 33 years ago written as 10 lines of perl code was infinitely better than util-linux's (itself added around 2000 in 2.10e, annoying many Linux users at the time when some distributions started including it)

– Stéphane Chazelas, Commented Feb 11, 2022 at 17:07


Stéphane Chazelas · Answer · 2022-02-09 19:26:58Z

With the zsh shell:

autoload zmv
zmv -n '(*)(_*)(_L[0-9](#c3))(.txt)' '$1$3$2$4'

(remove -n (dry-run) if happy).

[0-9](#c3) matches a sequence of 3 ASCII decimal digits. You can also use <0-999> to match on numbers from 0 to 999 (bearing in mind it would also match on 0000123) or <-> for any number (any sequence of one or more ASCII decimal digits).

RudiC · Answer · 2022-02-09 16:16:00Z

5

Try also

for FN in gh*; do IFS="_." read ID XC NR EXT <<< "$FN"; echo mv -- "$FN" "${ID}_${NR}_${XC}.${EXT}"; done

It reads four variables from the respective file name in the "here string", and reconstructs the new file name from them. Remove the echo if happy with what you are seeing.
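To make the splitting step easier to follow, here is the same idea wrapped in a function. This is a sketch rather than the answer's own code: the name new_name_for and the lowercase variable names are my choices, read -r is added so backslashes are taken literally, and it assumes the name has exactly the layout id_condition_Lnnn.ext with no other underscores or dots.

```bash
#!/bin/bash
# new_name_for FILE
# Hypothetical helper: splits FILE on "_" and "." into four fields and
# prints the name with the second and third fields swapped.
new_name_for() {
    local id cond num ext
    # IFS is changed only for this one read command.
    IFS='_.' read -r id cond num ext <<< "$1"
    printf '%s_%s_%s.%s\n' "$id" "$num" "$cond" "$ext"
}

# Dry run: print the mv commands instead of executing them.
for fn in *_*_*.txt; do
    [ -e "$fn" ] || continue   # skip the literal pattern if nothing matches
    echo mv -- "$fn" "$(new_name_for "$fn")"
done
```

For ghad312fd2_Mb_L002.txt the four fields are ghad312fd2, Mb, L002 and txt, so the function prints ghad312fd2_L002_Mb.txt.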

answered Feb 9, 2022 at 16:16


1

Don't use all upper case for non-exported shell variable names, to avoid clashing with existing variables and obfuscating your code; see correct-bash-and-shell-script-variable-capitalization.

– Ed Morton, Commented Feb 9, 2022 at 18:10
1

@EdMorton The argument for case on shell variables seems to be mixed. The accepted answer in your link says to use all lower-case but the comments, including some from long-time Bell Labs employees, have reasons for not sticking to lower case. Readability and distinction from commands are the major arguments.

– doneal24, Commented Feb 9, 2022 at 19:40
2

@doneal24 there's one person in the comments arguing for all upper case names and claiming to be an ex Bell Labs employee. As someone who worked at AT&T/Bell Labs/Lucent/etc. for 30 years myself I promise you that being a long term Bell Labs employee doesn't offer any particular authority on shell conventions, it just means you're old. You don't see people using all upper case variable names for readability and distinction from library functions, etc. in C, Java, Go, or any other non-ancient programming language so the argument you should do so in shell just doesn't hold water.

– Ed Morton, Commented Feb 9, 2022 at 20:48
3

I also started with Fortran around 1978, manually writing programs on graph paper that were snail-mailed to the local college where a secretary typed them onto punch cards for a tech to run through the mainframe to snail-mail the output back a week later telling me that I forgot a semi-colon. Saying "why do it if you know..." is very much like saying if you know you're not going to crash why wear a seat belt? Just like quoting variables, you don't avoid all upper-case variables to protect against what you know about, you do it to protect against surprises.

– Ed Morton, Commented Feb 9, 2022 at 22:10
2

@doneal24 for some examples, if you look at the well-respected, frequently referenced, and constantly reviewed bash FAQ, mywiki.wooledge.org/BashFAQ, I would be shocked if you see any non-exported variables that are all upper case unless they're in a "what not to do" section. I couldn't find any by poking around just now.

– Ed Morton, Commented Feb 9, 2022 at 22:38


schrodingerscatcuriosity · Answer · 2022-02-10 13:46:15Z

3

I propose this:

for i in *.txt; do
    n="${i%%.*}"
    id="$(echo "$n" | cut -d_ -f1)"
    e="$(echo "$n" | cut -d_ -f2)"
    d="$(echo "$n" | cut -d_ -f3)"
    echo mv -- "$i" "${id}_${d}_${e}.txt"
done

If you are happy with the result given by echo..., remove it and leave mv -- "$i" "${id}_${d}_${e}.txt" which will actually move the file.
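Each $(echo ... | cut ...) in the loop above starts child processes for every file. If that matters for a large directory, the same fields can be taken with parameter expansion alone. The following is only a sketch of that variant: the function name split_swap is hypothetical, and it assumes bash and a name of the form id_condition_Lnnn.txt.

```bash
#!/bin/bash
# split_swap FILE
# Hypothetical helper: performs the same field swap as the cut-based loop,
# but with shell parameter expansion only, so the string handling itself
# starts no external commands.
split_swap() {
    local n=${1%%.*}    # name up to the first dot
    local id=${n%%_*}   # first field: text before the first underscore
    local d=${n##*_}    # last field: text after the last underscore
    local e=${n#*_}     # drop the first field ...
    e=${e%_*}           # ... and the last one, leaving the middle
    printf '%s_%s_%s.txt\n' "$id" "$d" "$e"
}

# Dry run, as in the loop above.
for i in *_*_*.txt; do
    [ -e "$i" ] || continue   # skip the literal pattern if nothing matches
    echo mv -- "$i" "$(split_swap "$i")"
done
```

Because the middle part is whatever remains after removing the first and last fields, this variant would also keep underscores inside the experiment condition, which cut -f2 would not.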

edited Feb 10, 2022 at 13:46

answered Feb 9, 2022 at 14:55


2

I think the last line should be removed.

– rexkogitans, Commented Feb 10, 2022 at 7:06
Does this spawn six child processes for each file to rename? Due to $( | ) expansion.

– LarsH, Commented Feb 10, 2022 at 12:56
@rexkogitans right, removed, thanks.

– schrodingerscatcuriosity, Commented Feb 10, 2022 at 13:46


terdon · Answer · 2022-02-09 15:25:42Z

Assuming unique_ids has no underscore, put this in a script and run it with GNU sed or any other sed that supports -E, giving your file names as arguments:

#!/bin/bash
for f in "$@" ; do
    new_name=$(echo "$f" | sed -E 's/([^_]+)_(.+)_(L[0-9]{3})\.txt/\1_\3_\2.txt/g')
    echo "$f -> $new_name"
    mv "$f" "$new_name"
done
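Since the middle group (.+) is greedy and the last group must be L followed by three digits, the expression should also cope with underscores inside experiment_condition. A quick way to check this without touching any files is to run the sed expression on strings. This is a sketch: the function name preview_sed and the second sample name are made up for illustration.

```bash
#!/bin/bash
# preview_sed NAME
# Hypothetical helper: applies the sed expression from the script above
# to a string and prints the result. No file is renamed.
preview_sed() {
    printf '%s\n' "$1" |
        sed -E 's/([^_]+)_(.+)_(L[0-9]{3})\.txt/\1_\3_\2.txt/g'
}

preview_sed ghad312fd2_Mb_L002.txt     # prints ghad312fd2_L002_Mb.txt
preview_sed id42_heat_shock_L010.txt   # prints id42_L010_heat_shock.txt
```

An underscore inside unique_ids would still give the wrong split, which is why the answer states that assumption.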

DanieleGrassini · Answer · 2022-02-09 16:00:09Z

With sh:

for f in *.txt
do
    # Getting the extension
    ext=".${f##*[.]}"
    # Get the 3 digit number part
    ext_trail="${f%[.]*}"
    digit_number="L${ext_trail##*_L}"
    # tmp variable to get the first two
    tmp="${ext_trail%_*}"
    # Get the experiment conditions
    experimental_condition="${tmp#*_}"
    # Get the unique id
    unique_id="${tmp%_*}"
    echo mv -- "$f" "${unique_id}_${digit_number}_${experimental_condition}${ext}"
done

With bash:

for f in *.txt
do
    [[ "$f" =~ ^([^_]*)_([^_]*)(_L[0-9]{3})[.]txt ]] &&
        echo mv -- "$f" "${BASH_REMATCH[1]}${BASH_REMATCH[3]}_${BASH_REMATCH[2]}.txt"
done

hfs · Answer · 2022-02-11 15:05:38Z

Try mmv:

mmv '*_*_*.txt' '#1_#3_#2.txt'

It obviously only works if there are no other underscores present in the file names.
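Before running mmv it may be worth checking that no name has more underscores than the pattern expects. The following is a minimal sketch of such a check; the function name count_underscores is hypothetical and the code assumes bash (for the ${var//pattern/} expansion).

```bash
#!/bin/bash
# count_underscores NAME
# Hypothetical helper: prints how many underscores NAME contains.
count_underscores() {
    local only=${1//[!_]/}   # delete every character that is not "_"
    printf '%s\n' "${#only}"
}

# Report names whose underscore count differs from the expected two.
for f in *_*_*.txt; do
    [ -e "$f" ] || continue   # skip the literal pattern if nothing matches
    if [ "$(count_underscores "$f")" -ne 2 ]; then
        printf 'unexpected underscore count: %s\n' "$f"
    fi
done
```

If the loop prints nothing, every matching name has exactly two underscores.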

Kaffe Myers · Answer · 2022-02-09 16:44:58Z

If the format is very robust like that, try to incorporate this:

echo ghad312fd2_Mb_L002.txt | awk -F'[_.]' -v OFS=_ '{ print $1, $3, $2 "." $4 }'

output: ghad312fd2_L002_Mb.txt

Could look like this in a script:

#!/bin/bash
for f in *.txt; do
    mv -v -- "$f" "$(awk -F'[_.]' -v OFS=_ '{ print $1, $3, $2 "." $4 }' <<<"$f")"
done
