I'm having trouble condensing several regexes down to an efficient one-liner. I have files named like this: Something (0482) - a123b456 - Something [00xcf bxc v32 Something]. I'd like the result to be something-a123b456-Something or something_-_a123b456_-_Something.

Here are the regexes I'm trying to condense:

    's/(^.*)/\L\1/'      # makes the whole string lowercase
    's/\(.*?\)|_//gs'    # removes everything between parentheses
    's/\[.*?\]|_//gs'    # removes everything between square brackets
    's/ /_/g'            # substitutes whitespaces with underscores

I've tried to chain the commands together, both by hand and by using this site, but regex is not my forte. I'd really appreciate it if someone could tell me how to chain several commands together, so that I can do it myself next time.

I am using prename (Perl), by the way.

1 Answer

Generally, Perl statements are chained together with ; so a rename expression can be written as s/.../foo/;s/.../bar/;... with each substitution operating in turn on the implicit $_ variable, which holds the file name. I'm not sure where you got prename from, so I'll use my own version of rename here. It is probably pretty similar to yours. The -p flag is to preview, that is, to prevent damage to the filesystem (with prename the equivalent dry-run flag should be -n).

    $ touch 'Something (0482) - a123b456 - Something [00xcf bxc v32 Something].demo'
    $ rename -p 's/(^.*)/\L\1/;s/\(.*?\)|_//gs;s/\[.*?\]|_//gs;s/ /_/g' *.demo
    rename Something (0482) - a123b456 - Something [00xcf bxc v32 Something].demo something__-_a123b456_-_something_.demo

However this can probably be improved; there is no reason to use regular expressions for everything here.

    $ rename -p '$_=lc; s/\(.*?\)|_//gs;s/\[.*?\]|_//gs; tr/ /_/' *.demo
    rename Something (0482) - a123b456 - Something [00xcf bxc v32 Something].demo something__-_a123b456_-_something_.demo

So instead we use $_=lc to lowercase everything in $_, and replace the s/ /_/g with a tr. Or maybe runs of whitespace should be replaced with a single underscore instead? If so, use s/\s+/_/g. The () and [] handling can probably also be improved on, though properly matching such balanced expressions gets more complicated.

On further study, the |_ alternation in s/\(.*?\)|_//gs does not make much sense; there are better ways to kill off _ characters than repeating that alternation in both the () and [] killing expressions, so:

    $ rename -p '$_=lc; tr/_//d; s/\(.*?\)//gs;s/\[.*?\]//gs; tr/ /_/' *.demo
    rename Something (0482) - a123b456 - Something [00xcf bxc v32 Something].demo something__-_a123b456_-_something_.demo

The .*? could probably be made more efficient by using something like s/\([^)]*\)//g to match only that which is not the closing character (the /s flag is no longer needed, since there is no . left for it to affect). You probably want readability more than efficiency here, but if you are using regular expressions in one-liners you've already blown your readability budget.

    $ rename -p '$_=lc; tr/_//d; s/\([^)]*\)//g; s/\[[^\]]*\]//g; tr/ /_/' *.demo
    rename Something (0482) - a123b456 - Something [00xcf bxc v32 Something].demo something__-_a123b456_-_something_.demo
  • This worked perfectly! Thank you for the excellent write up/explanation. You're right about the readability budget, it was more to learn how to chain up expressions. Commented Aug 26, 2022 at 9:15
