1

I'm just getting started with Bash scripting, and here is my problem.

I've been struggling with find.

When I run find . ! -regex ".*(jpeg|jpg|gif|pdf)+$", find doesn't find anything, even though there are files without these extensions. As if it doesn't recognize the regex group.

I've found this alternative: find . ! \( -name '*.jpeg' -o -name '*.jpg' -o -name '*.gif' \)

My question is: isn't there a better way to do this?

2
  • You are missing the . before the file extension. Commented Dec 23, 2016 at 17:38
  • Your -name method is actually good and portable (-regex is not specified by POSIX; it's only an extension available in certain versions of find, e.g., GNU find). Commented Dec 23, 2016 at 18:13

2 Answers

3

The proper regex would have been:

find . ! -regex '.*\.\(jpeg\|jpg\|gif\|pdf\)' 

Notice the \. after the .* to match the literal dot before the file-name extension, and the backslashes that escape the grouping parentheses and each alternation bar.

Remember that the ! negates the regex, so files with the listed extensions are excluded. To list only the files that have these extensions, drop the !:

find . -regex '.*\.\(jpeg\|jpg\|gif\|pdf\)' 
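
To see both commands side by side, here is a minimal sketch. It assumes GNU find (whose default -regex dialect is Emacs-style) and uses a scratch directory with made-up file names; the variable names are illustrative only.

```shell
# Scratch directory with hypothetical sample files (names are made up).
demo=$(mktemp -d)
touch "$demo/a.jpg" "$demo/b.jpeg" "$demo/c.gif" "$demo/d.pdf" \
      "$demo/notes.txt" "$demo/script.sh"

# Files WITHOUT one of the listed extensions (negated match).
excluded=$(find "$demo" -type f ! -regex '.*\.\(jpeg\|jpg\|gif\|pdf\)' | sort)

# Files WITH one of the listed extensions (no negation).
included=$(find "$demo" -type f -regex '.*\.\(jpeg\|jpg\|gif\|pdf\)' | sort)

printf 'not matching:\n%s\n' "$excluded"
printf 'matching:\n%s\n' "$included"

# Clean up the scratch directory.
rm -rf "$demo"
```

-type f is added here only so that the scratch directory itself does not show up in the negated result.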

2 Comments

Oh indeed, I'm so dumb for falling for that.
@Biscuit: It was quite a nice attempt! Just a cosmetic issue, nothing to worry much about!
2

"As if it doesn't recognize the regex group"

That's exactly what's happening.

There's nothing wrong with your regex at all, but it's written in a PCRE or ERE dialect that find doesn't expect. If you tell find to interpret it as ERE, it will match as you intended:

# GNU
find . -regextype posix-extended ! -regex ".*(jpeg|jpg|gif|pdf)+$"
# macOS (the -E option must come before the path)
find -E . ! -regex ".*(jpeg|jpg|gif|pdf)+$"

It would also work just fine by default in Perl, Java, RE2, egrep, bash =~, awk, and a whole lot of other tools that also use PCRE or ERE.

However, it does not work in the Emacs or BRE dialects, which are what GNU find and macOS find expect by default, respectively.
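
The difference between the dialects can be checked with a small experiment. This is only a sketch, assuming GNU find and a scratch directory of made-up file names; it runs the unmodified pattern once with the default dialect and once as POSIX ERE.

```shell
# Scratch directory with hypothetical sample files (names are made up).
demo=$(mktemp -d)
touch "$demo/a.jpg" "$demo/b.jpeg" "$demo/c.gif" "$demo/d.pdf" "$demo/notes.txt"

# Default GNU find dialect (Emacs): ( | ) are literal characters here,
# so the pattern only matches paths that literally end in "(jpeg|jpg|gif|pdf)".
default_hits=$(find "$demo" -type f -regex '.*(jpeg|jpg|gif|pdf)+$' | sort)

# Same pattern interpreted as POSIX ERE: ( ) group and | alternates.
ere_hits=$(find "$demo" -regextype posix-extended -type f \
                -regex '.*(jpeg|jpg|gif|pdf)+$' | sort)

printf 'default dialect:\n%s\n' "$default_hits"
printf 'posix-extended:\n%s\n' "$ere_hits"

# Clean up the scratch directory.
rm -rf "$demo"
```

With the default dialect the first list should come back empty, which is why negating it with ! does not filter anything out.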

Inian's solution works by rewriting your pattern from ERE style to Emacs style, where \(\|\) is used instead of (|) (as well as making other improvements to it).

tl;dr: Copy-pasting a regex from one tool to another is like copy-pasting a function from Java to C#. They look very similar and it may even work, but it's likely to require at least some tweaking.
