38

I am currently trying to make a script that greps its input to see whether something is of a certain file type (zip, for instance). The text before the file type could be anything, so for instance

something.zip this.zip that.zip 

would all fall under the category. I am trying to grep for these using a wildcard, and so far I have tried this

grep ".*.zip" 

But whenever I do that, it will find the .zip files just fine, but it will still display output if there are additional characters after the .zip so for instance .zippppppp or .zipdsjdskjc would still be picked up by grep. Having said that, what should I do to prevent grep from displaying matches that have additional characters after the .zip?

4
  • i find it better to use ripgrep Commented Aug 28, 2020 at 12:19
  • @cregox, you might be on a system that does not allow you to install rip grep though. Commented Apr 27, 2021 at 16:40
  • @daniel yes. and many other possible error scenarios. i still find it better, though. Commented May 10, 2021 at 12:08
  • Why don't any of the answers posted here work with the jar command? I'm trying to grep for some files within a JAR file using jar tf name-of-my-file.jar | plus any of the grep commands given here, but it returns nothing when it should return matches. Any idea why? Commented Dec 30, 2022 at 2:08

11 Answers

87

Test for the end of the line with $ and escape the second . with a backslash so it only matches a period and not any character.

grep ".*\.zip$" 

However, ls *.zip is a more natural way to do this if you want to list all the .zip files in the current directory, as is find . -name "*.zip" for all .zip files in the current directory and its sub-directories.
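To see the effect of the escaped dot and the $ anchor on input like the question's, here is a minimal sketch. The file names are made up for illustration, and the leading .* is left out because grep already matches anywhere in the line.

```shell
# Sample input: three real .zip names plus near-misses (hypothetical names).
names='something.zip
this.zip
that.zip
file.zippppppp
file.zipdsjdskjc
hellozip'

# \. matches a literal period; $ anchors the match to the end of the line.
matches=$(printf '%s\n' "$names" | grep '\.zip$')
printf '%s\n' "$matches"
```

Only the three names that really end in .zip should come out the other side.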


3 Comments

How about grep "\.zip"
@Steve the \.zip$ uses the $ to denote end of line. This means that even a file with ".zip" in the filename (which would be crazy) would not trigger the filter. The file must have a .zip extension to be caught by the filter.
What is the purpose of the first dot in the grep command?
21

On UNIX, try:

find . -type f -name \*.zip 


8

You can also use grep to find all files with a specific extension:

find .|grep -e "\.gz$" 

The . means the current folder. If you want to specify a folder other than the current folder, just replace the . with the path of the folder. Here is an example: Let's find all files that end with .gz and are in the folder /var/log

 find /var/log/ |grep -e "\.gz$" 

The output is something similar to the following:

 $ find /var/log/ | grep -e "\.gz$"
 /var/log//mail.log.1.gz
 /var/log//mail.log.0.gz
 /var/log//system.log.3.gz
 /var/log//system.log.7.gz
 /var/log//system.log.6.gz
 /var/log//system.log.2.gz
 /var/log//system.log.5.gz
 /var/log//system.log.1.gz
 /var/log//system.log.0.gz
 /var/log//system.log.4.gz

The $ sign anchors the match to the end of the line, so only paths that end in .gz are listed.
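If you want to try this without touching /var/log, here is a rough sketch that builds a throwaway directory and compares the grep filter with find's own -name test. The directory layout and file names are invented for the example, and mktemp -d is assumed to be available (it is on most Linux and macOS systems).

```shell
# Throwaway directory with hypothetical file names, so nothing real is touched.
dir=$(mktemp -d)
mkdir "$dir/old"
touch "$dir/mail.log.0.gz" "$dir/system.log" "$dir/old/system.log.1.gz" "$dir/notes.gzip"

# Regex filter on the path list: \. is a literal dot, $ is end of line.
by_grep=$(find "$dir" -type f | grep -e '\.gz$' | sort)

# Same selection done by find itself with a shell-style glob.
by_name=$(find "$dir" -type f -name '*.gz' | sort)

printf '%s\n' "$by_grep"
rm -r "$dir"
```

Both variables should hold the same two paths; notes.gzip and system.log are skipped either way.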


5

You need to do a couple of things. It should look like this:

grep '.*\.zip$' 

You need to escape the second dot, so it will just match a dot, and not any character. Using single quotes makes the escaping a bit easier.

You need the dollar sign at the end of the line to indicate that you want the "zip" to occur at the end of the line.


5

I use this to get a listing of the file types inside a folder.

find . -type f | grep -E -o "\.\w*$" | sort -su 

Outputs for example:

.DS_Store
.MP3
.aif
.aiff
.asd
.doc
.flac
.jpg
.m4a
.m4p
.m4r
.mp3
.pdf
.png
.txt
.wav
.wma
.zip

BONUS: with

find . -type f | grep -E -o "\.\w*$" | sort | uniq -c 

You'll get the file count:

    106 .DS_Store
     35 .MP3
     89 .aif
      5 .aiff
    525 .asd
      1 .doc
     60 .flac
     48 .jpg
    149 .m4a
     11 .m4p
      1 .m4r
  12844 .mp3
      1 .pdf
      5 .png
      9 .txt
    108 .wav
     44 .wma
      2 .zip
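Note that .MP3 and .mp3 are counted separately above. If you would rather fold them together, one possible variation is to lowercase the extensions with tr before counting. This is only a sketch: the path list is a made-up stand-in for the find output, and the class [[:alnum:]_] is used in place of \w.

```shell
# Hypothetical path list standing in for `find . -type f` output.
paths='./a/song.mp3
./a/SONG2.MP3
./b/cover.jpg
./b/notes.txt
./c/track.Mp3'

# Keep only the trailing extension, lowercase it, then count duplicates.
counts=$(printf '%s\n' "$paths" \
  | grep -E -o '\.[[:alnum:]_]*$' \
  | tr '[:upper:]' '[:lower:]' \
  | sort | uniq -c | sort -rn)
printf '%s\n' "$counts"
```

With this input the three mp3 variants end up on a single line with a count of 3.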


4
grep -r pattern --include="*.txt" /path/to/dir/ 
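This one searches file contents rather than file names: -r walks the directory and --include restricts which files are opened. Here is a small sketch of that behaviour, using a temporary directory, invented file names, and the placeholder search word needle. It assumes a grep that supports --include (GNU grep does) and adds -l so that only file names are printed.

```shell
# Throwaway directory with hypothetical files.
dir=$(mktemp -d)
printf 'needle here\n' > "$dir/a.txt"
printf 'needle here too\n' > "$dir/b.log"
printf 'nothing\n' > "$dir/c.txt"

# -r recurses; --include limits the search to names matching the glob;
# -l prints only the names of files that contain a match.
found=$(grep -r -l --include='*.txt' 'needle' "$dir")
printf '%s\n' "$found"
rm -r "$dir"
```

b.log contains the word too, but it is never searched because its name does not match *.txt.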


2

Try: grep -o -E "(\\.([A-Za-z])+)+"

I used this to get multi-dotted/multiple extensions. So if the input was hello.tar.gz, then it would output .tar.gz. For single dotted, use grep -o -E "\\.([A-Za-z])+$". Tested on Cygwin/MingW+MSYS.
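A quick way to check both patterns against the hello.tar.gz example is sketched below. It uses single quotes, so one backslash is enough, and it writes the letter class as [A-Za-z], because in ASCII the range A-z also covers a few punctuation characters that sit between Z and a.

```shell
# -o prints only the matched text; -E enables extended regex syntax.
multi=$(printf '%s\n' 'hello.tar.gz' | grep -o -E '(\.[A-Za-z]+)+')
single=$(printf '%s\n' 'hello.tar.gz' | grep -o -E '\.[A-Za-z]+$')
printf '%s\n%s\n' "$multi" "$single"
```

The first pattern should print .tar.gz and the second .gz.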


2

One more fix/addon of the above example:

# multi-dotted/multiple extensions
grep -oEi "(\\.([a-z0-9])+)+" file.txt
# single dotted
grep -oEi "\\.([a-z0-9])+$" file.txt

This will get file extensions such as '.mp3'.


2

Just reviewing some of the other answers. The .* isn't necessary, and if you're looking for a certain file extension, it's best to include -i so that the match is case-insensitive, in case the file is HELLO.ZIP, for example. The quotes are necessary, though: without them the shell removes the backslash before grep sees the pattern, and the dot goes back to matching any character.

grep -i '\.zip$' 

1 Comment

This is the best answer in my opinion, since it uses the least amount of characters to get the desirable outcome, and its case-insensitive, which is important for a wildcard type of functionality.
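One caveat about leaving the quotes off: the shell processes an unquoted backslash itself, so grep never sees it. A small sketch of the difference follows, with printf standing in to show what grep would receive; the test names a.ZIP and xzip are made up.

```shell
# What grep actually receives in each case (printf just prints its argument).
unquoted=$(printf '%s' \.zip$)
quoted=$(printf '%s' '\.zip$')
printf 'unquoted: %s\nquoted:   %s\n' "$unquoted" "$quoted"

# With the backslash gone, "." matches any character, so "xzip" slips through.
loose=$(printf '%s\n' 'a.ZIP' 'xzip' | grep -i -c \.zip$)
strict=$(printf '%s\n' 'a.ZIP' 'xzip' | grep -i -c '\.zip$')
printf 'loose: %s, strict: %s\n' "$loose" "$strict"
```

The unquoted form matches both names, while the quoted form matches only a.ZIP, so keeping the single quotes costs two characters but preserves the literal dot.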
2

If you just want to search the current folder, why not use this simple command, without grep?

ls *.zip 


0

Simply do:

grep ".*.zip$" 

The "$" indicates the end of line

1 Comment

Note, this would include files such as hello.unzip or hi.xzip, or even hellozip. You should escape the second "."
