908

I have several hundred PDFs under a directory in UNIX. The names of the PDFs are really long (approx. 60 chars).

When I try to delete all PDFs together using the following command:

rm -f *.pdf 

I get the following error:

/bin/rm: cannot execute [Argument list too long] 

What is the solution to this error? Does this error occur for the mv and cp commands as well? If so, how can it be solved for those commands?

7
  • 29
    You might find this link useful Commented Jul 2, 2012 at 7:54
  • 2
    related: Solving “mv: Argument list too long”? Commented Apr 24, 2016 at 9:49
  • 1
    Also this can be relevant http://mywiki.wooledge.org/BashFAQ/095 Commented Jun 1, 2017 at 10:46
  • 5
    @jww: And I continued to think for so many years that bash falls under "software tools commonly used by programmers" -- a category whose questions can be asked here! Commented Jan 2, 2018 at 7:18
  • 1
    @jww: not "how to run a command".. but "how to run this particular command without getting the error I was getting"... no? Commented Jan 2, 2018 at 7:22

33 Answers

1269

This occurs because bash expands the asterisk to every matching file name, producing a command line too long for the kernel to accept.

Try this:

find . -name "*.pdf" -print0 | xargs -0 rm 

Warning: this is a recursive search and will find (and delete) files in subdirectories as well. Tack on -f to the rm command only if you are sure you don't want confirmation.

You can do the following to make the command non-recursive:

find . -maxdepth 1 -name "*.pdf" -print0 | xargs -0 rm 

Another option is to use find's -delete flag:

find . -name "*.pdf" -delete 

3 Comments

find has a -delete flag to delete the files it finds, and even if it didn't, it would still be better practice to use -exec to run rm rather than invoking xargs (which is now three processes and a pipe, instead of a single process with -delete or two processes with -exec).
Only the last part of the answer is actually useful. The pipe-to-xargs suffers from basically the same problem as using globs with rm. But - that's not part of the original answer. It's not appropriate to edit a valid answer into an invalid one. This is a problem IMHO.
... and the correct part was actually grafted onto this answer from a different answer which got it right. So please at least upvote that one instead.
596

tl;dr

It's a kernel limitation on the size of the command-line argument list. Use a for loop instead.

Origin of problem

This is a system issue, related to execve and the ARG_MAX constant. There is plenty of documentation about it (see man execve, Debian's wiki, ARG_MAX details).

Basically, the expansion produces a command (with its parameters) that exceeds the ARG_MAX limit. Up to kernel 2.6.23, the limit was set at 128 kB. This constant has since been increased, and you can get its value by executing:

getconf ARG_MAX # 2097152 # on 3.5.0-40-generic 
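As a rough sketch of how to check whether a given glob would exceed that limit (it ignores environment variables and per-argument pointer overhead, so treat it as an estimate), count the bytes the expansion would occupy using the printf builtin:

printf '%s\0' *.pdf | wc -c    # approximate bytes the expanded arguments would occupy
getconf ARG_MAX                # kernel limit to compare against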

Solution: Using for Loop

Use a for loop, as recommended on BashFAQ/095; there is no limit other than available RAM:

Dry run to ascertain it will delete what you expect:

for f in *.pdf; do echo rm "$f"; done 

And execute it:

for f in *.pdf; do rm "$f"; done 

This is also a portable approach, as globbing has strong and consistent behavior across shells (it is part of the POSIX spec).

Note: as noted in several comments, this is indeed slower but more maintainable, as it adapts to more complex scenarios, e.g. where one wants to do more than just one action.

Solution: Using find

If you insist, you can use find but really don't use xargs as it "is dangerous (broken, exploitable, etc.) when reading non-NUL-delimited input":

find . -maxdepth 1 -name '*.pdf' -delete 

Using -maxdepth 1 ... -delete instead of -exec rm {} + allows find to simply execute the required system calls itself without using an external process, hence it is faster (thanks to @chepner's comment).
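If your find lacks -delete (it is not required by POSIX), a sketch of a close alternative is the + form of -exec, which batches as many file names as fit into each rm invocation (note that -maxdepth is also an extension, so this assumes GNU or BSD find):

find . -maxdepth 1 -name '*.pdf' -exec rm -- {} +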


7 Comments

The find -exec solution seems to be MUCH faster than the for loop.
The for loop is painfully slow. I tried this on a directory with 100,000+ files in it, and 30 seconds later it had only deleted 12,000 or so. I tried the find version and it was done in half a second.
Five years later at 4.15.0 (4.15.0-1019-gcp to be exact) and the limit is still at 2097152. Interestingly enough, searching for ARG_MAX on the linux git repo gives a result showing ARG_MAX to be at 131702.
If you're ever worried about making sure that you wrote your arguments right and the correct files will be deleted, you can replace the find command with find . -maxdepth 1 -name '*.pdf' -print. Which will show you the file list. Then if it looks fine, replace the print back with the -delete.
Works for multiple globs too: for f in *.pdf my_dir/*.pdf; ...
216

find has a -delete action:

find . -maxdepth 1 -name '*.pdf' -delete 

7 Comments

This would still return "Argument list too long". At least for me it does. Using xargs, as per Dennis' answer, works as intended.
Strange. Are you sure you did not skip the quotes?
That sounds like a bug in find.
@Sergio had same issue, it was caused by the missing quotes around name pattern.
Argh, why does a tool for finding stuff even have a switch for deleting? Is it really just me who finds it unnecessary, to say the least, and also dangerous?
find can operate on names found in several ways, just have a look at its man page
@mathreadler It addresses the fact that a common use case for -exec is to remove a bunch of files. -exec rm {} + would do the same thing, but still requires starting at least one external process. -delete allows find to simply execute the required system calls itself without using an external wrapper.
28

For someone who doesn't have time: run the following command in a terminal.

ulimit -S -s unlimited 

Then perform the cp/mv/rm operation.
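A quick before/after check, as a sketch assuming a Linux system where getconf is available and ARG_MAX is derived from the stack limit:

getconf ARG_MAX            # limit before
ulimit -S -s unlimited
getconf ARG_MAX            # the reported limit should now be larger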

7 Comments

This doesn't work because the OP hit the /bin/rm tool's argv size limit.
In fact /bin/rm has no limit on argv, the limit is on the exec system call as explained by Édouard Lopez
Worked well for me for the error message -bash: /bin/rm: Argument list too long
If you try to add more arguments, you will just hit the next limit.
Thanks, worked perfectly for rsync for me for 168k files. Just be aware if you have literally billions of files, it could exhaust your RAM, then start swapping, and then the computer might halt. But you can check with "ls | wc", and if it is only like a million, it is no problem, if you have gigabytes of RAM.
It just increases the max length of the command line. On my system it goes up to 6291456, so if you exceed that you will hit the limit: you do not need a billion files, it will happen much sooner.
Changing the stack limit with ulimit -s does in fact have an effect on ARG_MAX; I don't know why. On my system, on kernel 6.16.9, it increases to 6291456, up from the standard 2505728, so it can accommodate a command line more than double the standard size. So this does not solve the problem, it just moves it.
27

If you’re trying to delete a very large number of files at one time (I deleted a directory with 485,000+ today), you will probably run into this error:

/bin/rm: Argument list too long. 

The problem is that when you type something like rm -rf *, the * is replaced with a list of every matching file, like “rm -rf file1 file2 file3 file4” and so on. There is a relatively small buffer of memory allocated to storing this list of arguments and if it is filled up, the shell will not execute the program.

To get around this problem, a lot of people will use the find command to find every file and pass them one-by-one to the “rm” command like this:

find . -type f -exec rm -v {} \; 

My problem is that I needed to delete 500,000 files and it was taking way too long.

I stumbled upon a much faster way of deleting files – the “find” command has a “-delete” flag built right in! Here’s what I ended up using:

find . -type f -delete 

Using this method, I was deleting files at a rate of about 2000 files/second – much faster!

You can also show the filenames as you’re deleting them:

find . -type f -print -delete 

…or even show how many files will be deleted, then time how long it takes to delete them:

root@devel# ls -1 | wc -l && time find . -type f -delete
100000

real    0m3.660s
user    0m0.036s
sys     0m0.552s

3 Comments

Thanks. I did sudo find . -type f -delete to delete about 485 thousand files and it worked for me. Took about 20 seconds.
I had to delete 7.4 million files. Oh lord, this is the answer that makes the most sense. I love you..
Deleting 63k files in shared/storage/framework/sessions took 1.3s on my Digital Ocean server. Thanks.
25

Another approach is to force xargs to process the command in batches. For instance, to delete the files 100 at a time, cd into the directory and run this:

echo *.pdf | xargs -n 100 rm
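If the file names may contain spaces, quotes or newlines, a safer sketch of the same batching idea uses the printf builtin with NUL-delimited output instead of echo:

printf '%s\0' *.pdf | xargs -0 -n 100 rm --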

7 Comments

For a delete command on Linux, which can be a disaster if you are an engineer and make a typo, I believe this "safe, and I know what's going on" approach is the best one. Not fancy stuff where mistyping a dot can bring your company down in a minute.
If you want it to be safe, you'll need to use null-terminated file names, see my comment below
How can we make this the default expansion for certain commands? There's a good many "standard" linux commands where it's known if they need them all at once or not (like "rm")
Note that this only works where echo is a shell builtin. If you end up using the command echo, you'll still run into the program arguments limit.
There is also still the inherent problem that echo could produce something else than the literal file names. If a file name contains a newline, it will look to xargs like two separate file names, and you get rm: firsthalfbeforenewline: No such file or directory. On some platform, file names which contain single quotes will also confuse xargs with the default options. (And the -n 100 is probably way too low; just omit the option to let xargs figure out the optimal number of processes it needs.)
I chose -n 100 as a conservatively low number because I don't think there is any reason to increase n to the point of being "optimal" here.
15

Or you can try:

find . -name '*.pdf' -exec rm -f {} \; 

4 Comments

This deletes files from subdirectories as well. How to prevent that ?
@NikunjChauhan Add -maxdepth option: find . -maxdepth 1 -name '*.pdf' -exec rm -f {} \;
I am not able to insert the maxdepth option
That option may be a Linux-only option, as per @Dennis's answer, above (the selected answer).
14

you can try this:

for f in *.pdf
do
   rm "$f"
done

EDIT: ThiefMaster's comment suggested that I not disclose such dangerous practices to young shell jedis, so I'll add a "safer" version (for the sake of preserving things when someone has a "-rf . ..pdf" file):

echo "# Whooooo" > /tmp/dummy.sh for f in '*.pdf' do echo "rm -i \"$f\"" done >> /tmp/dummy.sh 

After running the above, just open the /tmp/dummy.sh file in your favorite editor and check every single line for dangerous filenames, commenting them out if found.

Then copy the dummy.sh script in your working dir and run it.

All this for security reasons.

5 Comments

I think this would do really nice things with a file named e.g. -rf .. .pdf
Yes it would, but generally when used in a shell, the issuer of the command "should" take a look at what he's doing :). Actually I prefer to redirect to a file and then inspect every single row.
These commands are intended for directories with hundreds of thousands of files. Some of those can have unintended names, usually by mistake. How can you look at them all?
This doesn't quote "$f". That's what ThiefMaster was talking about. -rf takes precedence over -i, so your 2nd version is no better (without manual inspection). And is basically useless for mass delete, because of prompting for every file.
I added the missing quotes, but this still has issues. Manually inspecting thousands of files is error-prone and unnecessary.
12

I'm surprised there are no ulimit answers here. Every time I have this problem I end up here or here. I understand this solution has limitations but ulimit -s 65536 seems to often do the trick for me.

1 Comment

It just moves the limit a bit further along; see my other comment about ulimit.
10

If the filenames contain spaces or special characters, use:

find -name "*.pdf" -delete 

For files in current directory only:

find -maxdepth 1 -name '*.pdf' -delete 

This command searches all files in the current directory only (-maxdepth 1) with the extension pdf (-name '*.pdf'), and then deletes them.

3 Comments

While this code snippet may solve the question, including an explanation of how and why this solves the problem would really help to improve the quality of your post. Remember that you are answering the question for readers in the future, not just the person asking now! Please edit your answer to add explanation, and give an indication of what limitations and assumptions apply.
The whole point of -exec is that you don't invoke a shell. The quotes here do absolutely nothing useful. (They prevent any wildcard expansion and token splitting on the string in the shell where you type in this command, but the string {} doesn't contain any whitespace or shell wildcard characters.)
using find, but still too long
8

You could use a bash array:

files=(*.pdf)

for ((I=0; I<${#files[@]}; I+=1000)); do
    rm -f "${files[@]:I:1000}"
done

This way it will erase in batches of 1000 files per step.

2 Comments

For a large number of files this seems significantly faster
This just reinvents xargs, rather poorly.
6

You can use this command:

find -name "*.pdf" -delete 

Comments

6

What about a shorter and more reliable one?

for i in **/*.pdf; do rm "$i"; done 

1 Comment

Excellent approach
5

The rm command has a limitation on the number of files which you can remove simultaneously.

One possibility is to run the rm command multiple times, based on your file name patterns, like:

rm -f A*.pdf
rm -f B*.pdf
rm -f C*.pdf
...
rm -f *.pdf

You can also remove them through the find command:

find . -name "*.pdf" -exec rm {} \; 

4 Comments

No, rm has no such limit on the number of files it will process (other than that its argc cannot be larger than INT_MAX). It's the kernel's limitation on the maximum size of the entire argument array (that's why the length of the filenames is significant).
Almost right. But it's ARG_MAX (a kernel variable) rather than INT_MAX (a processor-related variable).
I suspect you didn't read my comment properly. ARG_MAX is the most arguments the kernel is willing to supply; the rm program would be quite happy to accept as many as INT_MAX arguments (e.g. when executed on a platform without such limitation).
This is a very nice solution. It doesn't change the original script by much and still makes possible to delete lots of files at once. For me it worked just fine.
4

Argument list too long

The question title asks about cp, mv and rm, but the answers are mostly about rm.

Un*x commands

Read the command's man page carefully!

For cp and mv, there is a -t switch, for target:

find . -type f -name '*.pdf' -exec cp -ait "/path to target" {} + 

and

find . -type f -name '*.pdf' -exec mv -t "/path to target" {} + 

Script way

There is a general workaround used in scripts:

#!/bin/bash

folder=( "/path to folder" "/path to another folder" )

if [ "$1" != "--run" ] ;then
    exec find "${folder[@]}" -type f -name '*.pdf' -exec $0 --run {} +
    exit 0;
fi

shift

for file ;do
    printf "Doing something with '%s'.\n" "$file"
done

Comments

3

Also try this: if you want to delete files/folders older than 30/90 days (+) or newer than 30/90 days (-), you can use the example commands below.

Ex: with +90, it deletes files/folders older than 90 days, i.e. from day 91, 92, ... 100 and beyond:

find <path> -type f -mtime +90 -exec rm -rf {} \; 

Ex: to delete only files from the last 30 days, use the command below (-):

find <path> -type f -mtime -30 -exec rm -rf {} \; 

If you want to gzip files older than 2 days:

find <path> -type f -mtime +2 -exec gzip {} \; 

If you want to list only the files/folders from the past month. Ex:

find <path> -type f -mtime -30 -exec ls -lrt {} \; 

To list only files/folders older than 30 days. Ex:

find <path> -type f -mtime +30 -exec ls -lrt {} \;
find /opt/app/logs -type f -mtime +30 -exec ls -lrt {} \;

Comments

3

And another one:

cd /path/to/pdf
printf "%s\0" *.[Pp][Dd][Ff] | xargs -0 rm

printf is a shell builtin, and as far as I know it always has been. Since printf is not an external command (but a builtin), it's not subject to the "argument list too long ..." fatal error.

So we can safely use it with shell globbing patterns such as *.[Pp][Dd][Ff], then pipe its output to the remove (rm) command through xargs, which makes sure it fits only as many file names on each command line as will not make rm, which is an external command, fail.

The \0 in printf serves as a null separator for the file names, which are then processed by the xargs command, using it (-0) as a separator, so rm does not fail when there are spaces or other special characters in the file names.

2 Comments

While this code snippet may solve the question, including an explanation of how and why this solves the problem would really help to improve the quality of your post. Remember that you are answering the question for readers in the future, not just the person asking now! Please edit your answer to add explanation, and give an indication of what limitations and assumptions apply.
In particular, if printf isn't a shell builtin, it will be subject to the same limitation.
2

If you want to remove both files and directories, you can use something like:

echo /path/* | xargs rm -rf 

Comments

2

I was facing the same problem while copying from a source directory to a destination directory.

The source directory had about 3 lakh (~300,000) files.

I used cp with the -r option and it worked for me:

cp -r abc/ def/ 

It will copy all files from abc to def without giving the "Argument list too long" error.

2 Comments

I don't know why someone downvoted this, without even commenting on that (that's policy, folks!). I needed to delete all files inside a folder (the question is not particular about PDFs, mind you), and for that, this trick is working well; all one has to do in the end is to recreate the folder that got deleted along the way when I used "rm -R /path/to/folder".
It works because in the OP's case he was using *, which expanded to a huge list of .pdf files; giving a directory causes this to be handled internally, so the OP's issue doesn't arise. I think it was downvoted for that reason. It might not be usable for the OP if he has nested directories or other (non-pdf) files in his directory.
2

To delete all *.pdf in a directory /path/to/dir_with_pdf_files/

mkdir empty_dir    # Create temp empty dir
rsync -avh --delete --include '*.pdf' empty_dir/ /path/to/dir_with_pdf_files/

Deleting specific files via rsync using a wildcard is probably the fastest solution in case you have millions of files. And it will take care of the error you're getting.


(Optional step) DRY RUN: check what will be deleted without actually deleting anything.

rsync -avhn --delete --include '*.pdf' empty_dir/ /path/to/dir_with_pdf_files/ 

Click rsync tips and tricks for more rsync hacks

1 Comment

Fast and clever solution, worked really well!
1

I had the same problem with a folder full of temporary images that was growing day by day and this command helped me to clear the folder

find . -name "*.png" -mtime +50 -exec rm {} \; 

The difference from the other commands is the -mtime parameter, which selects only the files older than X days (50 days in the example).

Using it multiple times, decreasing the day range on each execution, I was able to remove all the unnecessary files.

Comments

1

I found that for extremely large lists of files (>1e6), these answers were too slow. Here is a solution using parallel processing in python. I know, I know, this isn't linux... but nothing else here worked.

(This saved me hours)

# delete files
import os as os
import glob
import multiprocessing as mp

directory = r'your/directory'
os.chdir(directory)
files_names = [i for i in glob.glob('*.{}'.format('pdf'))]

# report errors from pool
def callback_error(result):
    print('error', result)

# delete file using system command
def delete_files(file_name):
    os.system('rm -rf ' + file_name)

pool = mp.Pool(12)
# or use pool = mp.Pool(mp.cpu_count())

if __name__ == '__main__':
    for file_name in files_names:
        print(file_name)
        pool.apply_async(delete_files, [file_name], error_callback=callback_error)

Comments

1

You can create a temp folder, move all the files and sub-folders you want to keep into the temp folder, then delete the old folder and rename the temp folder to the old folder's name. Try this example until you are confident to do it live:

mkdir testit
cd testit
mkdir big_folder tmp_folder
touch big_folder/file1.pdf
touch big_folder/file2.pdf
mv big_folder/file1.pdf tmp_folder/
rm -r big_folder
mv tmp_folder big_folder

The rm -r big_folder will remove all files in big_folder, no matter how many. You just have to be super careful that you first moved out all the files/folders you want to keep; in this case it was file1.pdf.

1 Comment

I actually found this page after running into file limitations for mv so I'm not confident this solves the problem.
0

I only know a way around this. The idea is to export the list of pdf files you have into a file, then split that file into several parts, and then remove the pdf files listed in each part.

ls | grep .pdf > list.txt
wc -l list.txt

wc -l counts how many lines list.txt contains. When you have an idea of how long it is, you can decide to split it in half, in quarters, or so, using the split -l command. For example, split it into files of 600 lines each:

split -l 600 list.txt 

This will create a few files named xaa, xab, xac and so on, depending on how you split it. Now, to "import" each list in those files into the rm command, use this:

rm $(<xaa)
rm $(<xab)
rm $(<xac)
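If you have GNU xargs (an assumption; -a and -d are GNU extensions), a sketch that skips the manual splitting entirely, provided none of the names contain newlines:

xargs -d '\n' -a list.txt rm --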

Sorry for my bad english.

3 Comments

If you have a file named pdf_format_sucks.docx this will be deleted as well... ;-) You should use proper and accurate regular expression when grepping for the pdf files.
Better, but still_pdf_format_sucks.docx will get deleted. The dot . in ".pdf" regular expression matches any character. I would suggest "[.]pdf$" instead of .pdf.
But this still fails on files with irregular names, partly because of the lack of quoting. If you really feel a need to reimplement xargs, you need to understand all the corner cases it handles.
0

I ran into this problem a few times. Many of the solutions will run the rm command for each individual file that needs to be deleted. This is very inefficient:

find . -name "*.pdf" -print0 | xargs -0 rm -rf 

I ended up writing a python script to delete the files based on the first 4 characters in the file-name:

import os

filedir = '/tmp/' #The directory you wish to run rm on
filelist = (os.listdir(filedir)) #gets listing of all files in the specified dir
newlist = [] #Makes a blank list named newlist

for i in filelist:
    if str((i)[:4]) not in newlist: #This makes sure that the elements are unique for newlist
        newlist.append((i)[:4]) #This takes only the first 4 characters of the folder/filename and appends it to newlist

for i in newlist:
    if 'tmp' in i: #If statement to look for tmp in the filename/dirname
        print ('Running command rm -rf '+str(filedir)+str(i)+'* : File Count: '+str(len(os.listdir(filedir)))) #Prints the command to be run and a total file count
        os.system('rm -rf '+str(filedir)+str(i)+'*') #Actual shell command

print ('DONE')

This worked very well for me. I was able to clear out over 2 million temp files in a folder in about 15 minutes. I commented the tar out of the little bit of code so anyone with minimal to no python knowledge can manipulate this code.

2 Comments

What's the point of a Python script if you end up shelling out with os.system() anyway? You want os.unlink() instead; then you don't have to solve the quoting problems which this fails to solve properly. But the only reason this is more efficient than find is that it doesn't recurse into subdirectories; you can do the same with printf '%s\0' /tmp/* | xargs -r0 rm -rf or of course by adding a -maxdepth 1 option to the find command.
(Actually find will probably be quicker because the shell will alphabetize the list of files matched by the wildcard, which could be a significant amount of work when there are a lot of matches.)
0

I faced a similar problem when there were millions of useless log files created by an application which filled up all the inodes. I resorted to locate, got all the files "located" into a text file, and then removed them one by one. It took a while but did the job!

1 Comment

This is pretty vague and requires you to have installed locate back when you still had room on your disk.
0

I solved it with for.

I am on macOS with zsh

I moved thousands of jpg files with mv in a one-line command.

Be sure there are no spaces or special characters in the names of the files you are trying to move:
for i in $(find ~/old -type f -name "*.jpg"); do mv $i ~/new; done 
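If the names might contain spaces after all, a sketch that avoids word-splitting (at the cost of running one mv per file) is to let find invoke mv itself:

find ~/old -type f -name '*.jpg' -exec mv {} ~/new/ \;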

Comments

0

If you want to pass the glob pattern to a bash script, pass it as a string

script.sh "**/*.pdf" 

and enable glob patterns in the script with shopt -s extglob globstar:

#!/bin/bash
shopt -s extglob globstar

for file in $1; do
    ...
done

Comments

0

This should be relevant: I found that I am getting "Argument list too long" for something like this:

find /some/very/long/path/my-directory -type f | wc -l 

But NOT "Argument list too long" if I first cd into directory:

cd /some/very/long/path; find my-directory -type f | wc -l 

Because the total output is shorter when the printed path is shorter.

/some/very/long/path/my-directory/file.txt vs my-directory/file.txt 

Comments

-2

A bit safer version than using xargs, and also not recursive:

ls -p | grep -v '/$' | grep '\.pdf$' | while read file; do rm "$file"; done

Filtering out directories here is a bit unnecessary, as rm won't delete them anyway, and it can be removed for simplicity, but why run something that will definitely return an error?

2 Comments

This is not safe at all, and does not work with file names with newlines in them, to point out one obvious corner case. Parsing ls is a common antipattern which should definitely be avoided, and adds a number of additional bugs here. The grep | grep is just not very elegant.
Anyway, it's not like this is a new and exotic problem which requires a complex solution. The answers with find are good, and well-documented here and elsewhere. See e.g. the mywiki.wooledge.org for much more on this and related topics.
