
I am trying to delete several files within a directory.

So far I have that code:

    for filename in glob.glob("buffer*"):
        os.remove(filename)
    for filename in glob.glob("grid*"):
        os.remove(filename)
    for filename in glob.glob("OSMroads*"):
        os.remove(filename)
    for filename in glob.glob("newCostSurface*"):
        os.remove(filename)
    for filename in glob.glob("standsLine*"):
        os.remove(filename)
    for filename in glob.glob("standsReprojected*"):
        os.remove(filename)

Is there a way to do this more efficiently?

  • What files are you trying to delete? Is that the easiest way to group them? What are their filenames? We need more information. Commented Jan 3, 2014 at 23:34
  • What do you mean by "more efficient"? More efficient for humans to read/update/maintain your code? More efficient for the computer to run? Or something different? Commented Jan 3, 2014 at 23:41
  • I mean both: More efficient to read and more efficient so my code executes faster. Commented Jan 3, 2014 at 23:43

2 Answers


I like using lists so I don't repeat code, like this:

    for pattern in ['buffer*', 'grid*', 'OSMroads*',
                    'newCostSurface*', 'standsLine*', 'standsReprojected*']:
        for filename in glob.glob(pattern):
            os.remove(filename)

(Note the comma between 'standsLine*' and 'standsReprojected*' — without it, Python silently concatenates the two adjacent string literals into one pattern that matches nothing.)



Doing 6 separate glob calls will of course iterate the directory object 6 times.

Fortunately, on almost any platform, it'll probably end up being cached after the first time. Unless your directory is absolutely gigantic, this won't be a noticeable problem.

But since you explicitly asked about efficiency, you can obviously iterate once and filter the results. The easiest way to do this is with fnmatch. All that glob is doing is calling listdir and then fnmatch on each result; you can do the same thing with multiple fnmatch calls:

    for filename in os.listdir('.'):
        if fnmatch.fnmatch(filename, 'buffer*'):
            os.remove(filename)
        # etc.

And of course you can simplify this in exactly the same way partofthething simplified your existing code:

    for filename in os.listdir('.'):
        for pattern in ['buffer*', 'grid*', 'OSMroads*',
                        'newCostSurface*', 'standsLine*', 'standsReprojected*']:
            if fnmatch.fnmatch(filename, pattern):
                os.remove(filename)
                break  # already removed; don't try again on a second match

Or:

    for filename in os.listdir('.'):
        if any(fnmatch.fnmatch(filename, pattern)
               for pattern in ['buffer*', 'grid*', 'OSMroads*',
                               'newCostSurface*', 'standsLine*', 'standsReprojected*']):
            os.remove(filename)

If you really need to squeeze out another tiny fraction of a percent of performance, you can use fnmatch.translate to convert each pattern to a regexp, merge the regexps into a single alternation, compile it, and then apply that compiled regexp object to each filename. But the CPU time spent in fnmatch is probably so small compared to the I/O time for reading the directory that the improvement wouldn't even be measurable.
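A minimal sketch of that translate-and-merge idea (the filenames here are made up for illustration):

```python
import fnmatch
import re

patterns = ['buffer*', 'grid*', 'OSMroads*',
            'newCostSurface*', 'standsLine*', 'standsReprojected*']

# Translate each glob pattern to a regex and join them into one alternation,
# so every filename is tested with a single compiled-regex match.
combined = re.compile('|'.join(fnmatch.translate(p) for p in patterns))

filenames = ['buffer1.shp', 'grid_a.tif', 'keep_me.txt', 'standsLine.shp']
matches = [f for f in filenames if combined.match(f)]
# matches -> ['buffer1.shp', 'grid_a.tif', 'standsLine.shp']
```

In real code you'd iterate os.listdir('.') instead of the hard-coded list and call os.remove on each match.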


Very nice answer. I'm a lazy typist myself, I'd use "buffer grid OSMroads etc".split() instead of the list, and pattern+'*' when matching. Is it pythonic or just too lazy?
@ManuelGutierrez: Well, if that's too lazy, then I'd say you're in good company. Notice that stdlib types like namedtuple and enum effectively have that split() built-in, because everyone in the namedtuple discussions (including most of the core devs) was writing namedtuple('Point', 'x y z'.split()), and someone suggested that was worth making even shorter.
