8

I have a directory with 100,000 files and I need to iterate over all of them to read a value. Right now I use listFiles() to load all the files into an array and then iterate over them one by one. Is there a memory-efficient way to do this without loading everything into an array?

    File[] tFiles = new File(Dir).listFiles();
    for (final File tFile : tFiles) {
        // Process files one by one
    }
1
  • An answer I gave earlier might help. You would have to change some functionality, but using streams might be more efficient. I'm not sure about the performance, though. Commented Nov 6, 2015 at 16:15

3 Answers

9

Since Java 7, you can use the file visitor pattern to visit the contents of a directory recursively.

The documentation for the FileVisitor interface is here.

This allows you to iterate over files without creating a large array of File objects.

Simple example to print out your file names:

    Path start = Paths.get(new URI("file:///my/folder/"));
    Files.walkFileTree(start, new SimpleFileVisitor<Path>() {
        @Override
        public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                throws IOException {
            System.out.println(file);
            return FileVisitResult.CONTINUE;
        }

        @Override
        public FileVisitResult postVisitDirectory(Path dir, IOException e)
                throws IOException {
            if (e == null) {
                System.out.println(dir);
                return FileVisitResult.CONTINUE;
            } else {
                // directory iteration failed
                throw e;
            }
        }
    });
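Since the question is about a single flat directory rather than a whole tree, the same visitor approach can be limited to the direct children of the start directory. The following is only a minimal sketch under that assumption: the class name FlatDirectoryWalk and the method countRegularFiles are made up for illustration, and counting stands in for whatever per-file processing you actually need. It uses the walkFileTree overload that takes a maxDepth argument. At that depth, subdirectories are also passed to visitFile, so the attributes are checked.

```java
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.EnumSet;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper class, not part of the original answer.
public class FlatDirectoryWalk {

    // Visits only the direct children of dir (maxDepth = 1) and
    // returns how many of them are regular files.
    static long countRegularFiles(Path dir) throws IOException {
        final AtomicLong count = new AtomicLong();
        Files.walkFileTree(dir, EnumSet.noneOf(FileVisitOption.class), 1,
                new SimpleFileVisitor<Path>() {
                    @Override
                    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                        // Subdirectories at maxDepth also arrive here, so filter them out.
                        if (attrs.isRegularFile()) {
                            count.incrementAndGet(); // replace with real per-file processing
                        }
                        return FileVisitResult.CONTINUE;
                    }
                });
        return count.get();
    }

    public static void main(String[] args) throws IOException {
        // Uses the current directory when no argument is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(countRegularFiles(dir));
    }
}
```

The entries are handed to the visitor one at a time, so no File[] holding the whole directory is built by this code.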

1 Comment

Thanks! this was exactly what I was looking for :)
3

Java 8 lazily loaded stream version:

    Files.list(new File("path to directory").toPath()).forEach(path -> {
        File file = path.toFile();
        // process your file
    });

1 Comment

This will leak file descriptors. Per the API documentation, you need to close the stream afterwards. Better to use try (Stream<Path> files = Files.list(Paths.get("path to dir"))) { files.forEach(...); }
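To make the suggestion in this comment concrete, here is a small sketch of the try-with-resources form. The class name ClosedDirectoryListing and the method forEachRegularFile are hypothetical names chosen for illustration, and filtering to regular files is an added assumption about what "process your file" should cover.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.Consumer;
import java.util.stream.Stream;

// Hypothetical helper class, not part of the original answer or comment.
public class ClosedDirectoryListing {

    // Lazily lists the entries of dir and passes each regular file to action.
    // The try-with-resources block closes the stream, and with it the
    // underlying directory handle, even if action throws.
    static void forEachRegularFile(Path dir, Consumer<Path> action) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            entries.filter(Files::isRegularFile).forEach(action);
        }
    }

    public static void main(String[] args) throws IOException {
        // Uses the current directory when no argument is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        forEachRegularFile(dir, System.out::println);
    }
}
```

Files.list is not recursive, which matches the single-directory case in the question.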
2

If you want to avoid the excessive boilerplate that comes with JDK's FileVisitor, you can use Guava. Files.fileTreeTraverser() gives you a TreeTraverser<File> which you can use for traversing the files in the folder (or even sub-folders):

    for (File f : Files.fileTreeTraverser()
            .preOrderTraversal(new File("/parent/folder"))) {
        // do something with each file
    }

4 Comments

This internally calls Collections.unmodifiableList(Arrays.asList(files));, so I don't think it is any better than the code from the question itself.
@jan, it depends on what you mean by "better". I like Guava's TreeTraverser because it is a really powerful abstraction that lets you do your thing succinctly and readably, leaving less room for bugs. Yes, it might not be the most performant solution, but in most cases this is probably not the bottleneck of the application. This may hold true even in the OP's case with 100k files. I would first use the simplest possible solution and only optimize for performance if that solution is not good enough.
In other circumstances I would totally agree, but as an answer or comment on a question that asks for an efficient solution, I cannot agree. And there might be even more than 100k files.
I agree with jan - this is about how to make it scalable in terms of number of files. Your solution is the same as listFiles() under the hood.
