I have the following code:
    if (!this.writeDataStore.Exists(mat))
    {
        BlockingCollection<ImageFile> imageFiles = new BlockingCollection<ImageFile>();
        Parallel.ForEach(fileGrouping, fi => DecompressAndReadGzFile(fi, imageFiles));
        this.PushIntoDb(mat, imageFiles.ToList());
    }

DecompressAndReadGzFile is a static method in the same class as the method containing this code. As the name suggests, I am decompressing and reading .gz files, lots of them (up to 1000), so the overhead of parallelisation should be worth the benefits.

However, I'm not seeing the benefits. When I profile with ANTS Performance Profiler, the timings are exactly the same as if no parallelisation were occurring. I also checked the CPU cores with Process Explorer: there may be work being done on two cores, but one core seems to be doing most of it.

What am I not understanding about getting Parallel.ForEach to decompress and read files in parallel?
UPDATED QUESTION: What is the fastest way to read information from a list of files?
The Problem (simplified):
- There is a large list of .gz files (1200).
- Each file has a line containing "DATA: "; its position and line number are not fixed and vary from file to file.
- We need to retrieve the first number after "DATA: " (just for simplicity's sake) and store it in an object in memory (e.g. a List).
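To make the simplified problem concrete, here is a minimal sketch of the three requirements above. It is not my actual DecompressAndReadGzFile: the names GzDataReader, ReadFirstDataValue and ReadAll are hypothetical, and it assumes the number after "DATA: " is a plain integer or decimal on the same line. Each file is streamed line by line and reading stops at the first match.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class GzDataReader
{
    // Assumption: the value is an integer or decimal directly after "DATA: ".
    private static readonly Regex DataPattern =
        new Regex(@"DATA: \s*(-?\d+(?:\.\d+)?)", RegexOptions.Compiled);

    // Decompresses one .gz file as a stream and returns the first number
    // found after "DATA: ", or null if no line contains one.
    public static double? ReadFirstDataValue(string path)
    {
        using (FileStream file = File.OpenRead(path))
        using (var gzip = new GZipStream(file, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                Match match = DataPattern.Match(line);
                if (match.Success)
                {
                    // Stop at the first match; the rest of the file is not read.
                    return double.Parse(match.Groups[1].Value, CultureInfo.InvariantCulture);
                }
            }
        }
        return null;
    }

    // Reads all files in parallel and collects the values into a List.
    // The order of the results is not guaranteed to match the input order.
    public static List<double> ReadAll(IEnumerable<string> paths)
    {
        var results = new ConcurrentBag<double>();
        Parallel.ForEach(paths, path =>
        {
            double? value = ReadFirstDataValue(path);
            if (value.HasValue)
            {
                results.Add(value.Value);
            }
        });
        return results.ToList();
    }
}
```

Usage would be something like `List<double> values = GzDataReader.ReadAll(Directory.EnumerateFiles(folder, "*.gz"));`, where `folder` is a placeholder for wherever the files live. Whether this actually spreads across cores presumably depends on whether the work is limited by the disk or by decompression.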
In the initial question, I was using a Parallel.ForEach loop, but the work never seemed to be CPU-bound on more than one core.
DecompressAndReadGzFile?