I am processing a large number of 4K images by calculating a parameter on small (64×64 pixel) patches of each image. The task is currently carried out sequentially, one patch at a time. A snippet of my code is shown below to illustrate the idea.

    for (int i = 0; i < imageW / pSize; i++) {
        for (int j = 0; j < imageH / pSize; j++) {
            thisPatch = MatrixUtil.getSubMatrixAsMatrix(image, i * pSize, j * pSize, pSize);
            results[i][j] = computeParamForPatch(thisPatch);
        }
    }

I now need to parallelize this to save time. As you can see, the processing of each patch is completely independent of all the others. To do so, I think I either need to remember the location of each patch using a Map, or use forEachOrdered(). Unfortunately, I don't think an approach based on a map, something like Map<Point, double[][]>, would be parallelized. So this is my question: apart from using forEachOrdered(), which hurts performance, is there any other way to process an image in parallel?


One solution: I tried the following code (suggested by @DHa), which gives a significant improvement:

    int outputW = imageW / pSize;
    int outputH = imageH / pSize;
    IntStream.range(0, outputW * outputH).parallel().forEach(i -> {
        int x = i % outputW;
        int y = i / outputW; // divide by the row width (outputW), not outputH
        tDirectionalities[x][y] = computeDirectionalityForPatch(
                MatrixUtil.computeParamForPatch(image, x * pSize, y * pSize, pSize));
    });

Results:

  • Sequential: 15754 ms
  • Parallel: 5899 ms
  • ExecutorService Commented Apr 29, 2018 at 15:42
  • Why parallelize this? It seems like it'd be a lot easier to process the images in parallel instead of trying to process each image in parallel while still processing the images sequentially. Commented Apr 29, 2018 at 15:43
  • @AndrewHenle because then IO would be even more of a bottleneck. Commented Apr 29, 2018 at 16:28
  • A 4K image is heavy on memory so processing them one at a time might clearly be beneficial. Commented Apr 29, 2018 at 16:59
  • @DHa how are some 20 MB "heavy on memory"? Commented Apr 29, 2018 at 17:01

1 Answer

This solution uses a parallel stream.

See also "How many threads are spawned in parallelStream in Java 8" for how to control the number of threads that work on the stream simultaneously.
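One commonly used way to bound the thread count is to start the parallel stream from inside a dedicated ForkJoinPool. The sketch below assumes that approach; the class name, the `threads` parameter, and the stand-in `computeParamForPatch(int, int)` are hypothetical and only serve the illustration. Note that running the stream in the submitting pool is observed behaviour of current JDKs rather than something the stream documentation guarantees.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

final class BoundedParallelPatches {

    // Hypothetical stand-in for the real per-patch computation.
    static double computeParamForPatch(int x, int y) {
        return x * 1000 + y;
    }

    // Fills a patchesX-by-patchesY grid using at most `threads` pool workers
    // (assumption: the stream runs in the pool it was started from).
    static double[][] computeAll(int patchesX, int patchesY, int threads) {
        double[][] results = new double[patchesX][patchesY];
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            // join() blocks until every patch has been written.
            pool.submit(() ->
                IntStream.range(0, patchesX * patchesY).parallel().forEach(i -> {
                    int x = i % patchesX;
                    int y = i / patchesX;
                    results[x][y] = computeParamForPatch(x, y);
                })
            ).join();
        } finally {
            pool.shutdown();
        }
        return results;
    }

    public static void main(String[] args) {
        double[][] r = computeAll(60, 33, 2);
        System.out.println(r.length + " x " + r[0].length + " patches computed");
    }
}
```

If a dedicated pool is not wanted, the size of the common pool can instead be set at JVM start-up with the system property `java.util.concurrent.ForkJoinPool.common.parallelism`.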

    // requires: import java.util.stream.IntStream;
    // Integer division matches the bounds of the sequential loop in the question.
    int patchWidth = imageW / pSize;
    int patchHeight = imageH / pSize;
    IntStream.range(0, patchWidth * patchHeight).parallel().forEach(i -> {
        int x = i % patchWidth;
        int y = i / patchWidth;
        // No shared thisPatch variable: each task passes its own patch directly.
        results[x][y] = computeParamForPatch(
                MatrixUtil.getSubMatrixAsMatrix(image, x * pSize, y * pSize, pSize));
    });
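The flat-index trick above can be checked in isolation. The following is a minimal sketch, assuming a hypothetical stand-in for `computeParamForPatch` that depends only on the patch coordinates; it compares the nested sequential loops with the parallel version to show that `i % width` and `i / width` visit every cell exactly once, so each task writes to its own array slot and no ordering is needed.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

final class PatchIndexDemo {

    // Hypothetical stand-in: encodes the patch coordinates in the result.
    static double computeParamForPatch(int x, int y) {
        return x * 1000 + y;
    }

    // Reference version: nested loops, as in the question.
    static double[][] sequential(int patchesX, int patchesY) {
        double[][] results = new double[patchesX][patchesY];
        for (int x = 0; x < patchesX; x++) {
            for (int y = 0; y < patchesY; y++) {
                results[x][y] = computeParamForPatch(x, y);
            }
        }
        return results;
    }

    // Parallel version: one flat index per patch, decoded into (x, y).
    static double[][] parallel(int patchesX, int patchesY) {
        double[][] results = new double[patchesX][patchesY];
        IntStream.range(0, patchesX * patchesY).parallel().forEach(i -> {
            int x = i % patchesX; // column within the row of patches
            int y = i / patchesX; // row index
            results[x][y] = computeParamForPatch(x, y);
        });
        return results;
    }

    public static void main(String[] args) {
        // 60 x 33 corresponds to a 3840 x 2160 image with 64-pixel patches.
        boolean same = Arrays.deepEquals(sequential(60, 33), parallel(60, 33));
        System.out.println("parallel matches sequential: " + same);
    }
}
```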

2 Comments

I made a silly mistake in my test (this is why I removed my comment here). In fact, the parallel version you proposed does make it significantly faster. See my update.
@Azim From the timing results you've given, it looks like you are now using about 3 cores. By default, parallel() runs on the common ForkJoinPool, whose parallelism is the number of available cores minus one, so if you use the techniques in the link to raise that to all cores you could improve the result a little further, at the cost of possibly starving other threads of CPU time. Perhaps a better approach is to use the remaining core for the IO operations that handle input to and output from this function.
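The idea of keeping one thread for IO could look roughly like the sketch below, which loads the next image on a single background thread while the current one is being processed. This is only an illustration under stated assumptions: `loadImage` and `processImage` are hypothetical placeholders (a real loader would read and decode a file, and `processImage` would contain the parallel patch loop).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PrefetchDemo {

    // Hypothetical loader; returns a tiny fake image derived from the name.
    static double[][] loadImage(String name) {
        return new double[][] {{ name.length() }};
    }

    // Hypothetical stand-in for the parallel patch loop over one image.
    static double processImage(double[][] image) {
        return image[0][0];
    }

    static List<Double> processAll(List<String> names) {
        List<Double> results = new ArrayList<>();
        if (names.isEmpty()) {
            return results;
        }
        ExecutorService io = Executors.newSingleThreadExecutor();
        try {
            Future<double[][]> next = io.submit(() -> loadImage(names.get(0)));
            for (int k = 0; k < names.size(); k++) {
                double[][] current = next.get(); // wait for the prefetched image
                if (k + 1 < names.size()) {
                    String upcoming = names.get(k + 1);
                    // Start loading the following image in the background.
                    next = io.submit(() -> loadImage(upcoming));
                }
                results.add(processImage(current)); // overlaps with the load
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            io.shutdown();
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(processAll(List.of("a.png", "bb.png")));
    }
}
```

At most one image is prefetched at a time, so memory use stays at roughly two images regardless of how many are queued.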
