1

I am attempting to parallelise a for-loop using Java streams & ForkJoinPool in order to control the number of threads used. When run with a single thread, the parallelised code returns the same result as the sequential program. The sequential code is a set of standard for-loops:

for(String file : fileList){ for(String item : xList){ for(String x : aList) { // action code } } } 

And the following is my parallel implementation:

ForkJoinPool threadPool = new ForkJoinPool(NUM_THREADS); int chunkSize = aList.size()/NUM_THREADS; for(String file : fileList){ for(String item : xList){ IntStream.range(0, NUM_THREADS) .parallel().forEach(i -> threadPool.submit(() -> { aList.subList(i*chunkSize, Math.min(i*chunkSize + chunkSize -1, aList.size()-1)) .forEach(x -> { // action code }); })); threadPool.shutdown(); threadPool.awaitTermination(5, TimeUnit.MINUTES); } } 

When using more than 1 thread, only a limited number of iterations are completed. I have attempted to use .shutdown() and .awaitTermination() to ensure completion of all threads, however this doesn't seem to work. The number of iterations that occur difference dramatically from run to run (between 0-1500).

Note: I'm using a Macbook Pro with 8 available cores (4 dual-cores), and my action code does not contain references that make parallelisation unsafe.

Any advice would be much appreciated, thank you!

14
  • 1
    Why do you use IntStream.parallel and a fork join pool? Commented Oct 23, 2018 at 13:19
  • I saw it mentioned a few times in various places (incl. StackOverflow answers 1, 2). Although I'm now thinking that System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", "4") may be the better way to go. Still interested if anyone can explain to me why the original code did not work! Commented Oct 23, 2018 at 13:29
  • I don't know how your reply answers my questions. IntStream.parallel will cause a stream to be processed in parallel. The threadPool.submit will submit tasks that will be run in parallel. You don't need to submit these tasks in parallel. Also, you don't need to call invoke submit will start the tasks. Commented Oct 23, 2018 at 13:41
  • Also, with the 'invoke()' in your IntStream processing you can be swallowing runtime exceptions, which could stop the processing of your IntRange. Commented Oct 23, 2018 at 13:48
  • 1
    You use an IntStream, which gives you the required i. This still doesn’t explain, why you are using parallel() on it. Commented Oct 23, 2018 at 15:47

1 Answer 1

2

I think the actual problem you have is caused by your calling shutdown on the ForkJoinPool. If you look into the javadoc, this results in "an orderly shutdown in which previously submitted tasks are executed, but no new tasks will be accepted" - ie. I'd expect only one task to actually finish.

BTW there's no real point in using a ForkJoinPool the way you use it. A ForkJoinPool is intended to split workload recursively, not unlike you do with your creating sublists in the loop - but a ForkJoinPool is supposed to be fed by RecursiveActions that split their work themselves, rather than splitting it up beforehand like you do in a loop. That's just a side note though; your code should run fine, but it would be clearer if you just submitted your tasks to a normal ExecutorService, eg one you get by Executors.newFixedThreadPool(parallelism) rather than by new ForkJoinPool().

Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.