Say I want to make parallel API POST requests.

In a for loop I can add each HTTP POST call to a list of tasks (each task started with Task.Run) and then wait for all of them to finish with await Task.WhenAll. Control returns to the caller while the network requests are in flight, so effectively the API requests are made in parallel.

Similarly, I can use Parallel.ForEachAsync, which will automatically do the WhenAll and return control to the caller. So is ForEachAsync a replacement for a plain for loop that builds a list of tasks (async/await with Task.Run) followed by WhenAll?
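
For reference, the two approaches described above can be sketched roughly as follows. This is a minimal illustration rather than production code: `PostAsync` is a hypothetical stand-in for the HTTP POST call (it only simulates latency with `Task.Delay`), and the class and method names are made up for the example. `Parallel.ForEachAsync` requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelPostSketch
{
    // Hypothetical stand-in for an HTTP POST call; it only simulates network latency.
    public static async Task<string> PostAsync(int id, CancellationToken ct = default)
    {
        await Task.Delay(50, ct);
        return $"response-{id}";
    }

    // Approach 1: collect the tasks in a list, then await Task.WhenAll.
    public static async Task<string[]> WithWhenAllAsync(IEnumerable<int> ids)
    {
        var tasks = new List<Task<string>>();
        foreach (var id in ids)
        {
            // Task.Run mirrors the question; every task starts immediately.
            tasks.Add(Task.Run(() => PostAsync(id)));
        }
        // The result array follows the order of the tasks in the list.
        return await Task.WhenAll(tasks);
    }

    // Approach 2: let Parallel.ForEachAsync invoke the delegate for each item.
    public static async Task WithForEachAsync(IEnumerable<int> ids)
    {
        await Parallel.ForEachAsync(ids, async (id, ct) =>
        {
            // The returned Task carries no results; the response is discarded here.
            await PostAsync(id, ct);
        });
    }
}
```

Both methods return control to the caller at the first `await`; they differ in what comes back and in how many calls run at once.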

  • No, it's not. Parallel.ForEach does a lot more than just use multiple tasks: it partitions the data so that each worker task won't have to synchronize with the others to access the data. Then it uses as many workers as there are cores to process those partitions. There's little point in starting 100 workers if there are only 4 cores. The other 96 workers will simply do nothing except add to the scheduling overhead. (Commented Jul 27, 2021 at 11:57)
  • "which will automatically do the WaitAll": that's not what happens. Parallel will use the current thread to process data, and since all cores are busy crunching data, it appears as if the thread is "blocked". It's not. (Commented Jul 27, 2021 at 11:58)
  • In fact, an ActionBlock would be a lot better than a loop and WaitAll. With an ActionBlock you can easily limit the number of concurrent connections. Neither servers nor clients have infinite bandwidth or CPU, so trying to send 100 HTTP requests concurrently can easily be slower than making just 10 at a time. (Commented Jul 27, 2021 at 12:03)
  • PS: ForEachAsync returns a Task, so it behaves as if you called WhenAll, not WaitAll. (Commented Jul 27, 2021 at 12:04)
  • I found the GitHub issue where ForEachAsync was discussed, and it sounds like partitioning is not used. ForEach and ForEachAsync pass state between workers and iterations though, something not possible with either a loop of tasks or ActionBlock. And as the source shows, it doesn't just start some tasks. (Commented Jul 27, 2021 at 12:21)
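
The ActionBlock suggestion in the comments could look roughly like the sketch below. It is not the commenter's code: `SendAllAsync` and the simulated request (a plain `Task.Delay`) are hypothetical, and the peak counter exists only to make the concurrency limit observable. ActionBlock lives in the TPL Dataflow library (System.Threading.Tasks.Dataflow), which may need to be added as a NuGet package depending on the target framework.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class ActionBlockSketch
{
    // Sends all items with at most maxConcurrency operations in flight.
    // Returns the highest number of simultaneous operations that was observed.
    public static async Task<int> SendAllAsync(IEnumerable<int> ids, int maxConcurrency)
    {
        var gate = new object();
        int inFlight = 0, peak = 0;

        var block = new ActionBlock<int>(async id =>
        {
            lock (gate) { inFlight++; if (inFlight > peak) peak = inFlight; }
            await Task.Delay(50); // hypothetical stand-in for the HTTP request
            lock (gate) { inFlight--; }
        },
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxConcurrency });

        foreach (var id in ids)
        {
            block.Post(id); // queued in the block's input buffer until a slot is free
        }

        block.Complete();       // signal that no more items will be posted
        await block.Completion; // completes once every queued item has been processed
        return peak;
    }
}
```

Calling `await ActionBlockSketch.SendAllAsync(Enumerable.Range(1, 100), 10)` would process 100 items while keeping no more than 10 in flight.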

1 Answer

No, the Parallel.ForEachAsync API differs in quite a few ways from a trivial use of the Task.WhenAll API:

  1. The elephant in the room: awaiting Task.WhenAll returns an array with the results of the asynchronous operations. In contrast, Parallel.ForEachAsync returns a naked Task. If you want the results, you must rely on side effects, like updating a ConcurrentQueue<T> as part of the asynchronous operation.

  2. Parallel.ForEachAsync invokes the supplied asynchronous delegate in parallel, on ThreadPool threads (configurable). In contrast, the common pattern with Task.WhenAll is to create the tasks sequentially, on the current thread. This raises concerns about using Parallel.ForEachAsync in ASP.NET applications, where offloading work to the ThreadPool might have scalability implications.

  3. Parallel.ForEachAsync invokes the asynchronous delegate and awaits the generated tasks, while enforcing a maximum level of concurrency that defaults to Environment.ProcessorCount. This limit is configurable through the MaxDegreeOfParallelism option. In contrast, the common pattern with Task.WhenAll is to create all the tasks at once, imposing no concurrency limit.

  4. The common pattern with Task.WhenAll assumes that creating the tasks cannot fail midway, and so takes no precautions against this possibility. If it does happen, the tasks that were already created become fire-and-forget tasks and might be leaked. This is not possible with the Parallel.ForEachAsync API.

  5. Parallel.ForEachAsync stops invoking the asynchronous delegate as soon as the first error occurs, either in a delegate invocation or in a created Task. After awaiting all the already created tasks, it propagates a failure containing all the errors that have occurred so far. It also provides a mechanism for canceling the other tasks that are in flight when the error occurs (the CancellationToken that is passed as the second argument of the lambda). In contrast, Task.WhenAll invariably waits for all the tasks to complete. This means that you might have to wait a lot longer before the combined task eventually fails with an AggregateException containing the errors of all the tasks that have failed.
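
Points 1, 3 and 5 can be illustrated with a short sketch. It makes several assumptions: the class and method names are hypothetical, the asynchronous work is simulated with `Task.Delay`, and the failure on item 2 is artificial. The second method uses a single worker only to make the stop-on-first-error behavior easy to observe.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ForEachAsyncSketch
{
    // Points 1 and 3: results are gathered through a side effect,
    // and the concurrency limit is set explicitly.
    public static async Task<List<string>> CollectAsync(IEnumerable<int> ids, int maxConcurrency)
    {
        var results = new ConcurrentQueue<string>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxConcurrency };

        await Parallel.ForEachAsync(ids, options, async (id, ct) =>
        {
            await Task.Delay(20, ct);          // hypothetical stand-in for the real work
            results.Enqueue($"response-{id}"); // completion order, not input order
        });

        return results.ToList();
    }

    // Point 5: once an invocation fails, no further items are started.
    // Returns how many items were started before the loop ended.
    public static async Task<int> CountStartedBeforeFailureAsync(int itemCount)
    {
        int started = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = 1 };
        try
        {
            await Parallel.ForEachAsync(Enumerable.Range(1, itemCount), options, async (id, ct) =>
            {
                Interlocked.Increment(ref started);
                await Task.Delay(10, ct); // ct is canceled when another invocation fails
                if (id == 2) throw new InvalidOperationException($"item {id} failed");
            });
        }
        catch (InvalidOperationException)
        {
            // Awaiting the returned Task rethrows the recorded error.
        }
        return started;
    }
}
```

Because the queue is filled in completion order, callers that need input order have to carry the index or key along with each result.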

6 Comments

I have posted here a ForEachAsync variant that returns results.
Hi, nice answer. Do you know the performance difference between the two? I am more interested in speed.
The reason I ask is this question: stackoverflow.com/questions/72778189/… I'm not sure which way to go. Do you have any ideas? Thanks.
@AndySong Parallel.ForEachAsync as a mechanism probably has more overhead than Task.WhenAll, but the difference should be on the scale of nanoseconds. The overhead of either mechanism is unlikely to have any noticeable effect on the performance of your app though. On the other hand, because Parallel.ForEachAsync can control the degree of parallelism, it might result in a smoother communication pattern with your database, resulting in big performance improvements. If your database is happy, your app will be happy too!
"This raises concerns about using the Parallel.ForEachAsync in ASP.NET applications, where offloading work on the ThreadPool might have scalability implications." To confirm, you're talking specifically about scenarios in which your delegate contains CPU-intensive code prior to the await, right? Meaning that using ForEachAsync would clog up multiple CPU cores while kicking off the collection of tasks, whereas you'd normally kick the tasks off one at a time and avoid that issue.