I have the following method, which initializes and starts log file processing by creating a separate task for each file. In Task Manager I can see that it creates about one hundred threads this way.

    public async Task LogProcessing(DescriptorList files, CancellationToken ct)
    {
        var tasks = new List<Task>();

        foreach (var file in files)
            tasks.Add(new Task(() => { ParsingFile(file, ct); }));

        foreach (var task in tasks)
        {
            task.Start();
        }

        await Task.WhenAll(tasks);
    }

How can I prevent creating more threads than the available number of hardware threads? Thanks.

  • Why don't you use the thread pool (ThreadPool.QueueUserWorkItem)? It is also better for performance because threads are reused. (dotnettutorials.net/lesson/thread-pooling) Commented Oct 27, 2023 at 13:45
  • @DA: Won't a task created this way run on the ThreadPool anyway? Commented Oct 27, 2023 at 13:48
  • @Da From your own link: "The default scheduler for the Task Parallel Library and PLINQ uses the .NET thread pool" Commented Oct 27, 2023 at 14:04
  • Why not just use a Parallel.For/ForEach loop? That should partition the work to limit the number of threads used, and it also allows an explicit limit on the number of threads. Commented Oct 27, 2023 at 14:06
  • Also note that you need to be careful when parallelizing work that involves reading from files. Spinning disks hate random IO, and even SSDs prefer sequential reads. Commented Oct 27, 2023 at 14:11

1 Answer

If you are using .NET 6.0 or later you could use Parallel.ForEachAsync() to do this:

    await Parallel.ForEachAsync(
        files,
        new ParallelOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount,
            CancellationToken = ct
        },
        async (file, cancellation) =>
        {
            await Task.Run(() => ParsingFile(file, cancellation), cancellation);
        });

Note the use of MaxDegreeOfParallelism = Environment.ProcessorCount to limit the number of concurrent operations to the (logical) processor count. For Parallel.ForEachAsync this is already the default, so you may not need to set it at all, but some people prefer to set it explicitly for clarity.

According to the documentation, "Generally, you do not need to modify this setting", but you should look at this answer from Theodor Zoulias before relying on the default. (I generally don't use the default myself.)

I recommend reading the documentation on MaxDegreeOfParallelism and deciding for yourself whether to specify it or not.
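If you want to see the limit in action before deciding, here is a minimal sketch that counts how many loop bodies run at the same time. It assumes .NET 6 or later; ParallelismDemo, SimulateParsing and MeasurePeakConcurrencyAsync are hypothetical names made up for this illustration, and the Thread.Sleep call merely stands in for the synchronous work that ParsingFile would do.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelismDemo
{
    // Hypothetical stand-in for ParsingFile: blocks briefly to simulate synchronous work.
    private static void SimulateParsing(int item, CancellationToken ct)
    {
        ct.ThrowIfCancellationRequested();
        Thread.Sleep(20);
    }

    // Runs the loop and returns the highest number of bodies observed running at once.
    public static async Task<int> MeasurePeakConcurrencyAsync(
        int itemCount, int maxDegree, CancellationToken ct = default)
    {
        int running = 0;
        int peak = 0;

        await Parallel.ForEachAsync(
            Enumerable.Range(0, itemCount),
            new ParallelOptions { MaxDegreeOfParallelism = maxDegree, CancellationToken = ct },
            (item, cancellation) =>
            {
                int now = Interlocked.Increment(ref running);

                // Raise the recorded peak with a compare-and-swap loop.
                int seen;
                while (now > (seen = Volatile.Read(ref peak)))
                    Interlocked.CompareExchange(ref peak, now, seen);

                try { SimulateParsing(item, cancellation); }
                finally { Interlocked.Decrement(ref running); }

                return ValueTask.CompletedTask;
            });

        return peak;
    }

    public static async Task Main()
    {
        int peak = await MeasurePeakConcurrencyAsync(itemCount: 100, maxDegree: Environment.ProcessorCount);
        Console.WriteLine($"Peak concurrent bodies: {peak} (limit: {Environment.ProcessorCount})");
    }
}
```

The reported peak should never exceed the configured limit, whatever the number of items. Note that this measures concurrent loop bodies, not the total number of threads in the process, which Task Manager also shows.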

As per Theodor Zoulias's comments below, you can write this more simply as:

    await Parallel.ForEachAsync(
        files,
        new ParallelOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount,
            CancellationToken = ct
        },
        (file, cancellation) =>
        {
            ParsingFile(file, cancellation);
            return ValueTask.CompletedTask;
        });

However, if ParsingFile() were itself async, you would use an async lambda as in the first version, awaiting the method directly rather than wrapping it in Task.Run().
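For illustration, here is a rough sketch of that async case. ParsingFileAsync is a hypothetical async counterpart of ParsingFile that does not exist in the question; in this sketch it only counts the lines of a file, and the files are assumed to be plain paths rather than the question's DescriptorList.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncParsingSketch
{
    // Hypothetical async counterpart of ParsingFile: just counts the lines in the file.
    private static async Task<int> ParsingFileAsync(string path, CancellationToken ct)
    {
        string[] lines = await File.ReadAllLinesAsync(path, ct);
        return lines.Length;
    }

    // Processes all files with at most Environment.ProcessorCount operations in flight.
    public static async Task<ConcurrentDictionary<string, int>> LogProcessingAsync(
        IEnumerable<string> paths, CancellationToken ct = default)
    {
        var lineCounts = new ConcurrentDictionary<string, int>();

        await Parallel.ForEachAsync(
            paths,
            new ParallelOptions
            {
                MaxDegreeOfParallelism = Environment.ProcessorCount,
                CancellationToken = ct
            },
            async (path, cancellation) =>
            {
                // Await the async method directly; no Task.Run wrapper is needed.
                lineCounts[path] = await ParsingFileAsync(path, cancellation);
            });

        return lineCounts;
    }
}
```

You would call it with something like `var counts = await AsyncParsingSketch.LogProcessingAsync(paths, ct);`. The ConcurrentDictionary is only there so that the loop bodies can record results safely from several threads.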

11 Comments

Actually, for the Parallel.ForEachAsync API, as well as the Parallel.ForAsync API anticipated in .NET 8, MaxDegreeOfParallelism = Environment.ProcessorCount is the default. See the docs.
Also, the await Task.Run(() => is redundant. The default TaskScheduler of the Parallel.ForEachAsync method is the ThreadPool anyway. You can just call ParsingFile and then return ValueTask.CompletedTask (no async).
@TheodorZoulias I'll add that to the answer.
@Jackdaw btw, although MaxDegreeOfParallelism = Environment.ProcessorCount is the default, configuring it explicitly is better than relying on the default. I would not advise removing this configuration from Matthew's answer.
@TheodorZoulias: I'm thinking less about people reading your posts and more about "if it's that good an idea, it will appear in some Fortune 500 companies' coding standards, and then you'll have people doing it merely because it is in the coding standard without putting in any thought", whether the person who wrote the coding standard saw your post or independently thought about it and reached the same conclusion.