
For multi-template matching with OpenCvSharp I want to use Parallel.ForEach() and limit the maximum number of threads used to n-1, so at least one thread remains free for other tasks that might occur (n is the total number of CPU-level threads available).

E.g.: My system has a 4-core CPU with 2 threads per core. So the maximum here should be 7.

How can I do this without hardcoding it? (It should work also for other PCs with another number of threads)

    // Use a maximum of n-1 threads
    int maxNrOfThreads = 7;
    Parallel.ForEach(this.Symbols.Keys,
        new ParallelOptions { MaxDegreeOfParallelism = maxNrOfThreads },
        key =>
        {
            // EXECUTION (template match every symbol to a given image)
        });

Here is the execution part for anyone who may be interested:

    Mat grayTemplate = this.Symbols[key].GrayscaledSymbol;
    Mat res = new Mat(grayImage.Rows - grayTemplate.Rows + 1,
                      grayImage.Cols - grayTemplate.Cols + 1,
                      MatType.CV_32FC1);
    Cv2.MatchTemplate(grayImage, grayTemplate, res, TemplateMatchModes.CCoeffNormed);
    Cv2.Threshold(res, res, MIN_ACCURACY, 1.0, ThresholdTypes.Tozero);

    while (true)
    {
        double minval, maxval, threshold = MIN_ACCURACY;
        OpenCvSharp.Point minloc, maxloc;
        Cv2.MinMaxLoc(res, out minval, out maxval, out minloc, out maxloc);

        if (maxval >= threshold)
        {
            // Add bounding box to result object
            ret.AddResult(key, maxloc.X, maxloc.Y, grayTemplate.Width, grayTemplate.Height);

            // Fill in the res Mat so you don't find the same area again in the MinMaxLoc
            Rect outRect;
            Cv2.FloodFill(res, maxloc, new Scalar(0), out outRect, new Scalar(0.1), new Scalar(1.0));
        }
        else
        {
            break;
        }
    }
  • Why should it be 7? Parallel.ForEach will detect the current number of cores, start with that number of tasks and adjust itself based on the actual load. No matter how many worker tasks (not threads) are used, the OS won't let other processes starve. You don't have to do anything unless you want to explicitly throttle or increase the number of threads. Commented Jul 25, 2022 at 15:11
  • ProcessorCount will include "virtual" cores if hyperthreading is used, so the n-1 = 7 heuristic is meaningless. Parallel.ForEach uses cooperative multitasking, so no task can monopolize a core, unless the Action itself takes too long. So what are you trying to do? And what does the execution part do? Commented Jul 26, 2022 at 6:52
  • The DOP depends more on the number of partitions than on MaxDOP. If you pass an array or an IList<>-derived container, the Count will be used to partition the data into static ranges. After that, each task will go to work on a single partition. If you have 7 partitions, you won't get more than 7 concurrent tasks. In any case you never get threads, you get Tasks that run from 100ms up to 100 + 50*(ProcCount) ms before yielding, unless your Action is taking longer than that. Commented Jul 26, 2022 at 7:00

1 Answer


You're looking for Environment.ProcessorCount.
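Applied to the question's snippet, that could look like the sketch below (the symbol keys and the body of the loop are placeholders, not your actual matching code). Subtracting 1 from Environment.ProcessorCount and clamping with Math.Max guards against a hypothetical single-logical-processor machine, where n-1 would otherwise be 0 and Parallel.ForEach would throw on an invalid MaxDegreeOfParallelism:

    using System;
    using System.Threading.Tasks;

    class Program
    {
        static void Main()
        {
            // Environment.ProcessorCount returns the number of logical processors,
            // including hyperthreaded ones (8 on a 4-core / 2-threads-per-core CPU).
            // Clamp to at least 1 so a single-core machine still gets one worker.
            int maxNrOfThreads = Math.Max(1, Environment.ProcessorCount - 1);

            var options = new ParallelOptions { MaxDegreeOfParallelism = maxNrOfThreads };

            // Placeholder collection standing in for this.Symbols.Keys
            string[] symbolKeys = { "symbolA", "symbolB", "symbolC" };

            Parallel.ForEach(symbolKeys, options, key =>
            {
                // EXECUTION (template match every symbol to a given image)
            });

            Console.WriteLine(maxNrOfThreads);
        }
    }

Note that MaxDegreeOfParallelism is an upper bound on concurrency, not a guaranteed thread count; the scheduler may use fewer workers.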


23 Comments

There's usually no reason to check the processor count. Parallel.ForEach itself uses that value but will adjust the number of worker tasks based on load. The OS won't let other processes starve either.
@PanagiotisKanavos the Parallel.ForEach doesn't adjust the number of worker tasks based on load. It just grabs aggressively all the ThreadPool threads that are currently available. For example if the ThreadPool happens to have 100 available threads at the moment, the Parallel.ForEach will use all these 100 threads, and will ask for more. IMHO this is terrible, and makes specifying explicitly the ParallelOptions.MaxDegreeOfParallelism a very good idea.
Parallel.ForEach in general is terrible because it doesn't understand Task objects. Task.WhenAll is an almost drop-in replacement that understands modern async code, and coupled with correct functions on your end will prevent starvation in a much more efficient way. That said, you should be telling OP that, this isn't news to me.
@Blindy quite the opposite. It uses tasks instead of threads. You refer to asynchronous operations instead, for which it was explicitly not built. That's not the method's fault. It was explicitly built for in-memory data parallelization and works to optimize that scenario - it partitions the source data, then uses a worker task per partition, to ensure minimal synchronization between workers. This is clear and explicit since 2010. It definitely doesn't use the threadpool to decide how many partitions to use. Task.WhenAll isn't even close
The documentation says something different from any of you. "The degree of parallelism is automatically managed by the underlying components of the system; the implementation of the Parallel class, the default task scheduler, and the .NET thread pool all play a role in optimizing throughput under a wide range of conditions."
