For multi-template matching with OpenCvSharp I want to use Parallel.ForEach() and limit the maximum number of threads used to n-1, so that at least one thread remains free for other tasks that may occur (n is the total number of CPU-level threads available).
E.g.: my system has a 4-core CPU with 2 threads per core, so the maximum here should be 7.
How can I do this without hardcoding it? (It should also work on other PCs with a different number of threads.)

    // Use a maximum of n-1 threads
    int maxNrOfThreads = 7;
    Parallel.ForEach(this.Symbols.Keys, new ParallelOptions { MaxDegreeOfParallelism = maxNrOfThreads }, key =>
    {
        // EXECUTION (template match every symbol to a given image)
    });

Here is the execution part, for anyone who may be interested:

    Mat grayTemplate = this.Symbols[key].GrayscaledSymbol;
    Mat res = new Mat(grayImage.Rows - grayTemplate.Rows + 1, grayImage.Cols - grayTemplate.Cols + 1, MatType.CV_32FC1);
    Cv2.MatchTemplate(grayImage, grayTemplate, res, TemplateMatchModes.CCoeffNormed);
    Cv2.Threshold(res, res, MIN_ACCURACY, 1.0, ThresholdTypes.Tozero);

    while (true)
    {
        double minval, maxval, threshold = MIN_ACCURACY;
        OpenCvSharp.Point minloc, maxloc;
        Cv2.MinMaxLoc(res, out minval, out maxval, out minloc, out maxloc);

        if (maxval >= threshold)
        {
            // Add bounding box to result object
            ret.AddResult(key, maxloc.X, maxloc.Y, grayTemplate.Width, grayTemplate.Height);

            // Fill in the res Mat so you don't find the same area again in the MinMaxLoc
            Rect outRect;
            Cv2.FloodFill(res, maxloc, new Scalar(0), out outRect, new Scalar(0.1), new Scalar(1.0));
        }
        else
        {
            break;
        }
    }

`Parallel.ForEach` will detect the current number of cores, start with that number of tasks and adjust itself based on the actual load. No matter how many worker tasks (not threads) are used, the OS won't let other processes starve. You don't have to do anything unless you want to explicitly throttle or increase the number of threads.

`ProcessorCount` will include "virtual" cores if Hyperthreading is used, so the Core-7 heuristic is meaningless. `Parallel.ForEach` uses cooperative multitasking so no task can monopolize a core, unless the `Action` itself takes too long. So what are you trying to do? And what does `Execution` do?

With an `IList<>`-derived container, the `Count` will be used to partition the data into static ranges. After that, each task will go to work on a single partition. If you have 7 partitions, you won't get more than 7 concurrent tasks. In any case you never get threads, you get Tasks that run from 100 ms up to 100 + 50*(ProcCount) ms before yielding, unless your `Action` is taking longer than that.
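
If you still want to cap the loop explicitly without hardcoding 7, a minimal sketch could derive the limit from `Environment.ProcessorCount`, which reports logical processors (so it would be 8 on the 4-core, 2-threads-per-core machine from the question). The `ParallelLimit` class, the `MaxWorkers` helper and the `keys` array are hypothetical names used only for illustration; `keys` stands in for `this.Symbols.Keys`. The `Math.Max` guard is there because `MaxDegreeOfParallelism` must be -1 or a positive value, so a single-core machine would otherwise throw.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelLimit
{
    // Hypothetical helper: leave one logical processor free, but never go below 1,
    // because MaxDegreeOfParallelism = 0 throws ArgumentOutOfRangeException.
    public static int MaxWorkers(int logicalProcessors) =>
        Math.Max(1, logicalProcessors - 1);

    public static void Main()
    {
        // Environment.ProcessorCount counts logical processors (hyperthreads included).
        int maxNrOfThreads = MaxWorkers(Environment.ProcessorCount);
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxNrOfThreads };

        // Stand-in for this.Symbols.Keys (assumption for this sketch).
        string[] keys = Enumerable.Range(0, 20).Select(i => "symbol" + i).ToArray();

        Parallel.ForEach(keys, options, key =>
        {
            // EXECUTION (template match every symbol to a given image)
        });

        Console.WriteLine($"Ran with at most {maxNrOfThreads} concurrent workers.");
    }
}
```

Note that this only limits how many iterations of this loop run concurrently; as the comments above point out, it does not reserve a physical core for anything else.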