
I have this OpenCV image-processing function, which is called four times on four different Mat objects.

    void processBinary(Mat& binaryMat) {
        // image processing
    }

I want to multithread it so that all four calls run at the same time, with the main thread waiting until every thread is done.

Ex:

    int main() {
        Mat m1, m2, m3, m4;
        // perform each of these calls simultaneously, but have the main
        // thread wait for all processBinary() calls to finish
        processBinary(m1);
        processBinary(m2);
        processBinary(m3);
        processBinary(m4);
    }

What I hope to accomplish is to call processBinary() as many times as I need while taking about as long as a single call. I have looked up multithreading, but I am a little confused about creating threads and then joining or detaching them. I believe I need to instantiate each thread and then call join() on each one so that the main thread waits for all of them to finish, but doing so does not seem to reduce the execution time significantly. Can anyone explain how I should go about multithreading my program? Thanks!

EDIT: What I have tried:

    // This does not significantly reduce execution time.
    // However, calling processBinary() only once does.
    thread p1(&Detector::processBinary, *this, std::ref(m1));
    thread p2(&Detector::processBinary, *this, std::ref(m2));
    thread p3(&Detector::processBinary, *this, std::ref(m3));
    thread p4(&Detector::processBinary, *this, std::ref(m4));
    p1.join();
    p2.join();
    p3.join();
    p4.join();
  • The work you have described is a pipeline. Each function is taking the output of the previous stage. To achieve parallelism you will need to be able to move smaller pieces of work between stages of your workflow. Said differently, how can they do work in parallel if they require output from another task? Copy... Find contours... Draw contours... Commented Jul 26, 2016 at 16:28
  • I'm not sure I understand. I want to put processBinary() in parallel because it is being called 4x, not the code inside the method. Each processBinary() is called on a different Mat object, so they do not depend on each other. Commented Jul 26, 2016 at 16:42
  • Ah... You should add the code that calls processBinary. We don't need the internals of processBinary to help you parallelize the calling of it. Commented Jul 26, 2016 at 16:44
  • Sorry about my poor explanation; I updated the question so it makes more sense :) Commented Jul 26, 2016 at 16:45
  • What have you tried? If processBinary is a pure function, you can just spawn 4 standard threads and join them. Commented Jul 26, 2016 at 16:51
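
The "spawn 4 standard threads and join them" suggestion above could look roughly like the sketch below. It is only an illustration under stated assumptions: `Image` is a hypothetical stand-in for `cv::Mat` so the code compiles without OpenCV, the pixel inversion is placeholder work, and `processAll` is a made-up helper name. The one detail that matters for a member function is passing `this` rather than `*this`, because `std::thread` copies its arguments and `*this` would give each thread its own copy of the `Detector`.

```cpp
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for cv::Mat so the sketch builds without OpenCV.
using Image = std::vector<std::uint8_t>;

class Detector {
public:
    // Placeholder for the real image processing: invert every pixel in place.
    void processBinary(Image& img) {
        for (auto& px : img) {
            px = static_cast<std::uint8_t>(255 - px);
        }
    }

    // Hypothetical helper: runs processBinary on four images concurrently
    // and returns only after all four worker threads have finished.
    void processAll(Image& m1, Image& m2, Image& m3, Image& m4) {
        // `this` is a pointer, so every thread works on the same Detector.
        // std::ref keeps each image a reference instead of a copy.
        std::thread p1(&Detector::processBinary, this, std::ref(m1));
        std::thread p2(&Detector::processBinary, this, std::ref(m2));
        std::thread p3(&Detector::processBinary, this, std::ref(m3));
        std::thread p4(&Detector::processBinary, this, std::ref(m4));

        // join() blocks the calling thread until the worker is done.
        p1.join();
        p2.join();
        p3.join();
        p4.join();
    }
};
```

Whether this is faster than four sequential calls depends on how much work each call does and on how many cores are free; for very short calls the cost of creating threads can dominate.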

2 Answers


The slick way to achieve this is not to do the thread housekeeping yourself but use a library that provides micro-parallelization.

OpenCV itself uses Intel Threading Building Blocks (TBB) for exactly this task: running loops in parallel.

In your case, your loop has just four iterations. With C++11, you can write it down very easily using a lambda expression. In your example:

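    #include <tbb/parallel_for.h>

    // cv::Mat copies are shallow: each element shares pixel data with m1..m4.
    std::vector<cv::Mat> input = { m1, m2, m3, m4 };

    // Capture by reference so that input[i] is a non-const cv::Mat&,
    // as required by processBinary(Mat&).
    tbb::parallel_for(size_t(0), input.size(), size_t(1), [&](size_t i) {
        processBinary(input[i]);
    });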

For this example I took code from here.
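
If TBB is not available on the target platform, a similar "run the loop body in parallel, then wait" pattern can be approximated with the standard library alone. The following is a minimal sketch, not a drop-in replacement for `tbb::parallel_for`: `Image` is a hypothetical stand-in for `cv::Mat`, the thresholding is placeholder work, and `processAllParallel` is a name invented for this illustration.

```cpp
#include <cstdint>
#include <functional>
#include <future>
#include <vector>

// Hypothetical stand-in for cv::Mat so the sketch builds without OpenCV.
using Image = std::vector<std::uint8_t>;

// Placeholder for the real image processing: threshold each pixel in place.
void processBinary(Image& img) {
    for (auto& px : img) {
        px = (px > 127) ? 255 : 0;
    }
}

// Hypothetical helper: starts one asynchronous task per image, then blocks
// until every task has completed.
void processAllParallel(std::vector<Image>& images) {
    std::vector<std::future<void>> pending;
    pending.reserve(images.size());

    for (Image& img : images) {
        // std::launch::async requests a separate thread for each task;
        // std::ref passes the image by reference instead of copying it.
        pending.push_back(
            std::async(std::launch::async, processBinary, std::ref(img)));
    }

    // get() waits for the task and rethrows any exception it raised.
    for (auto& task : pending) {
        task.get();
    }
}
```

Unlike TBB, this starts one thread per element rather than scheduling work on a pool, so it suits a handful of large tasks (such as four images) better than a loop with many small iterations.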


6 Comments

I will give this a try. I am actually trying to compile this on Android through the NDK, and it doesn't seem to recognize the <tbb/tbb.h> include. I have checked the SDK, and libtbb.a is in there... Perhaps my Android.mk is not set up correctly. Would you happen to know anything about this? Might warrant another question.
No idea about that.
@SumeetBatra Intel TBB is a separate library. OpenCV only uses it if you build it with TBB support. Otherwise it uses whatever threading library you compiled it with. This can be pthreads, gdb etc.
@nnrales That's not utterly correct. TBB serves a different purpose than pthreads. And what kind of threading library is gdb?
@nnrales TBB and GCD are one layer above pthreads. TBB actually uses pthreads itself where applicable. If you don't have TBB or GCD you cannot expect the same parallelization from OpenCV.

In case you're using Python, you can use my open-source, multi-threaded VidGear library, a Python wrapper around OpenCV that is available on GitHub and PyPI, to achieve a higher FPS.

Project Insight:

VidGear is a lightweight Python wrapper around the OpenCV Video I/O module. It contains multi-threaded modules (gears) that enable high-speed video frame capture across various devices and platforms.

Features:

Key features that differentiate it from other existing multi-threaded open-source solutions are:

  • Multi-threaded, high-speed OpenCV video frame capturing (resulting in high FPS)

  • Flexible, direct control over the video stream with easy manipulation

  • Lightweight

  • Built-in robust error handling and frame synchronization

  • Multi-platform compatibility (also compatible with the Raspberry Pi Camera)

  • Full support for network video streams (including GStreamer raw video capture pipelines)
