Visual Servoing Platform  version 3.1.0 under development (2017-06-22)
Tutorial: How to use multi-threading capabilities

Introduction

In releases after ViSP 3.0.0, we introduced a new cross-platform vpThread class that makes it possible to execute a function in a separate thread. We also made the vpMutex class, used to protect shared data from concurrent access, cross-platform.

The vpThread and vpMutex classes are wrappers over native pthread functionality when pthread is available. This is the case for all Unix-like operating systems, including OSX, and for MinGW under Windows. If pthread is not available, we use Windows native functionality instead.

Threading overview

To use the vpThread class, first include the corresponding header.

#include <visp3/core/vpThread.h>

With vpThread, a function that can be executed in a separate thread must match the vpThread::Fn prototype:

vpThread::Return myFooFunction(vpThread::Args args)

The arguments are passed to the function as a value of type vpThread::Args, and the function must return a value of type vpThread::Return.

Then, to create the thread that executes this function, you just have to construct a vpThread object, passing it the function to execute.

To illustrate this behavior, see testThread.cpp.

Mutexes overview

To use the vpMutex class, first include the corresponding header.

#include <visp3/core/vpMutex.h>

A shared variable can then be protected from concurrent access as follows:

vpMutex mutex;
int var = 0;
mutex.lock();
// var to protect from concurrent access
var = 2;
mutex.unlock();

To illustrate this usage, see testMutex.cpp.

There is also a more elegant way using vpMutex::vpScopedLock. The previous example becomes:

vpMutex mutex;
int var = 0;
{
  vpMutex::vpScopedLock lock(mutex);
  // var to protect from concurrent access
  var = 2;
}

Here, the vpMutex::vpScopedLock constructor locks the mutex and the destructor unlocks it. The protected portion of code is therefore exactly the scope delimited by the braces. To illustrate this usage, see tutorial-grabber-opencv-threaded.cpp.
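
To make the constructor/destructor mechanism concrete, here is a minimal sketch of the same idiom that builds without ViSP. ToyMutex and ToyScopedLock are hypothetical names introduced only for this illustration. ToyMutex merely records whether it is held, so it is not thread-safe and does not reflect how vpMutex is implemented.

```cpp
// Toy stand-in for a mutex: it only records whether it is currently held.
// Hypothetical type for illustration; NOT thread-safe and not vpMutex.
class ToyMutex {
public:
  ToyMutex() : m_locked(false) {}
  void lock() { m_locked = true; }
  void unlock() { m_locked = false; }
  bool isLocked() const { return m_locked; }
private:
  bool m_locked;
};

// Scoped lock: acquires in the constructor, releases in the destructor.
class ToyScopedLock {
public:
  explicit ToyScopedLock(ToyMutex &mutex) : m_mutex(mutex) { m_mutex.lock(); }
  ~ToyScopedLock() { m_mutex.unlock(); }
private:
  ToyScopedLock(const ToyScopedLock &);            // declared only: non-copyable
  ToyScopedLock &operator=(const ToyScopedLock &); // declared only: non-copyable
  ToyMutex &m_mutex;
};

// Returns whether the mutex is held inside the protected scope.
bool heldInsideScope(ToyMutex &mutex) {
  ToyScopedLock lock(mutex);
  return mutex.isLocked(); // the destructor runs after the return value is computed
}
```

Because the release happens in the destructor, the mutex is also released on an early return or when an exception leaves the scope, which is easy to forget with explicit lock() and unlock() calls.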

Pass multiple arguments and / or retrieve multiple return values

This section will show you one convenient way to pass multiple arguments to a vpThread and retrieve multiple return values at the end of the computation. This example (testThread2.cpp) uses a functor class to do that.

Basically, you declare a class that will act like a function by defining the operator() that will do the computation in a dedicated thread. In the following toy example, we want to compute the element-wise addition ($v_{add}[i] = v_1[i] + v_2[i]$) and the element-wise multiplication ($v_{mul}[i] = v_1[i] \times v_2[i]$) of two vectors.

Each thread will process a subset of the input vectors and the partial results will be stored in two vectors (one for the addition and the other one for the multiplication).

class ArithmFunctor {
public:
  ArithmFunctor(const vpColVector &v1, const vpColVector &v2, const unsigned int start, const unsigned int end) :
    m_add(), m_mul(), m_v1(v1), m_v2(v2), m_indexStart(start), m_indexEnd(end) {
  }
  ArithmFunctor() : m_add(), m_mul(), m_v1(), m_v2(), m_indexStart(0), m_indexEnd(0) {
  }
  void operator()() {
    computeImpl();
  }
  vpColVector getVectorAdd() const {
    return m_add;
  }
  vpColVector getVectorMul() const {
    return m_mul;
  }
private:
  vpColVector m_add; // partial result of the element-wise addition
  vpColVector m_mul; // partial result of the element-wise multiplication
  vpColVector m_v1;  // copy of the first input vector
  vpColVector m_v2;  // copy of the second input vector
  unsigned int m_indexStart;
  unsigned int m_indexEnd;
  void computeImpl() {
    m_add.resize(m_indexEnd - m_indexStart);
    m_mul.resize(m_indexEnd - m_indexStart);
    // to simulate a long computation
    for (int iter = 0; iter < 100; iter++) {
      for (unsigned int i = m_indexStart, cpt = 0; i < m_indexEnd; i++, cpt++) {
        m_add[cpt] = m_v1[i] + m_v2[i];
        m_mul[cpt] = m_v1[i] * m_v2[i];
      }
    }
  }
};

The required arguments needed by the constructor are the two input vectors, the start index and the end index that will define the portion of the vector to be processed by the current thread. Two getters are used to retrieve the results at the end of the computation.

Let's see now how to create and initialize the threads:

std::vector<vpThread> threads(nb_threads);
std::vector<ArithmFunctor> functors(nb_threads);
unsigned int split = size / nb_threads;
for (unsigned int i = 0; i < nb_threads; i++) {
  if (i < nb_threads-1) {
    functors[i] = ArithmFunctor(v1, v2, i*split, (i+1)*split);
  } else {
    functors[i] = ArithmFunctor(v1, v2, i*split, size);
  }
  threads[i].create((vpThread::Fn) arithmThread, (vpThread::Args) &functors[i]);
}
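
The index arithmetic in this loop can be isolated in a small function to check which portion each thread receives. This is only a sketch: splitRanges() is a hypothetical helper that is not part of ViSP or of testThread2.cpp, and it assumes nb_threads is greater than zero.

```cpp
#include <utility>
#include <vector>

// Returns one [start, end) index range per thread. Every thread gets
// size / nb_threads elements; the last one also takes the remainder.
// Assumes nb_threads > 0.
std::vector<std::pair<unsigned int, unsigned int> >
splitRanges(unsigned int size, unsigned int nb_threads)
{
  std::vector<std::pair<unsigned int, unsigned int> > ranges;
  const unsigned int split = size / nb_threads;
  for (unsigned int i = 0; i < nb_threads; i++) {
    const unsigned int start = i * split;
    const unsigned int end = (i < nb_threads - 1) ? (i + 1) * split : size;
    ranges.push_back(std::make_pair(start, end));
  }
  return ranges;
}
```

For example, with size = 10 and nb_threads = 3 the ranges are [0, 3), [3, 6) and [6, 10). Note that when size is smaller than nb_threads, split is 0 and the last thread processes the whole vector.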

The routine arithmThread() executed by each thread is defined as follows:

vpThread::Return arithmThread(vpThread::Args args) {
  ArithmFunctor* f = static_cast<ArithmFunctor*>(args);
  (*f)();
  return 0;
}

This routine is called by the threading library. We cast the argument passed to the thread routine and we call the function that needs to be executed by the thread.
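
The cast-and-call pattern can be tried in isolation. The sketch below assumes the pthread convention of a void* argument and a void* return value, and uses a hypothetical SumFunctor in place of ArithmFunctor so that it builds without ViSP. The entry point is meant to be called directly here rather than handed to a thread.

```cpp
#include <cstddef>

// Hypothetical functor: sums the integers in [first, last).
class SumFunctor {
public:
  SumFunctor(int first, int last) : m_first(first), m_last(last), m_sum(0) {}
  void operator()() {
    m_sum = 0;
    for (int i = m_first; i < m_last; i++) {
      m_sum += i;
    }
  }
  int getSum() const { return m_sum; }
private:
  int m_first;
  int m_last;
  int m_sum;
};

// Thread-style entry point: the functor arrives as an untyped pointer,
// is cast back to its real type, and is then invoked.
void *sumThread(void *args) {
  SumFunctor *f = static_cast<SumFunctor *>(args);
  (*f)();
  return NULL;
}
```

Since the functor is passed by address, the results stay in the caller's object and remain available through the getter once the routine has returned. The object must therefore outlive the thread that works on it.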

To get the results:

vpColVector res_add, res_mul;
for (size_t i = 0; i < nb_threads; i++) {
  threads[i].join();
  insert(res_add, functors[i].getVectorAdd());
  insert(res_mul, functors[i].getVectorMul());
}

After joining a thread, its partial results can be retrieved by calling the appropriate getter function. Here insert() is a helper function of the example, not reproduced in this tutorial, that appends a partial result to the end of the output vector.
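
As an illustration of this concatenation step, here is a hypothetical append() helper written for std::vector<double> rather than vpColVector, so that it builds without ViSP. It is a sketch of the idea, not the helper used in testThread2.cpp.

```cpp
#include <vector>

// Appends the elements of part to the end of result, preserving their order.
void append(std::vector<double> &result, const std::vector<double> &part) {
  result.insert(result.end(), part.begin(), part.end());
}
```

Because the threads are joined in index order and each one handled a contiguous range, appending the partial results in that same order rebuilds the full-length vectors.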

Warning
You cannot create the thread directly like this:

threads[i] = vpThread((vpThread::Fn) arithmThread, (vpThread::Args) &functors[i]);

nor like this:

threads.push_back(vpThread((vpThread::Fn) arithmThread, (vpThread::Args) &functors[i]));

because these lines of code create a temporary vpThread object that is copied into the vector and then destroyed. The vpThread destructor automatically calls join(), so each thread would be created, started and joined as soon as its temporary vpThread object is destroyed. The threads would therefore run one after the other instead of in parallel.
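
The order of events can be observed with a toy class that logs its construction and destruction. FakeThread is a hypothetical stand-in that starts no real thread. It only mimics an object that "starts" in its constructor and "joins" in its destructor, which is the behavior described above.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a thread object: "starts" in the constructor and
// "joins" in the destructor, recording both events in a shared log.
struct FakeThread {
  FakeThread(unsigned int id, std::vector<std::string> *log) : m_id(id), m_log(log) {
    m_log->push_back("start " + std::to_string(m_id));
  }
  ~FakeThread() { m_log->push_back("join " + std::to_string(m_id)); }
  unsigned int m_id;
  std::vector<std::string> *m_log;
};

// Pushes n temporaries into a vector and returns the events logged so far,
// captured before the vector itself is destroyed.
std::vector<std::string> eventsWithTemporaries(unsigned int n) {
  std::vector<std::string> log;
  std::vector<std::string> snapshot;
  {
    std::vector<FakeThread> threads;
    threads.reserve(n); // avoid reallocation, which would copy and destroy elements
    for (unsigned int i = 0; i < n; i++) {
      // The temporary is copied into the vector, then destroyed
      // at the end of this statement.
      threads.push_back(FakeThread(i, &log));
    }
    snapshot = log;
  }
  return snapshot;
}
```

For n = 2 the recorded events are "start 0", "join 0", "start 1", "join 1": every join happens before the next start. Creating the vector of thread objects first and calling create() on each element, as done above, avoids the temporaries altogether.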

Multi-threaded capture and display

Note that all the material (source code) described in this section is part of the ViSP source code and can be downloaded using the following command:

$ svn export https://github.com/lagadic/visp.git/trunk/tutorial/grabber

The following example, implemented in tutorial-grabber-opencv-threaded.cpp, shows how to implement a multi-threaded application in which image capture is executed in one thread and image display in another. The capture is performed here with the OpenCV cv::VideoCapture class. It can easily be adapted to other framegrabbers available in ViSP. In tutorial-grabber-v4l2-threaded.cpp you will find the same example using vpV4l2Grabber. To adapt the code to other framegrabbers, see Tutorial: Image frame grabbing.

Hereafter we explain how tutorial-grabber-opencv-threaded.cpp works.

Includes and declarations

First we include the ViSP headers corresponding to the classes we will use: vpImageConvert to convert OpenCV images into ViSP images, vpMutex to protect the data shared between the threads, vpThread to create the threads, vpTime to handle time, vpDisplayX to display images under Unix-like OS, and vpDisplayGDI to display images under Windows.

Then, if OpenCV 2.1.0 or higher is found, we include the OpenCV highgui.hpp header, which brings in the cv::VideoCapture class used in this example for image capture.

We then declare the shared data, with variable names prefixed by "s_": s_capture_state, indicating whether capture is waiting, in progress or stopped; s_frame, the image that is currently captured; and s_mutex_capture, the mutex used to protect these shared variables from concurrent access.

#include <iostream>
#include <visp3/core/vpImageConvert.h>
#include <visp3/core/vpMutex.h>
#include <visp3/core/vpThread.h>
#include <visp3/core/vpTime.h>
#include <visp3/gui/vpDisplayX.h>
#include <visp3/gui/vpDisplayGDI.h>
#if (VISP_HAVE_OPENCV_VERSION >= 0x020100) && (defined(VISP_HAVE_PTHREAD) || defined(_WIN32))
#include <opencv2/highgui/highgui.hpp>
// Shared vars
typedef enum {
  capture_waiting,
  capture_started,
  capture_stopped
} t_CaptureState;
t_CaptureState s_capture_state = capture_waiting;
cv::Mat s_frame;
vpMutex s_mutex_capture;

Capture thread

Then we implement captureFunction(), the capture function that we want to run in a separate thread. As argument, this function receives a pointer to the cv::VideoCapture object that was created in the Main thread.

Note
We noticed that cv::VideoCapture cannot be instantiated outside the Main thread. That is why the cv::VideoCapture object is passed through the arguments of captureFunction(). With the ViSP vp1394TwoGrabber, vp1394CMUGrabber, vpFlyCaptureGrabber and vpV4l2Grabber capture classes, it would be possible to instantiate the object in the capture function.

We check that the capture was able to find a camera thanks to cap.isOpened(), and start a 30-second capture loop that fills frame_ with the image from the camera. The capture can be stopped before the 30 seconds have elapsed if the stop_capture_ boolean is set to true. Once an image is captured, we update the shared data under the protection of the mutex. After the while loop, we also set the capture state to capture_stopped so that the display thread finishes.

vpThread::Return captureFunction(vpThread::Args args)
{
  cv::VideoCapture cap = *((cv::VideoCapture *) args);
  if(!cap.isOpened()) { // check if we succeeded
    std::cout << "Unable to start capture" << std::endl;
    return 0;
  }
  cv::Mat frame_;
  int i=0;
  while ((i++ < 100) && !cap.read(frame_)) {} // warm up camera by skipping unread frames
  bool stop_capture_ = false;
  double start_time = vpTime::measureTimeSecond();
  while ((vpTime::measureTimeSecond() - start_time) < 30 && !stop_capture_) {
    // Capture in progress
    cap >> frame_; // get a new frame from camera
    // Update shared data
    {
      vpMutex::vpScopedLock lock(s_mutex_capture);
      if (s_capture_state == capture_stopped)
        stop_capture_ = true;
      else
        s_capture_state = capture_started;
      s_frame = frame_;
    }
  }
  {
    vpMutex::vpScopedLock lock(s_mutex_capture);
    s_capture_state = capture_stopped;
  }
  std::cout << "End of capture thread" << std::endl;
  return 0;
}

Display thread

We then implement displayFunction(), used to display the captured images. This function does not use its argument. Depending on the OS, we create a display pointer to the class that we want to use (vpDisplayX or vpDisplayGDI). We then enter a do-while loop that ends when the capture is stopped, meaning that the Capture thread has finished.

In the display loop, we first use the mutex to take a copy of the shared variable s_capture_state so that it can be used just afterwards. When capture is started, we convert the OpenCV cv::Mat image into a local ViSP image I_. Since this accesses the shared s_frame data, the conversion is protected by the mutex. Then, with the first available ViSP image I_, we initialize the display and set the display_initialized_ boolean to true, indicating that the display is now initialized. Next we update the display with the content of the image. When capture is not started, we just sleep for 2 milliseconds.

vpThread::Return displayFunction(vpThread::Args args)
{
  (void)args; // Avoid warning: unused parameter args
  vpImage<unsigned char> I_; // Local ViSP image used by the display
  t_CaptureState capture_state_;
  bool display_initialized_ = false;
#if defined(VISP_HAVE_X11)
  vpDisplayX *d_ = NULL;
#elif defined(VISP_HAVE_GDI)
  vpDisplayGDI *d_ = NULL;
#endif
  do {
    s_mutex_capture.lock();
    capture_state_ = s_capture_state;
    s_mutex_capture.unlock();
    // Check if a frame is available
    if (capture_state_ == capture_started) {
      // Get the frame and convert it to a ViSP image used by the display class
      {
        vpMutex::vpScopedLock lock(s_mutex_capture);
        vpImageConvert::convert(s_frame, I_);
      }
      // Check if we need to initialize the display with the first frame
      if (! display_initialized_) {
        // Initialize the display
#if defined(VISP_HAVE_X11)
        d_ = new vpDisplayX(I_);
        display_initialized_ = true;
#elif defined(VISP_HAVE_GDI)
        d_ = new vpDisplayGDI(I_);
        display_initialized_ = true;
#endif
      }
      // Display the image
      vpDisplay::display(I_);
      // Trigger end of acquisition with a mouse click
      vpDisplay::displayText(I_, 10, 10, "Click to exit...", vpColor::red);
      if (vpDisplay::getClick(I_, false)) {
        vpMutex::vpScopedLock lock(s_mutex_capture);
        s_capture_state = capture_stopped;
      }
      // Update the display
      vpDisplay::flush(I_);
    }
    else {
      vpTime::wait(2); // Sleep 2ms
    }
  } while(capture_state_ != capture_stopped);
#if defined(VISP_HAVE_X11) || defined(VISP_HAVE_GDI)
  delete d_;
#endif
  std::cout << "End of display thread" << std::endl;
  return 0;
}
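
The handshake between the capture and display threads can be condensed into three small operations on the shared data. The sketch below is a simplified illustration under stated assumptions: std::mutex and std::lock_guard from C++11 stand in for vpMutex and vpMutex::vpScopedLock so that it builds without ViSP, an int stands in for the image, and SharedCapture, publishFrame(), fetchFrame() and requestStop() are hypothetical names that do not appear in the tutorial source.

```cpp
#include <mutex>

enum CaptureState { capture_waiting, capture_started, capture_stopped };

// Hypothetical bundle of the shared data; an int stands in for the image.
struct SharedCapture {
  SharedCapture() : state(capture_waiting), frame(0) {}
  std::mutex mutex;
  CaptureState state;
  int frame;
};

// Producer side: publish a new frame unless a stop was requested.
// Returns false when the producer should leave its loop.
bool publishFrame(SharedCapture &shared, int frame) {
  std::lock_guard<std::mutex> lock(shared.mutex);
  if (shared.state == capture_stopped) {
    return false;
  }
  shared.state = capture_started;
  shared.frame = frame;
  return true;
}

// Consumer side: copy the frame out under the lock, then work on the copy.
// Returns false when no frame is available.
bool fetchFrame(SharedCapture &shared, int &frame) {
  std::lock_guard<std::mutex> lock(shared.mutex);
  if (shared.state != capture_started) {
    return false;
  }
  frame = shared.frame;
  return true;
}

// Either side may request the end of the acquisition.
void requestStop(SharedCapture &shared) {
  std::lock_guard<std::mutex> lock(shared.mutex);
  shared.state = capture_stopped;
}
```

Unlike the tutorial code, fetchFrame() checks the state and copies the frame within a single critical section. In both versions the time-consuming work, here the display, is done on a local copy so that the mutex is held only briefly.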

Main thread

The main thread is the one implemented in the main() function. We first handle the command line option "--device <camera device>", which lets the user select a specific camera when more than one camera is connected. Then, as explained in Capture thread, we need to create the cv::VideoCapture object in main(). Finally, captureFunction() and displayFunction() are started as two separate threads, one for the capture and another for the display, using the vpThread constructor.

The calls to join() make main() wait until the capture and display threads have ended before returning.

int main(int argc, const char* argv[])
{
  int opt_device = 0;
  // Command line options
  for (int i=0; i<argc; i++) {
    if (std::string(argv[i]) == "--device" && i+1 < argc) // the option needs a value after it
      opt_device = atoi(argv[i+1]);
    else if (std::string(argv[i]) == "--help") {
      std::cout << "Usage: " << argv[0] << " [--device <camera device>] [--help]" << std::endl;
      return 0;
    }
  }
  // Instantiate the capture
  cv::VideoCapture cap;
  cap.open(opt_device);
  // Start the threads
  vpThread thread_capture((vpThread::Fn)captureFunction, (vpThread::Args)&cap);
  vpThread thread_display((vpThread::Fn)displayFunction);
  // Wait until the threads end
  thread_capture.join();
  thread_display.join();
  return 0;
}

Once built, run this tutorial from a terminal:

cd <visp-build-tree>/tutorial/grabber
./tutorial-grabber-opencv-threaded --help
./tutorial-grabber-opencv-threaded --device 0

where "--device 0" can be omitted since it is the default option.

Extension to face detection

Note that all the material (source code) described in this section is part of the ViSP source code and can be downloaded using the following command:

$ svn export https://github.com/lagadic/visp.git/trunk/tutorial/detection/face

The example given in the previous section, Multi-threaded capture and display, can be extended to add image processing. In this section, we illustrate the case of the face detection described in Tutorial: Face detection and implemented in tutorial-face-detector-live.cpp as a single main thread. We now extend this example using multi-threading, so that face detection runs in a separate thread. The complete source code is given in tutorial-face-detector-live-threaded.cpp.

Hereafter we describe the changes introduced in tutorial-face-detector-live-threaded.cpp to add a new thread dedicated to face detection.

Face detection thread

The function that does the face detection is implemented in detectionFunction(). We first instantiate an object of type vpDetectorFace. Then, in the loop, we call the face detection function face_detector_.detect() when a new image is available. When faces are found, we retrieve the bounding box of the first face, which is the largest in the image. We update the shared s_face_bbox variable with this bounding box. This variable is then used in the display thread, where it is drawn as a rectangle.

vpThread::Return detectionFunction(vpThread::Args args)
{
  std::string opt_face_cascade_name = *((std::string *) args);
  vpDetectorFace face_detector_;
  face_detector_.setCascadeClassifierFile(opt_face_cascade_name);
  t_CaptureState capture_state_;
#if defined(VISP_HAVE_V4L2)
  vpImage<unsigned char> frame_;
#elif defined(VISP_HAVE_OPENCV)
  cv::Mat frame_;
#endif
  do {
    s_mutex_capture.lock();
    capture_state_ = s_capture_state;
    s_mutex_capture.unlock();
    // Check if a frame is available
    if (capture_state_ == capture_started) {
      // Backup the frame
      {
        vpMutex::vpScopedLock lock(s_mutex_capture);
        frame_ = s_frame;
      }
      // Detect faces
      bool face_found_ = face_detector_.detect(frame_);
      if (face_found_) {
        vpMutex::vpScopedLock lock(s_mutex_face);
        s_face_available = true;
        s_face_bbox = face_detector_.getBBox(0); // Get largest face bounding box
      }
    }
    else {
      vpTime::wait(2); // Sleep 2ms
    }
  } while(capture_state_ != capture_stopped);
  std::cout << "End of face detection thread" << std::endl;
  return 0;
}

Main thread

The main() is modified to call the detectionFunction() in a third thread.

Note
Compared to the Main thread used in tutorial-grabber-opencv-threaded.cpp, we modify main() here so that images are captured from a webcam with Video For Linux 2 (V4L2) when it is available (only on Linux-like OS), or with OpenCV cv::VideoCapture when V4L2 is not available.

int main(int argc, const char* argv[])
{
  std::string opt_face_cascade_name = "./haarcascade_frontalface_alt.xml";
  unsigned int opt_device = 0;
  unsigned int opt_scale = 2; // Default value is 2 in the constructor. Turn it to 1 to avoid subsampling
  for (int i=0; i<argc; i++) {
    // Options that take a value are only read when a value follows them
    if (std::string(argv[i]) == "--haar" && i+1 < argc)
      opt_face_cascade_name = std::string(argv[i+1]);
    else if (std::string(argv[i]) == "--device" && i+1 < argc)
      opt_device = (unsigned int)atoi(argv[i+1]);
    else if (std::string(argv[i]) == "--scale" && i+1 < argc)
      opt_scale = (unsigned int)atoi(argv[i+1]);
    else if (std::string(argv[i]) == "--help") {
      std::cout << "Usage: " << argv[0] << " [--haar <haarcascade xml filename>] [--device <camera device>] [--scale <subsampling factor>] [--help]" << std::endl;
      return 0;
    }
  }
  // Instantiate the capture
#if defined(VISP_HAVE_V4L2)
  vpV4l2Grabber cap;
  std::ostringstream device;
  device << "/dev/video" << opt_device;
  cap.setDevice(device.str());
  cap.setScale(opt_scale);
#elif defined(VISP_HAVE_OPENCV)
  cv::VideoCapture cap;
  cap.open(opt_device);
#  if (VISP_HAVE_OPENCV_VERSION >= 0x030000)
  int width = (int)cap.get(cv::CAP_PROP_FRAME_WIDTH);
  int height = (int)cap.get(cv::CAP_PROP_FRAME_HEIGHT);
  cap.set(cv::CAP_PROP_FRAME_WIDTH, width/opt_scale);
  cap.set(cv::CAP_PROP_FRAME_HEIGHT, height/opt_scale);
#  else
  int width = cap.get(CV_CAP_PROP_FRAME_WIDTH);
  int height = cap.get(CV_CAP_PROP_FRAME_HEIGHT);
  cap.set(CV_CAP_PROP_FRAME_WIDTH, width/opt_scale);
  cap.set(CV_CAP_PROP_FRAME_HEIGHT, height/opt_scale);
#  endif
#endif
  // Start the threads
  vpThread thread_capture((vpThread::Fn)captureFunction, (vpThread::Args)&cap);
  vpThread thread_display((vpThread::Fn)displayFunction);
  vpThread thread_detection((vpThread::Fn)detectionFunction, (vpThread::Args)&opt_face_cascade_name);
  // Wait until the threads end
  thread_capture.join();
  thread_display.join();
  thread_detection.join();
  return 0;
}

To run the binary, open a terminal and run:

cd <visp-build-tree>/tutorial/detection/face
./tutorial-face-detector-live-threaded --help
./tutorial-face-detector-live-threaded