This tutorial describes model-based tracking of an object using multiple camera views simultaneously. The object is tracked in the images acquired by a set of cameras, and its 3D localization is estimated at the same time. The cameras must be calibrated: the intrinsic parameters of each camera and the extrinsic transformation between the reference camera and every other camera are required.
The mbt ViSP module tracks a markerless object using knowledge of its CAD model. The object has to be modeled with segment, circle or cylinder primitives. The model can be defined in vrml format (except for circles) or in cao (our own format).
The next section highlights the different versions of the markerless multi-view model-based trackers that have been developed. The multi-view model-based tracker can consider moving-edges (the vpMbEdgeMultiTracker class). It can also consider KLT features that are detected and tracked on each visible face of the model (the vpMbKltMultiTracker class). Finally, it can handle moving-edges and KLT features in a hybrid scheme (the vpMbEdgeKltMultiTracker class).
The multi-view model-based edge tracker implemented in vpMbEdgeMultiTracker is appropriate for texture-less objects with visible edges, whereas the multi-view model-based KLT tracker implemented in vpMbKltMultiTracker is suitable for textured objects. The multi-view model-based hybrid tracker implemented in vpMbEdgeKltMultiTracker is appropriate for objects with texture and/or visible edges.
These classes track the same object with two or more cameras. The main advantages of this configuration with respect to the mono-camera case (see Tutorial: Markerless model-based tracking (deprecated)) are:
the possibility to extend the application field of view;
a more robust tracking, since the camera rig observes the object from multiple viewpoints and thus provides more visual features.
To achieve this, the following information is required:
the intrinsic parameters of each camera;
the transformation matrix between each camera and a reference camera: \( {}^{c_{current}}\mathbf{M}_{c_{reference}} \).
In the following sections, we consider the tracking of a tea box modeled in cao format and observed by a stereo camera. The following video shows the tracking performed with vpMbEdgeMultiTracker. In this example, the images were captured by the fixed cameras located in the head of the Romeo humanoid robot.
This other video shows the behavior of the hybrid tracking performed with vpMbEdgeKltMultiTracker.
Note
The cameras can move; the tracking remains effective as long as the transformation matrix between each camera and the reference camera is known and updated at each iteration (see How to deal with moving cameras).
The newly introduced classes are not restricted to a stereo configuration. They can be used with any number of cameras.
The next sections show how to adapt your code to use multiple cameras with the model-based tracker. Since only the new methods dedicated to multi-view tracking are presented, we highly recommend following Tutorial: Markerless model-based tracking (deprecated) first, in order to become familiar with the model-based tracking concepts, the different trackers available in ViSP (the edge tracker: vpMbEdgeTracker, the klt feature points tracker: vpMbKltTracker and the hybrid tracker: vpMbEdgeKltTracker) and the configuration part.
Note that all the material (source code and video) described in this tutorial is part of the ViSP source code and can be downloaded using the following command:
The model-based trackers available for multi-view tracking rely on the same trackers as in the monocular case:
a vpMbEdgeMultiTracker similar to vpMbEdgeTracker which tracks moving-edges corresponding to the visible lines of the model projected in the image plane at the current pose (suitable for texture-less objects).
a vpMbKltMultiTracker similar to vpMbKltTracker which uses the optical flow information to track the object (suitable for textured objects).
a vpMbEdgeKltMultiTracker similar to vpMbEdgeKltTracker which merges the two sources of information (edges and texture) for a more robust tracking (can deal with both types of objects).
The following class diagram offers an overview of the hierarchy between the different classes:
Simplified class diagram.
The vpMbEdgeMultiTracker class inherits from the vpMbEdgeTracker class, the vpMbKltMultiTracker class inherits from the vpMbKltTracker class, and the vpMbEdgeKltMultiTracker class inherits from both the vpMbEdgeMultiTracker and vpMbKltMultiTracker classes. This design makes it easy to extend the model-based tracker to multiple cameras while preserving the behavior of the monocular configuration. More precisely, only the model-based edge and model-based klt trackers are expected to behave identically; the hybrid multi-view class has a slightly different implementation that leads to minor differences compared to vpMbEdgeKltTracker.
As shown later, the main methods of the parent class remain accessible and are used for single-view tracking. Many new overridden methods have been introduced to deal with the different camera configurations (single camera, stereo cameras and multiple cameras).
Implementation detail
Each tracker is stored in a map whose key is the name of the camera that the tracker processes. By default, the camera names are set to:
"Camera" when the tracker is constructed with one camera.
"Camera1" to "CameraN" when the tracker is constructed with N cameras.
The default reference camera will be "Camera1" in the multiple cameras case.
Default name convention and reference camera ("Camera1").
To deal with multiple cameras, the virtual visual servoing control law concatenates all the interaction matrices and residual vectors and expresses them in a single reference camera frame in order to compute the velocity of the reference camera. This is why the transformation matrix between each camera and the reference camera has to be known.
For example, if the reference camera is "Camera1" (\( c_1 \)), we need the following information: \( {}^{c_2}\mathbf{M}_{c_1}, \ldots, {}^{c_N}\mathbf{M}_{c_1} \).
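The concatenation described above can be written out as follows. This is a sketch under assumed notation rather than the exact formulation used inside ViSP: \( \mathbf{L}_i \) and \( \mathbf{e}_i \) denote the interaction matrix and residual vector of camera \( c_i \), \( \lambda \) is a positive gain, \( {}^{c_i}\mathbf{V}_{c_1} \) is the velocity twist matrix built from the camera transformation, and any robust weighting of the residuals is left out.

```latex
% Assumed notation (not taken from the tutorial): L_i, e_i, lambda, V.
\begin{aligned}
{}^{c_i}\mathbf{M}_{c_1} &=
  \begin{bmatrix} \mathbf{R}_i & \mathbf{t}_i \\ \mathbf{0} & 1 \end{bmatrix},
\qquad
{}^{c_i}\mathbf{V}_{c_1} =
  \begin{bmatrix} \mathbf{R}_i & [\mathbf{t}_i]_\times \mathbf{R}_i \\
                  \mathbf{0}_{3\times 3} & \mathbf{R}_i \end{bmatrix},
\qquad
\mathbf{v}_{c_i} = {}^{c_i}\mathbf{V}_{c_1}\,\mathbf{v}_{c_1} \\
\mathbf{e} &=
  \begin{bmatrix} \mathbf{e}_1 \\ \vdots \\ \mathbf{e}_N \end{bmatrix},
\qquad
\mathbf{L} =
  \begin{bmatrix} \mathbf{L}_1 \\ \mathbf{L}_2\,{}^{c_2}\mathbf{V}_{c_1} \\ \vdots \\
                  \mathbf{L}_N\,{}^{c_N}\mathbf{V}_{c_1} \end{bmatrix},
\qquad
\mathbf{v}_{c_1} = -\lambda\,\mathbf{L}^{+}\,\mathbf{e}
\end{aligned}
```

The first block row carries no twist matrix because the transformation of the reference camera with respect to itself is the identity.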
Interfacing with the code
Each essential method used to initialize the tracker and run the tracking has three signatures, one for each of the three working modes, in order to ease the call:
tracking using one camera: the signature is the same as in the parent classes (vpMbEdgeTracker, vpMbKltTracker, vpMbEdgeKltTracker).
tracking using two cameras: all the necessary methods directly accept the corresponding parameter for each camera. By default, the first parameter corresponds to the reference camera.
tracking using multiple cameras: you have to supply the different parameters in a map. The key corresponds to the name of the camera and the value to the parameter.
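The three calling styles can be illustrated with a small stand-alone sketch. ToyMultiTracker and ToyCameraParams are hypothetical names invented for this illustration and are not ViSP classes; the sketch only mimics the idea of one overload per working mode and the default camera names given earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a per-camera parameter (NOT a ViSP type).
struct ToyCameraParams {
  double px;
  double py;
};

// Hypothetical toy class (NOT the ViSP API): one overload per working mode.
class ToyMultiTracker {
public:
  // Mode 1: one camera, stored under the default name "Camera".
  void setCameraParameters(const ToyCameraParams &cam) {
    m_params.clear();
    m_params["Camera"] = cam;
  }

  // Mode 2: two cameras, the first argument being the reference camera.
  void setCameraParameters(const ToyCameraParams &cam1, const ToyCameraParams &cam2) {
    m_params.clear();
    m_params["Camera1"] = cam1;
    m_params["Camera2"] = cam2;
  }

  // Mode 3: any number of cameras, keyed by camera name.
  void setCameraParameters(const std::map<std::string, ToyCameraParams> &params) {
    assert(!params.empty()); // at least one camera is needed
    m_params = params;
  }

  std::size_t getNbCameras() const { return m_params.size(); }
  bool hasCamera(const std::string &name) const { return m_params.count(name) > 0; }

private:
  std::map<std::string, ToyCameraParams> m_params;
};
```

Calling the two-argument overload registers "Camera1" and "Camera2", while the map overload lets the caller choose the camera names freely.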
The following table sums up how to call the different methods based on the camera configuration for the main functions.
As the trackers are stored internally in alphabetical order of camera name, in the stereo case you have to make sure that the order of the method parameters matches the position of the corresponding tracker in the map.
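The effect of alphabetical ordering can be checked with a short stand-alone sketch. It assumes that the internal container behaves like a std::map keyed by camera name, which is an assumption about the implementation; defaultCameraNames and mapIterationOrder are hypothetical helpers written for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Builds the default camera names: "Camera" for a single camera,
// "Camera1" ... "CameraN" otherwise.
std::vector<std::string> defaultCameraNames(unsigned int nbCameras) {
  assert(nbCameras >= 1);
  std::vector<std::string> names;
  if (nbCameras == 1) {
    names.push_back("Camera");
    return names;
  }
  for (unsigned int i = 1; i <= nbCameras; ++i) {
    names.push_back("Camera" + std::to_string(i));
  }
  return names;
}

// Returns the names in the order in which a std::map keyed by name iterates
// over them (assumption: the tracker container behaves like std::map).
std::vector<std::string> mapIterationOrder(const std::vector<std::string> &names) {
  std::map<std::string, int> byName;
  for (std::size_t i = 0; i < names.size(); ++i) {
    byName[names[i]] = static_cast<int>(i);
  }
  std::vector<std::string> ordered;
  for (const auto &entry : byName) {
    ordered.push_back(entry.first);
  }
  return ordered;
}
```

With the default names and two cameras the order is "Camera1" then "Camera2", as expected. With custom names, or with ten or more default names, the string order can differ from the intuitive one: "Camera10" sorts before "Camera2".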
Example code
The following example comes from tutorial-mb-tracker-stereo.cpp and tracks a tea box modeled in cao format using one of the three multi-view markerless trackers implemented in ViSP. In this example we consider a stereo configuration.
Once built, to choose which tracker to use, run the binary with the following argument:
std::cout << "klt and hybrid model-based tracker are not available "
"since visp_klt module is missing"
<< std::endl;
return 0;
}
#endif
Note
We used a pointer to vpMbTracker to be able to construct a tracker according to the desired type (edge, klt or hybrid), but you can also directly declare the desired tracker class in your program.
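The idea of selecting the tracker type at run time through a base-class pointer can be sketched as follows. The classes below are hypothetical stand-ins and not the ViSP classes; the TrackerType enumeration and the makeTracker function are likewise assumptions made for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins (NOT ViSP classes) for a tracker base class and
// its three specialisations.
struct ToyTracker {
  virtual ~ToyTracker() {}
  virtual std::string name() const = 0;
};
struct ToyEdgeTracker : ToyTracker {
  std::string name() const override { return "edge"; }
};
struct ToyKltTracker : ToyTracker {
  std::string name() const override { return "klt"; }
};
struct ToyHybridTracker : ToyTracker {
  std::string name() const override { return "hybrid"; }
};

enum class TrackerType { Edge, Klt, Hybrid };

// Builds the requested tracker and returns it through a base-class pointer,
// so that the rest of the program does not depend on the concrete type.
std::unique_ptr<ToyTracker> makeTracker(TrackerType type) {
  switch (type) {
  case TrackerType::Edge:
    return std::unique_ptr<ToyTracker>(new ToyEdgeTracker());
  case TrackerType::Klt:
    return std::unique_ptr<ToyTracker>(new ToyKltTracker());
  case TrackerType::Hybrid:
    return std::unique_ptr<ToyTracker>(new ToyHybridTracker());
  }
  assert(false && "unknown tracker type");
  return nullptr;
}
```

The calling code only manipulates the base-class interface, which is what allows a single program to switch between the three trackers.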
All the configuration parameters for the tracker are stored in xml configuration files. To load the different files, we use:
We have to set the transformation matrices between the cameras and the reference camera to be able to compute the control law in the reference camera frame. In the code, the left camera, named "Camera1", is the reference camera. For the right camera, named "Camera2", we have to set the transformation (\( {}^{c_2}\mathbf{M}_{c_1} \)). This transformation is read from the cRightMcLeft.txt file. Since our left and right cameras are not moving, this transformation is constant and does not need to be updated in the tracking loop:
Note
For the reference camera, the camera transformation matrix has to be specified as an identity homogeneous matrix (no rotation, no translation). By default the vpHomogeneousMatrix constructor builds an identity matrix.
How to deal with moving cameras
The principle remains the same as with static cameras. You have to supply the camera transformation matrices to the tracker each time the cameras move, before calling the track method:
mapOfCamTrans["Camera1"] = vpHomogeneousMatrix(); //The Camera1 is the reference camera.
mapOfCamTrans["Camera2"] = get_c2Mc1(); //Get the new transformation between the two cameras.
This information can be obtained from the robot kinematics or from different kinds of sensors.
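As an illustration of how such a transformation could be derived from kinematics, the sketch below assumes that the pose of each camera is known in a common frame h (for instance a robot head frame) and computes c2Mc1 as the inverse of hMc2 composed with hMc1. The RigidTransform type and the helper functions are hypothetical and written in plain C++ for this example; with ViSP you would typically use vpHomogeneousMatrix instead.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Minimal rigid transform: rotation R (3x3) and translation t.
struct RigidTransform {
  std::array<std::array<double, 3>, 3> R;
  std::array<double, 3> t;
};

// Identity transform (no rotation, no translation).
RigidTransform makeIdentity() {
  RigidTransform T;
  for (int i = 0; i < 3; ++i) {
    T.t[i] = 0.0;
    for (int j = 0; j < 3; ++j) {
      T.R[i][j] = (i == j) ? 1.0 : 0.0;
    }
  }
  return T;
}

// Inverse of a rigid transform: rotation R^T, translation -R^T t.
RigidTransform inverse(const RigidTransform &T) {
  RigidTransform inv;
  for (int i = 0; i < 3; ++i) {
    inv.t[i] = 0.0;
    for (int j = 0; j < 3; ++j) {
      inv.R[i][j] = T.R[j][i];
      inv.t[i] -= T.R[j][i] * T.t[j];
    }
  }
  return inv;
}

// Composition aMc = aMb * bMc.
RigidTransform compose(const RigidTransform &aMb, const RigidTransform &bMc) {
  RigidTransform aMc;
  for (int i = 0; i < 3; ++i) {
    aMc.t[i] = aMb.t[i];
    for (int j = 0; j < 3; ++j) {
      aMc.R[i][j] = 0.0;
      for (int k = 0; k < 3; ++k) {
        aMc.R[i][j] += aMb.R[i][k] * bMc.R[k][j];
      }
      aMc.t[i] += aMb.R[i][j] * bMc.t[j];
    }
  }
  return aMc;
}

// Hypothetical helper: c2Mc1 = (hMc2)^-1 * hMc1, where h is a common frame.
RigidTransform computeC2Mc1(const RigidTransform &hMc1, const RigidTransform &hMc2) {
  return compose(inverse(hMc2), hMc1);
}
```

In a tracking loop, the two poses would be refreshed from the joint measurements at each iteration and the resulting transformation passed to the tracker before calling the track method.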
The following video shows the stereo hybrid model-based tracking based on object edges and KLT features located on visible faces. The result of the tracking is then used to servo the Romeo humanoid robot eyes to gaze toward the object. The images were captured by cameras located in the Romeo eyes.