Introduction
Getting started
Tracker CAD model
Tracker initialization
- Initialization by user click
- Initialization by external pose
Tracker settings
Advanced
Use case
Known issues
- Model-based trackers examples are not working with Ogre visibility check
- Model-based trackers tutorials are not working with Ogre visibility check
Next tutorial

Introduction

ViSP allows tracking a markerless object using the knowledge of its CAD model while simultaneously providing its 3D localization (i.e., the object pose expressed in the camera frame) when a calibrated camera is used [8], [45]. The considered objects should be modeled by lines, circles or cylinders. The CAD model of the object can be defined in vrml format (except for circles), or in cao format (a home-made format).

This tutorial focuses on the vpMbGenericTracker class that was introduced in ViSP 3.1.0. This class provides a generic way to combine different kinds of visual features used as measurements by the model-based tracker, and it supports either a single camera or multiple cameras observing the object to track. It advantageously replaces vpMbEdgeTracker, vpMbKltTracker and vpMbEdgeKltTracker (the class mixing edges and keypoints). These classes will continue to exist in ViSP, but we don't recommend using them, since switching from one class to another may be laborious. If for one reason or another you still want to use these classes, we invite you to follow Tutorial: Markerless model-based tracking (deprecated).

In this tutorial, we will show how to use the vpMbGenericTracker class to track an object from images acquired by a monocular color camera using moving edges, keypoints, or a combination of both in a hybrid scheme. To illustrate this tutorial we will consider that the object to track is a tea box.

Note that all the material (source code, input video, CAD model or XML settings files) described in this tutorial is part of the ViSP source code and can be downloaded using the following command:

$ svn export https://github.com/lagadic/visp.git/trunk/tutorial/tracking/model-based/generic

Features overview

Considering the use case of a monocular color camera, the tracker implemented in the vpMbGenericTracker class can combine the following visual features:

moving edges: image points tracked along the visible edges defined in the CAD model (line, face, cylinder and circle primitives) [8]. This feature is appropriate to track texture-less objects (with visible edges).
keypoints: they are tracked on the visible object faces using the KLT tracker (face and cylinder primitives) [39]. This feature is suitable for textured objects.

The moving-edges and KLT features require an RGB camera, but note that these features operate on grayscale images.

Note also that combining the visual features (moving edges + keypoints) can be a good way to improve the tracking robustness.

Considered third-parties

Depending on your use case the following optional third-parties may be used by the tracker. Make sure ViSP was built with the appropriate 3rd parties:

OpenCV: Essential if you want to use keypoints as visual features, since they are detected and tracked thanks to the KLT tracker. This 3rd party may also be useful to read input videos (mpeg, png, jpeg...).
Ogre 3D: This 3rd party is optional and can be difficult to install on OSX and Windows. To begin with the tracker we don't recommend installing it. Ogre 3D is only needed to enable Advanced visibility via Ogre3D.
Coin 3D: This 3rd party is also optional and difficult to install. That's why we don't recommend installing Coin 3D to begin with the tracker. Coin 3D is only needed to consider a CAD model in vrml format instead of the home-made cao format.

Input images data

For classical features working on grayscale images, you have to use the following data type:

vpImage<unsigned char> I;

You can convert to a grayscale image if the image acquired is in RGBa data type:

vpImage<vpRGBa> I_color;
// Color image acquisition
vpImage<unsigned char> I;
vpImageConvert::convert(I_color, I);

Since ViSP 3.2.1, it is also possible to consider color images as input with the following data type:

vpImage<vpRGBa> I;

Note: If you consider color images as input, the time required by the tracker to process one image will increase, since the color image is converted to a gray level image in the low level layers of the tracker.
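
To make the cost mentioned in this note concrete, here is a minimal sketch of a color to gray level conversion written in plain C++. The Rgba structure, the function names and the Rec. 709 luma weights are assumptions made for illustration; they are not taken from the tracker source code.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical 4-channel color pixel used only for this illustration.
struct Rgba {
  std::uint8_t r, g, b, a;
};

// Convert one pixel to a gray level with Rec. 709 luma weights (an assumption).
std::uint8_t pixelToGray(const Rgba &p)
{
  const double y = 0.2126 * p.r + 0.7152 * p.g + 0.0722 * p.b;
  return static_cast<std::uint8_t>(y + 0.5); // round to the nearest integer
}

// Convert a whole image: one extra pass over every pixel of every frame,
// which is where the additional processing time comes from.
std::vector<std::uint8_t> imageToGray(const std::vector<Rgba> &color)
{
  std::vector<std::uint8_t> gray;
  gray.reserve(color.size());
  for (const Rgba &p : color)
    gray.push_back(pixelToGray(p));
  return gray;
}
```

The alpha channel is ignored. When the acquisition already delivers grayscale images, this pass is simply not needed.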

Getting started

To start with the generic markerless model-based tracker we recommend studying the tutorial-mb-generic-tracker.cpp source code that is given and explained below.

Example input/output data

The tutorial-mb-generic-tracker.cpp example uses the following data as input:

a video file; teabox.mp4 is the default video.
a cad model that describes the object to track. In our case the file teabox.cao is the default one. See the Tracker CAD model section to learn how the teabox is modeled and the section CAD model in cao format to learn how to model another object.
a file with extension *.init that contains the 3D coordinates of some points used to compute an initial pose which serves to initialize the tracker. The user then has to click in the image on the corresponding 2D points. The default file is named teabox.init. The content of this file is detailed in the Initialization by user click section.
an optional image with extension *.ppm that may help the user to remember the location of the corresponding 3D points specified in the *.init file. To know more about this file see the Initialization by user click section.

As an output the tracker provides the pose $^c {\bf M}_o$, a 4 by 4 homogeneous matrix that corresponds to the geometric transformation between the frame attached to the object (in our case the tea box) and the frame attached to the camera. The pose is returned as a vpHomogeneousMatrix container.

Example source code

The following example, which comes from tutorial-mb-generic-tracker.cpp, tracks a tea box modeled in cao format using moving edges, keypoints, or both as visual features.

#include <cstdlib>
#include <iostream>
#include <string>
#include <visp3/core/vpIoTools.h>
#include <visp3/gui/vpDisplayGDI.h>
#include <visp3/gui/vpDisplayOpenCV.h>
#include <visp3/gui/vpDisplayX.h>
#include <visp3/io/vpImageIo.h>
#include <visp3/io/vpVideoReader.h>
#include <visp3/mbt/vpMbGenericTracker.h>
int main(int argc, char **argv)
{
#if defined(VISP_HAVE_OPENCV) && (VISP_HAVE_OPENCV_VERSION >= 0x020100)
  try {
    std::string opt_videoname = "model/teabox/teabox.mp4";
    std::string opt_modelname = "model/teabox/teabox.cao";
    int opt_tracker = 1;
    for (int i = 0; i < argc; i++) {
      // An option value is read only when an argument actually follows the option
      if (std::string(argv[i]) == "--video" && i + 1 < argc)
        opt_videoname = std::string(argv[i + 1]);
      else if (std::string(argv[i]) == "--model" && i + 1 < argc)
        opt_modelname = std::string(argv[i + 1]);
      else if (std::string(argv[i]) == "--tracker" && i + 1 < argc)
        opt_tracker = atoi(argv[i + 1]);
      else if (std::string(argv[i]) == "--help" || std::string(argv[i]) == "-h") {
        std::cout << "\nUsage: " << argv[0]
                  << " [--video <video name>] [--model <model name>]"
                     " [--tracker <0=edge|1=keypoint|2=hybrid>] [--help] [-h]\n"
                  << std::endl;
        return 0;
      }
    }
    std::string parentname = vpIoTools::getParent(opt_modelname);
    std::string objectname = vpIoTools::getNameWE(opt_modelname);
    if (!parentname.empty())
      objectname = parentname + "/" + objectname;
    std::cout << "Video name: " << opt_videoname << std::endl;
    std::cout << "Tracker requested config files: " << objectname << ".[init, cao]" << std::endl;
    std::cout << "Tracker optional config files: " << objectname << ".[ppm]" << std::endl;
    vpImage<unsigned char> I;
    vpCameraParameters cam;
    vpHomogeneousMatrix cMo;
    vpVideoReader g;
    g.setFileName(opt_videoname);
    g.open(I);
    vpDisplay *display = NULL;
#if defined(VISP_HAVE_X11)
    display = new vpDisplayX;
#elif defined(VISP_HAVE_GDI)
    display = new vpDisplayGDI;
#else
    display = new vpDisplayOpenCV;
#endif
    display->init(I, 100, 100, "Model-based tracker");
    vpMbGenericTracker tracker;
    if (opt_tracker == 0)
      tracker.setTrackerType(vpMbGenericTracker::EDGE_TRACKER);
#ifdef VISP_HAVE_MODULE_KLT
    else if (opt_tracker == 1)
      tracker.setTrackerType(vpMbGenericTracker::KLT_TRACKER);
    else
      tracker.setTrackerType(vpMbGenericTracker::EDGE_TRACKER | vpMbGenericTracker::KLT_TRACKER);
#else
    else {
      std::cout << "klt and hybrid model-based tracker are not available since visp_klt module is not available. "
                   "In CMakeGUI turn visp_klt module ON, configure and build ViSP again."
                << std::endl;
      return 0;
    }
#endif
    if (opt_tracker == 0 || opt_tracker == 2) {
      vpMe me;
      me.setMaskSize(5);
      me.setMaskNumber(180);
      me.setRange(8);
      me.setThreshold(10000);
      me.setMu1(0.5);
      me.setMu2(0.5);
      me.setSampleStep(4);
      tracker.setMovingEdge(me);
    }
#ifdef VISP_HAVE_MODULE_KLT
    if (opt_tracker == 1 || opt_tracker == 2) {
      vpKltOpencv klt_settings;
      klt_settings.setMaxFeatures(300);
      klt_settings.setWindowSize(5);
      klt_settings.setQuality(0.015);
      klt_settings.setMinDistance(8);
      klt_settings.setHarrisFreeParameter(0.01);
      klt_settings.setBlockSize(3);
      klt_settings.setPyramidLevels(3);
      tracker.setKltOpencv(klt_settings);
      tracker.setKltMaskBorder(5);
    }
#endif
    cam.initPersProjWithoutDistortion(839, 839, 325, 243);
    tracker.setCameraParameters(cam);
    tracker.loadModel(objectname + ".cao");
    tracker.setDisplayFeatures(true);
    tracker.initClick(I, objectname + ".init", true);
    while (!g.end()) {
      g.acquire(I);
      vpDisplay::display(I);
      tracker.track(I);
      tracker.getPose(cMo);
      tracker.getCameraParameters(cam);
      tracker.display(I, cMo, cam, vpColor::red, 2);
      vpDisplay::displayFrame(I, cMo, cam, 0.025, vpColor::none, 3);
      vpDisplay::displayText(I, 10, 10, "A click to exit...", vpColor::red);
      vpDisplay::flush(I);
      if (vpDisplay::getClick(I, false))
        break;
    }
    vpDisplay::getClick(I);
    delete display;
  } catch (const vpException &e) {
    std::cout << "Catch a ViSP exception: " << e << std::endl;
  }
#else
  (void)argc;
  (void)argv;
  std::cout << "Install OpenCV and rebuild ViSP to use this example." << std::endl;
#endif
}

Note: An extension of the previous getting started example is proposed in tutorial-mb-generic-tracker-full.cpp where advanced functionality such as reading tracker settings from an XML file or visibility computation are implemented.

Running the example

Once built, to see the options that are available in the previous source code, just run:

$ ./tutorial-mb-generic-tracker --help

Usage: ./tutorial-mb-generic-tracker [--video <video name>] [--model <model name>] [--tracker <0=edge|1=keypoint|2=hybrid>] [--help] [-h]

By default, the model/teabox/teabox.mp4 video and the model/teabox/teabox.cao model are used as input. Using the "--tracker" option, you can specify which tracker has to be used:

Using "--tracker 0" to track only moving-edges:
$ ./tutorial-mb-generic-tracker --tracker 0
Using "--tracker 1" to track only keypoints:
$ ./tutorial-mb-generic-tracker --tracker 1
Using "--tracker 2" to track moving-edges and keypoints in a hybrid scheme:
$ ./tutorial-mb-generic-tracker --tracker 2

With this example it is also possible to work on another data set using the "--video" and "--model" command line options. For example, if you run:

$ ./tutorial-mb-generic-tracker --video <path1>/myvideo%04d.png --model <path2>/myobject.cao

it means that the following images will be used as input:

<path1>/myvideo0001.png
<path1>/myvideo0002.png
...
<path1>/myvideo0009.png
<path1>/myvideo0010.png
...

and that in <path2> you have the following data:

myobject.init: The coordinates of at least four 3D points used for the initialization.
myobject.cao: The CAD model of the object to track.
myobject.ppm: An optional image that shows where the user has to click the points defined in myobject.init. Supported image formats are ppm, pgm, png and jpeg.
myobject.xml: An optional file that contains the tracker parameters that are specific to the image sequence, as well as the camera intrinsic parameters obtained by calibration (see Tutorial: Camera intrinsic calibration). This file is handled in tutorial-mb-generic-tracker-full.cpp but not in tutorial-mb-generic-tracker.cpp. That's why, if your images were acquired by another camera than the one used for the default teabox video, you have to set your camera intrinsic parameters in the tutorial-mb-generic-tracker.cpp source code by modifying the line:
cam.initPersProjWithoutDistortion(839, 839, 325, 243);
and build again before using the "--model ..." command line option.
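
Coming back to the "--video" option: the generic image sequence name is a printf-style pattern in which a four digit, zero padded frame number is written as %04d. The sketch below, in plain C++, shows how such a pattern expands into successive file names. The helper name frameName is hypothetical and the code is not part of the video reader.

```cpp
#include <cstdio>
#include <string>

// Expand a printf-style image name pattern, such as "myvideo%04d.png",
// for a given frame index. The pattern is expected to contain exactly
// one integer conversion.
std::string frameName(const std::string &pattern, int index)
{
  char buffer[512];
  std::snprintf(buffer, sizeof(buffer), pattern.c_str(), index);
  return std::string(buffer);
}
```

For instance, frameName("myvideo%04d.png", 1) gives "myvideo0001.png" and frameName("myvideo%04d.png", 10) gives "myvideo0010.png".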

Source code explained

Hereafter is the description of some of the lines introduced in the previous example.

First we include the header of the generic tracker.

#include <visp3/mbt/vpMbGenericTracker.h>

The tracker uses image I and the intrinsic camera parameters cam as input.

vpImage<unsigned char> I;

vpCameraParameters cam;

As output, it estimates cMo, the pose of the object in the camera frame.

vpHomogeneousMatrix cMo;

Once the first image of the input video is loaded in I, a window is created and initialized with image I. Then we create an instance of the tracker depending on the "--tracker" command line option.

    vpMbGenericTracker tracker;
    if (opt_tracker == 0)
      tracker.setTrackerType(vpMbGenericTracker::EDGE_TRACKER);
#ifdef VISP_HAVE_MODULE_KLT
    else if (opt_tracker == 1)
      tracker.setTrackerType(vpMbGenericTracker::KLT_TRACKER);
    else
      tracker.setTrackerType(vpMbGenericTracker::EDGE_TRACKER | vpMbGenericTracker::KLT_TRACKER);
#else
    else {
      std::cout << "klt and hybrid model-based tracker are not available since visp_klt module is not available. "
                   "In CMakeGUI turn visp_klt module ON, configure and build ViSP again."
                << std::endl;
      return 0;
    }
#endif

Then the corresponding tracker settings are initialized. More details are given in Tracker settings section.

    if (opt_tracker == 0 || opt_tracker == 2) {
      vpMe me;
      me.setMaskSize(5);
      me.setMaskNumber(180);
      me.setRange(8);
      me.setThreshold(10000);
      me.setMu1(0.5);
      me.setMu2(0.5);
      me.setSampleStep(4);
      tracker.setMovingEdge(me);
    }
#ifdef VISP_HAVE_MODULE_KLT
    if (opt_tracker == 1 || opt_tracker == 2) {
      vpKltOpencv klt_settings;
      klt_settings.setMaxFeatures(300);
      klt_settings.setWindowSize(5);
      klt_settings.setQuality(0.015);
      klt_settings.setMinDistance(8);
      klt_settings.setHarrisFreeParameter(0.01);
      klt_settings.setBlockSize(3);
      klt_settings.setPyramidLevels(3);
      tracker.setKltOpencv(klt_settings);
      tracker.setKltMaskBorder(5);
    }
#endif
    cam.initPersProjWithoutDistortion(839, 839, 325, 243);
    tracker.setCameraParameters(cam);

Now we are ready to load the cad model of the object. ViSP supports cad models in cao format or in vrml format. The cao format is a particular format only supported by ViSP. Unlike the vrml format, which requires the Coin 3rd party, it doesn't require any additional 3rd party. We load the cad model in cao format from the teabox.cao file, whose complete description is provided in teabox.cao example, with:

tracker.loadModel(objectname + ".cao");

It is also possible to modify the code to load the cad model in vrml format from teabox.wrl file described in teabox.wrl example. To this end modify the previous line with:

tracker.loadModel(objectname + ".wrl");

Once the model of the object to track is loaded, the next line enables the display of additional overlay drawings in the image window, such as the moving edges positions:

tracker.setDisplayFeatures(true);

Now we have to initialize the tracker. With the next line we choose to use a user interaction (see Initialization by user click).

tracker.initClick(I, objectname + ".init", true);

Next, in the while loop that runs until the end of the video, after acquiring and displaying the next image, we track the object in the new image I.

tracker.track(I);

The result of the tracking is a pose cMo that can be obtained by:

tracker.getPose(cMo);

The next lines are used first to retrieve the camera parameters used by the tracker, then to display the visible part of the cad model using red lines with 2 as thickness, and finally to display the object frame at the estimated position cMo. Each axis of the frame is 0.025 meters long. Using vpColor::none indicates that the x-axis is displayed in red, the y-axis in green, and the z-axis in blue. The thickness of the axes is 3.

tracker.getCameraParameters(cam);

tracker.display(I, cMo, cam, vpColor::red, 2);

vpDisplay::displayFrame(I, cMo, cam, 0.025, vpColor::none, 3);

The last line frees the memory allocated for the display. The tracker itself is a local object and is destroyed automatically:

delete display;

Tracker CAD model

ViSP model-based tracker supports two different ways to describe CAD models, either in cao or in vrml format.

cao format is specific to ViSP. It allows to describe the CAD model of an object using a text file with extension .cao.
vrml format is supported only if Coin 3rd party is installed. This format allows to describe the CAD model of an object using a text file with extension .wrl.

To load a CAD model, the vpMbGenericTracker::loadModel() function can be used to load either a cao model:

vpMbGenericTracker tracker;

tracker.loadModel("teabox.cao");

or a vrml model:

tracker.loadModel("teabox.wrl");

teabox.cao example

The content of the file teabox.cao used in the getting started Example source code but also in tutorial-mb-edge-tracker.cpp and in tutorial-mb-hybrid-tracker.cpp examples is given here:

 V1
 # 3D Points
 8                  # Number of points
 0     0      0     # Point 0: X Y Z
 0     0     -0.08
 0.165 0     -0.08
 0.165 0      0
 0.165 0.068  0
 0.165 0.068 -0.08
 0     0.068 -0.08
 0     0.068  0     # Point 7
 # 3D Lines
 0                  # Number of lines
 # Faces from 3D lines
 0                  # Number of faces
 # Faces from 3D points
 6                  # Number of faces
 4 0 1 2 3          # Face 0: [number of points] [index of the 3D points]...
 4 1 6 5 2
 4 4 5 6 7
 4 0 3 4 7
 4 5 4 3 2
 4 0 7 6 1          # Face 5
 # 3D cylinders
 0                  # Number of cylinders
 # 3D circles
 0                  # Number of circles

This file describes the model of the tea box corresponding to the next image:

Index of the vertices used to model the tea box in cao format.

We make the choice to describe the faces of the box from the 3D points that correspond to the vertices. We provide now a line by line description of the file. Notice that the characters after the '#' are considered as comments.

line 1: Header of the .cao file
line 3: The model is defined by 8 3D points. Here the 8 points correspond to the 8 vertices of the tea box presented in the previous figure. Thus, next 8 lines define the 3D points coordinates.
line 4: 3D point with coordinate (0,0,0) corresponding to vertex 0 of the tea box. This point is also the origin of the frame in which all the 3D points are defined.
line 5: 3D point with coordinate (0,0,-0.08) corresponding to vertex 1.
line 6 to 11: The other 3D points corresponding to vertices 2 to 7 respectively.
line 13: Number of 3D lines defined from two 3D points. It is possible to introduce 3D lines and then use these lines to define faces from these 3D lines. This is particularly useful to define faces from non-closed polygons. For instance, it can be used to specify the tracking of only 3 edges of a rectangle. Notice also that a 3D line that doesn't belong to a face is always visible and consequently always tracked.
line 15: Number of faces defined from 3D lines. In our teabox example we decide to define all the faces from 3D points, that is why this value is set to 0.
line 17: The number of faces defined by a set of 3D points. Here our teabox has 6 faces. Thus, the next 6 lines describe each face from the 3D points defined previously on lines 4 to 11. Notice here that all the faces defined from 3D points correspond to closed polygons.
line 18: First face defined by 4 3D points, respectively vertices 0,1,2,3. The orientation of the face is counter clockwise by going from vertex 0 to vertex 1, then 2 and 3. This fixes the orientation of the normal of the face going outside the object.
line 19: Second face also defined by 4 points, respectively vertices 1,6,5,2 to have a counter clockwise orientation.
line 20 to 23: The four other faces of the box.
line 25: Number of 3D cylinders describing the model. Since we model a simple box, the number of cylinders is 0.
line 27: Number of 3D circles describing the model. For the same reason, the number of circles is 0.
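
The counter clockwise convention described for lines 18 to 23 can be checked numerically. The sketch below, in plain C++, copies the vertices and faces of teabox.cao and tests that the normal obtained from the vertex order of each face points away from the box. The names used (teaboxPoints, teaboxFaces, normalPointsOutward) are hypothetical and the code is independent of the ViSP model loader.

```cpp
#include <array>
#include <vector>

using Vec3 = std::array<double, 3>;

Vec3 sub(const Vec3 &a, const Vec3 &b) { return {a[0] - b[0], a[1] - b[1], a[2] - b[2]}; }
Vec3 cross(const Vec3 &a, const Vec3 &b)
{
  return {a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]};
}
double dot(const Vec3 &a, const Vec3 &b) { return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]; }

// Vertices 0 to 7 and faces 0 to 5 copied from teabox.cao.
const std::vector<Vec3> teaboxPoints = {
    {0.0, 0.0, 0.0},       {0.0, 0.0, -0.08},      {0.165, 0.0, -0.08},  {0.165, 0.0, 0.0},
    {0.165, 0.068, 0.0},   {0.165, 0.068, -0.08},  {0.0, 0.068, -0.08},  {0.0, 0.068, 0.0}};
const std::vector<std::vector<int> > teaboxFaces = {
    {0, 1, 2, 3}, {1, 6, 5, 2}, {4, 5, 6, 7}, {0, 3, 4, 7}, {5, 4, 3, 2}, {0, 7, 6, 1}};

// Returns true when the normal built from the first three vertices of the face,
// taken in the order they are listed, points away from a point located inside the object.
bool normalPointsOutward(const std::vector<Vec3> &pts, const std::vector<int> &face, const Vec3 &inside)
{
  const Vec3 n = cross(sub(pts[face[1]], pts[face[0]]), sub(pts[face[2]], pts[face[1]]));
  return dot(n, sub(pts[face[0]], inside)) > 0.0;
}
```

Using the box center (0.0825, 0.034, -0.04) as the inside point, the test succeeds for the six faces of teabox.cao, while a face listed in the reverse order, such as 3 2 1 0, fails it.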

teabox-triangle.cao example

The content of the file teabox-triangle.cao used in the tutorial-mb-klt-tracker.cpp example is given here:

 V1
 # 3D Points
 8                  # Number of points
 0     0      0     # Point 0: X Y Z
 0     0     -0.08
 0.165 0     -0.08
 0.165 0      0
 0.165 0.068  0
 0.165 0.068 -0.08
 0     0.068 -0.08
 0     0.068  0     # Point 7
 # 3D Lines
 0                  # Number of lines 
 # Faces from 3D lines
 0                  # Number of faces
 # Faces from 3D points
 12                 # Number of faces
 3 0 1 2            # Face 0: [number of points] [index of the 3D points]...
 3 0 2 3
 3 0 3 7
 3 3 4 7
 3 4 5 6 
 3 4 6 7
 3 1 6 5 
 3 1 5 2
 3 5 3 2
 3 5 4 3
 3 7 6 1
 3 7 1 0            # Face 11
 # 3D cylinders
 0                  # Number of cylinders
 # 3D circles
 0                  # Number of circles

This file describes the model of the tea box corresponding to the next image:

Index of the vertices used to model the tea box in cao format with triangles.

Until line 15, the content of this file is similar to the one described in teabox.cao example. On line 17 we specify that the model contains 12 faces. Each face is then described as a triangle.

Note: Since some lines of the model (for example the one between points 0 and 2, or 7 and 3...) don't correspond to teabox edges, this CAD model is not suited for moving-edges and hybrid trackers.
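
To see which triangle sides are concerned by this note, the following plain C++ sketch lists the sides of the triangles of teabox-triangle.cao that are not edges of the box. It relies on a property that holds only for an axis-aligned box like this one: a real edge joins two vertices that differ along exactly one axis. All names are hypothetical and the code is not part of ViSP.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <set>
#include <utility>
#include <vector>

using Vec3 = std::array<double, 3>;

// Vertices and triangles copied from teabox-triangle.cao.
const std::vector<Vec3> boxPoints = {
    {0.0, 0.0, 0.0},       {0.0, 0.0, -0.08},      {0.165, 0.0, -0.08},  {0.165, 0.0, 0.0},
    {0.165, 0.068, 0.0},   {0.165, 0.068, -0.08},  {0.0, 0.068, -0.08},  {0.0, 0.068, 0.0}};
const std::vector<std::array<int, 3> > boxTriangles = {
    {0, 1, 2}, {0, 2, 3}, {0, 3, 7}, {3, 4, 7}, {4, 5, 6}, {4, 6, 7},
    {1, 6, 5}, {1, 5, 2}, {5, 3, 2}, {5, 4, 3}, {7, 6, 1}, {7, 1, 0}};

// Valid for an axis-aligned box only: a box edge changes exactly one coordinate.
bool isBoxEdge(const Vec3 &a, const Vec3 &b)
{
  int differing = 0;
  for (int i = 0; i < 3; ++i)
    if (std::fabs(a[i] - b[i]) > 1e-9)
      ++differing;
  return differing == 1;
}

// Collect the triangle sides that are not box edges, each one reported once
// as a (smallest index, largest index) pair.
std::set<std::pair<int, int> > findDiagonals(const std::vector<Vec3> &pts,
                                             const std::vector<std::array<int, 3> > &triangles)
{
  std::set<std::pair<int, int> > diagonals;
  for (const auto &t : triangles) {
    for (int k = 0; k < 3; ++k) {
      const int i = t[k];
      const int j = t[(k + 1) % 3];
      if (!isBoxEdge(pts[i], pts[j]))
        diagonals.insert(std::make_pair(std::min(i, j), std::max(i, j)));
    }
  }
  return diagonals;
}
```

Applied to the model above, findDiagonals() returns six sides, one per box face, among which are (0, 2) and (3, 7), the two examples given in the note. These sides have no counterpart in the image contours, which is why moving edges cannot be tracked reliably along them.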

teabox.wrl example

The content of the teabox.wrl file used in tutorial-mb-generic-tracker-full.cpp and tutorial-mb-edge-tracker.cpp when teabox.cao is missing is given hereafter. This content should be compared with teabox.cao described in teabox.cao example. As for the cao format, teabox.wrl first describes the vertices of the box, then the faces defined from these vertices.

 #VRML V2.0 utf8
 
 DEF fst_0 Group {
 children [
 
 # Object "teabox"
 Shape {
 
 geometry DEF cube IndexedFaceSet {
 
 coord Coordinate { 
 point [
 0     0      0   ,
 0     0     -0.08,
 0.165 0     -0.08,
 0.165 0      0   ,
 0.165 0.068  0   ,
 0.165 0.068 -0.08,
 0     0.068 -0.08,
 0     0.068  0    ]
 }
 
 coordIndex [
  0,1,2,3,-1,
  1,6,5,2,-1,
  4,5,6,7,-1,
  0,3,4,7,-1,
  5,4,3,2,-1,
  0,7,6,1,-1]}
 }
 
 ]
 }

This file describes the model of the tea box corresponding to the next image:

Index of the vertices used to model the tea box in vrml format.

We provide now a line by line description of the file where the faces of the box are defined from the vertices:

line 1 to 10: Header of the .wrl file.
line 13 to 20: 3D coordinates of the 8 tea box vertices.
line 24 to 29: Each line describes a face. In this example, a face is defined by 4 vertices. For example, the first face joins vertices 0,1,2,3. The orientation of the face is counter clockwise by going from vertex 0 to vertex 1, then 2 and 3. This fixes the orientation of the normal of the face going outside the object.

Tracker initialization

There are two ways to initialize the tracker: either by user interaction, or by using an initial pose provided by a specific algorithm.

Initialization by user click

The tracker can be initialized by the user, who has to click on at least 4 points on the object seen in the image. To this end, the vpMbGenericTracker::initClick() function has to be used.

tracker.initClick(I, objectname + ".init", true);

The previous line of code loads a file named "<objectname>.init" and waits for user clicks. When an image named "<objectname>.ppm" exists beside "<objectname>.init" and when the last parameter of the function is set to true, the image is displayed to help the user to know where to click. Supported image formats are .ppm, .pgm, .png, .jpeg and .jpg.

Let us consider the teabox example.

The user has to click in the image on four vertices with their 3D coordinates defined in the "teabox.init" file. The following image "teabox.ppm" shows where the user has to click.

Image "teabox.ppm" used to help the user to initialize the tracker.

Matched 2D and 3D coordinates are then used to compute an initial pose used to initialize the tracker. Note also that the third optional argument "true" is used here to enable the display of an image that may help the user for the initialization. The name of this image is the same as the "*.init" file except the extension that should be ".ppm". In our case it will be "teabox.ppm".

The content of teabox.init file that defines 3D coordinates of some points of the model used during user initialization is provided hereafter. Note that all the characters after character '#' are considered as comments.

 4                  # Number of points
 0     0      0     # Point 0
 0.165 0      0     # Point 3
 0.165 0     -0.08  # Point 2
 0.165 0.068 -0.08  # Point 5

We now give the meaning of each line of this file:

line 1: Number of 3D points that are defined in this file. At least 4 points are required. Four is the minimal number of points required to compute a pose.
line 2: Each point is defined by its 3D coordinates. Here we define the first point with coordinates (0,0,0). In the previous figure it corresponds to vertex 0 of the tea box. This point is also the origin of the frame in which all the points are defined.
line 3: 3D coordinates of vertex 3.
line 4: 3D coordinates of vertex 2.
line 5: 3D coordinates of vertex 5.

Here the user has to click on vertex 0, 3, 2 and 5 in the window that displays image I. From the 3D coordinates defined in teabox.init and the corresponding 2D coordinates of the vertices obtained by user interaction a pose is computed that is then used to initialize the tracker.

How to choose good points for manual initialization

To select which 3D points to put in the .init file used to manually initialize the tracker, you should ensure that:

The projections of the points in the image are visible (we usually use four 3D points, but the .init file can contain more).
The projections of the 3D points are spread as widely as possible in the image (i.e., they should not be concentrated in a very small part of the image). The points should not be aligned, and they should not all lie in the same plane when the object is non planar.
Usually, we copy/paste the coordinates of the 3D points from the .cao file.
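
The "not in the same plane" recommendation can be verified before writing the .init file. The sketch below, in plain C++, tests four 3D points with a scalar triple product: the product is zero when the four points are coplanar, which also covers the case where three of them are aligned. The function name fourPointsCoplanar and the tolerance are assumptions chosen for illustration.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Returns true when the four points lie in the same plane, up to the tolerance eps.
// The scalar triple product (p1-p0) . ((p2-p0) x (p3-p0)) is six times the volume
// of the tetrahedron formed by the points, so it vanishes for coplanar points.
bool fourPointsCoplanar(const Vec3 &p0, const Vec3 &p1, const Vec3 &p2, const Vec3 &p3, double eps = 1e-12)
{
  const Vec3 a = {p1[0] - p0[0], p1[1] - p0[1], p1[2] - p0[2]};
  const Vec3 b = {p2[0] - p0[0], p2[1] - p0[1], p2[2] - p0[2]};
  const Vec3 c = {p3[0] - p0[0], p3[1] - p0[1], p3[2] - p0[2]};
  const Vec3 bxc = {b[1] * c[2] - b[2] * c[1], b[2] * c[0] - b[0] * c[2], b[0] * c[1] - b[1] * c[0]};
  const double triple = a[0] * bxc[0] + a[1] * bxc[1] + a[2] * bxc[2];
  return std::fabs(triple) < eps;
}
```

With the four points of teabox.init (vertices 0, 3, 2 and 5) the function returns false, so this selection spans two faces of the box. With vertices 0, 3, 2 and 1, which all belong to the face y = 0, it returns true.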

Initialization by external pose

The other way to initialize the tracker is to use an initial pose provided by an external algorithm. To this end, vpMbGenericTracker::initFromPose() function has to be used.

tracker.initFromPose(I, cMo);

Here the initial pose cMo is a vpHomogeneousMatrix object.

There are several ways to get an initial pose:

by using a fiducial marker such as an AprilTag that can be detected online and whose pose can serve as initialization; see Tutorial: Markerless generic model-based tracking using AprilTag for initialization (use case)
when the object is textured, by learning the keypoint descriptors located on visible faces; see Tutorial: Object detection and localization
by using advanced deep learning algorithms...

Tracker settings

Settings from an XML file

Instead of setting the tracker parameters from source code, it is possible to specify the settings from an XML file. As done in tutorial-mb-generic-tracker-full.cpp example, to read the parameters from an XML file, simply modify the code like:

    if (vpIoTools::checkFilename(objectname + ".xml")) {
      tracker.loadConfigFile(objectname + ".xml");
      usexml = true;
    }

The content of the XML file teabox.xml that is considered by default is the following:

<?xml version="1.0"?>
<conf>
  <ecm>
    <mask>
      <size>5</size>
      <nb_mask>180</nb_mask>
    </mask>
    <range>
      <tracking>8</tracking>
    </range>
    <contrast>
      <edge_threshold>10000</edge_threshold>
      <mu1>0.5</mu1>
      <mu2>0.5</mu2>
    </contrast>
    <sample>
      <step>4</step>
    </sample>
  </ecm>
  <klt>
    <mask_border>5</mask_border> 
    <max_features>300</max_features> 
    <window_size>5</window_size> 
    <quality>0.015</quality> 
    <min_distance>8</min_distance> 
    <harris>0.01</harris>
    <size_block>3</size_block> 
    <pyramid_lvl>3</pyramid_lvl> 
  </klt>
  <camera>
    <u0>325.66776</u0> 
    <v0>243.69727</v0> 
    <px>839.21470</px> 
    <py>839.44555</py> 
  </camera>
  <face>
    <angle_appear>70</angle_appear> 
    <angle_disappear>80</angle_disappear> 
    <near_clipping>0.1</near_clipping>
    <far_clipping>100</far_clipping>
    <fov_clipping>1</fov_clipping>
  </face>
</conf>

Depending on the visual features that are used, not all the XML tags are needed:

<ecm> tag corresponds to the moving-edges settings.
<klt> tag corresponds to the keypoint visual features and especially the KLT tracker settings used to detect and track the keypoints.
<camera> tag is used to define the camera intrinsic parameters
<face> tag is used by the visibility algorithm used to determine if a face of the object is visible or not.

Moving-edges settings

Moving-edges settings affect the way the visible edges of an object are tracked. These settings could be tuned either from XML using <ecm> tag as:

<conf>
  ...
  <ecm>
    <mask>
      <size>5</size>
      <nb_mask>180</nb_mask>
    </mask>
    <range>
      <tracking>8</tracking>
    </range>
    <contrast>
      <edge_threshold>10000</edge_threshold>
      <mu1>0.5</mu1>
      <mu2>0.5</mu2>
    </contrast>
    <sample>
      <step>4</step>
    </sample>
  </ecm>
  ...
</conf>

or from source code using the vpMbGenericTracker::setMovingEdge() method:

        vpMe me;
        me.setMaskSize(5);
        me.setMaskNumber(180);
        me.setRange(8);
        me.setThreshold(10000);
        me.setMu1(0.5);
        me.setMu2(0.5);
        me.setSampleStep(4);
        tracker.setMovingEdge(me);

Either from xml or from the previous source code you can set:

mask_size: defines the size of the convolution mask used to detect an edge.
nb_mask: number of masks applied to determine the object contour. The number of masks determines the precision of the normal of the edge for every sample. If the precision is 2 degrees, then there are 360/2 = 180 masks.
range_tracking: range on both sides of the reference pixel along the normal of the contour used to track a moving-edge. If the displacement of the tracked object in two successive images is large, you have to increase this parameter.
edge_threshold: likelihood threshold used to determine if the moving edge is valid or not.
mu1: minimum image contrast allowed to detect a contour.
mu2: maximum image contrast allowed to detect a contour.
sample_step: minimum distance in pixel between two discretized moving-edges. To increase the number of moving-edges you have to reduce this parameter.

Note: Most important parameters are range_tracking and sample_step.
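
To give an intuition of the role of range_tracking, here is a deliberately simplified 1D sketch in plain C++. It searches for the strongest intensity step along a profile sampled on the edge normal, within a given range around the previous edge position. This is not the algorithm implemented in ViSP, which relies on oriented convolution masks and a likelihood test; the function name searchEdge and the minContrast parameter are hypothetical.

```cpp
#include <cmath>
#include <vector>

// profile:     gray levels sampled along the normal to the contour
// reference:   index of the edge position found in the previous image
// range:       number of samples explored on each side of the reference
// minContrast: smallest intensity step accepted as an edge (hypothetical parameter)
// Returns the index of the strongest step inside the search range, or -1 if none is accepted.
int searchEdge(const std::vector<double> &profile, int reference, int range, double minContrast)
{
  int best = -1;
  double bestResponse = 0.0;
  const int size = static_cast<int>(profile.size());
  for (int i = reference - range; i <= reference + range; ++i) {
    if (i < 1 || i >= size)
      continue; // sample i - 1 or sample i falls outside the profile
    const double response = std::fabs(profile[i] - profile[i - 1]);
    if (response >= minContrast && response > bestResponse) {
      best = i;
      bestResponse = response;
    }
  }
  return best;
}
```

If the true edge has moved by 3 samples since the previous image, a range of 8 finds it while a range of 2 misses it. This is the reason why the range has to be increased when the object displacement between two successive images is large, at the price of a longer search.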

Keypoints settings

Keypoint settings affect tracking of keypoints on visible faces using KLT. These settings could be tuned either from XML using <klt> tag as:

<conf>
  ...
  <klt>
    <mask_border>5</mask_border> 
    <max_features>300</max_features> 
    <window_size>5</window_size> 
    <quality>0.015</quality> 
    <min_distance>8</min_distance> 
    <harris>0.01</harris>
    <size_block>3</size_block> 
    <pyramid_lvl>3</pyramid_lvl> 
  </klt>
  ...
</conf>

or from source code using the vpMbGenericTracker::setKltOpencv() and vpMbGenericTracker::setKltMaskBorder() methods:

        vpKltOpencv klt_settings;
        klt_settings.setMaxFeatures(300);
        klt_settings.setWindowSize(5);
        klt_settings.setQuality(0.015);
        klt_settings.setMinDistance(8);
        klt_settings.setHarrisFreeParameter(0.01);
        klt_settings.setBlockSize(3);
        klt_settings.setPyramidLevels(3);
        tracker.setKltOpencv(klt_settings);
        tracker.setKltMaskBorder(5);

With the previous parameters you can set:

mask_border: erosion number of the mask used on each face.
max_features: maximum number of keypoint features to track in the image.
window_size: window size used to refine the corner locations.
quality: parameter characterizing the minimal accepted quality of image corners. Corners with quality measure less than this parameter are rejected. This means that if you want to have more keypoints on a face, you have to reduce this parameter.
min_distance: minimal Euclidean distance between detected corners during keypoint detection stage used to initialize keypoint location.
harris: free parameter of the Harris detector.
size_block: size of the averaging block used to track the keypoint features.
pyramid_lvl: maximal pyramid level. If the level is zero, then no pyramid is computed for the optical flow.

Note: Most important parameters are min_distance and quality.
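
To make the interplay between quality, min_distance and max_features more concrete, here is a minimal stand-alone sketch of a greedy corner selection, written in the spirit of what corner detectors such as OpenCV's goodFeaturesToTrack() do. It is not the ViSP or OpenCV implementation: the Corner structure and the selectCorners() function are hypothetical, and the rule that quality is relative to the strongest corner is an assumption borrowed from OpenCV's convention.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Corner {
  double x, y;     // pixel coordinates
  double response; // corner strength (for instance a Harris score)
};

// Keep the strongest corners that are above quality * best_response,
// at least min_distance pixels apart, and at most max_features in number.
inline std::vector<Corner> selectCorners(std::vector<Corner> candidates, double quality,
                                         double min_distance, std::size_t max_features)
{
  assert(quality > 0.0 && quality <= 1.0 && min_distance >= 0.0);
  std::vector<Corner> kept;
  if (candidates.empty()) {
    return kept;
  }
  // Strongest corners first
  std::sort(candidates.begin(), candidates.end(),
            [](const Corner &a, const Corner &b) { return a.response > b.response; });
  const double threshold = quality * candidates.front().response;
  for (const Corner &c : candidates) {
    if (kept.size() >= max_features || c.response < threshold) {
      break; // remaining candidates are weaker, so we can stop
    }
    bool far_enough = true;
    for (const Corner &k : kept) {
      if (std::hypot(c.x - k.x, c.y - k.y) < min_distance) {
        far_enough = false;
        break;
      }
    }
    if (far_enough) {
      kept.push_back(c);
    }
  }
  return kept;
}
```

In this sketch, lowering quality or min_distance lets more candidates through, which matches the advice given above for getting more keypoints on a face.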

Camera settings

Camera settings correspond to the intrinsic camera parameters without distortion. If images are acquired by a camera with a large field of view that introduces distortion, images need to be undistorted before being processed by the tracker. The camera parameters are then those obtained on the undistorted images.

Camera settings could be set from XML using <camera> tag as:

<conf>
  ...
  <camera>
    <u0>325.66776</u0> 
    <v0>243.69727</v0> 
    <px>839.21470</px> 
    <py>839.44555</py> 
  </camera>
  ...
</conf>

or from source code using the vpMbTracker::setCameraParameters() method:

vpCameraParameters cam;
cam.initPersProjWithoutDistortion(839.21470, 839.44555, 325.66776, 243.69727);

tracker.setCameraParameters(cam);

As described in the vpCameraParameters class, these parameters correspond to $(p_x, p_y)$, the ratio between the focal length and the size of a pixel, and $(u_0, v_0)$, the coordinates of the principal point in pixels.

Note: The Tutorial: Camera intrinsic calibration explains how to obtain these parameters from a camera calibration stage.
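
To illustrate the role of $(p_x, p_y)$ and $(u_0, v_0)$, the following stand-alone sketch applies the perspective projection model without distortion to a 3D point expressed in the camera frame. The Intrinsics and Pixel structures and the two functions are hypothetical helpers introduced for this example only; in ViSP the same conversion is handled by vpCameraParameters and vpMeterPixelConversion.

```cpp
#include <cassert>

struct Intrinsics {
  double px, py; // ratio between the focal length and the pixel size
  double u0, v0; // principal point in pixels
};

struct Pixel {
  double u, v;
};

// Convert normalized image-plane coordinates (x = X/Z, y = Y/Z) into pixels.
inline Pixel meterToPixel(const Intrinsics &cam, double x, double y)
{
  return { cam.u0 + cam.px * x, cam.v0 + cam.py * y };
}

// Project a 3D point (X, Y, Z) given in the camera frame, in meters.
inline Pixel projectPoint(const Intrinsics &cam, double X, double Y, double Z)
{
  assert(Z > 0.0); // the point has to be in front of the camera
  return meterToPixel(cam, X / Z, Y / Z);
}
```

With the intrinsic parameters used above, a point located on the optical axis projects onto the principal point (325.66776, 243.69727).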

Visibility settings

An important setting concerns the visibility test that is used to determine if a face is visible or not. Note that moving-edges and keypoints are only tracked on visible faces. Three different visibility tests are implemented: with or without Ogre ray tracing, and with or without scanline rendering. The default test is the one without Ogre and without scanline. The functions vpMbTracker::setOgreVisibilityTest() and vpMbTracker::setScanLineVisibilityTest() let you select which test to use.

Default visibility based on normals

Let us now highlight how the default visibility test works. As illustrated in the following figure, the angle $\alpha$ between the normal of the face and the line going from the camera to the center of gravity of the face is used to determine if the face is visible.

Principle of the visibility test used to determine if a face is visible.

When no advanced visibility test is enabled (we recall that this is the default behavior), the algorithm that computes the normal of the face is very simple. It makes the assumption that faces are convex and oriented counter clockwise. Two parameters are considered: the angle $\alpha_{appear}$ used to determine if a face is appearing, and the angle $\alpha_{disappear}$ used to determine if a face is disappearing. A face is considered as visible if $\alpha \leq \alpha_{disappear}$, and a new face is considered as appearing if $\alpha \leq \alpha_{appear}$. These two parameters can be set either in the XML file:

<conf>
  ...
  <face>
    <angle_appear>70</angle_appear> 
    <angle_disappear>80</angle_disappear> 
  </face>
  ...
</conf>

or in the code:

tracker.setAngleAppear(vpMath::rad(70));

tracker.setAngleDisappear(vpMath::rad(80));

Here the face is considered as appearing if $\alpha \leq 70$ degrees, and disappearing if $\alpha > 80$ degrees.

Note: When these two angle parameters are not set, their default value of 89 degrees is used.
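
The two thresholds act as a hysteresis. A minimal stand-alone sketch of our reading of the rule above could look as follows. The Vec3 structure and the faceAngle() and updateVisibility() functions are hypothetical and not part of ViSP, and the sign convention (angle measured between the face normal and the direction from the face center towards the camera, with the camera at the origin) is an assumption made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 {
  double x, y, z;
};

inline double dot(const Vec3 &a, const Vec3 &b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline double norm(const Vec3 &a) { return std::sqrt(dot(a, a)); }

// Angle (in radians) between the face normal and the direction going from the
// face center of gravity to the camera, both expressed in the camera frame.
inline double faceAngle(const Vec3 &normal, const Vec3 &centroid)
{
  const Vec3 to_camera{ -centroid.x, -centroid.y, -centroid.z };
  const double denom = norm(normal) * norm(to_camera);
  assert(denom > 0.0);
  const double c = std::max(-1.0, std::min(1.0, dot(normal, to_camera) / denom));
  return std::acos(c);
}

// A visible face stays visible while alpha <= angle_disappear;
// a hidden face becomes visible only once alpha <= angle_appear.
inline bool updateVisibility(bool was_visible, double alpha, double angle_appear, double angle_disappear)
{
  return was_visible ? (alpha <= angle_disappear) : (alpha <= angle_appear);
}
```

With 70 and 80 degrees as above, a face seen under 75 degrees remains visible if it was already tracked, but is not yet considered as appearing if it was hidden.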

Advanced visibility via Ogre3D

The Ogre3D visibility test is based on ray tracing. When this test is enabled, the algorithm used to determine the visibility of a face performs, in addition to the previous test based on normals (i.e., on the faces found visible by that test), another test which sets the faces that are partially occluded as non-visible. It can be enabled via:

tracker.setOgreVisibilityTest(true);

Ogre visibility test on both polygons.

When using the classical version of the Ogre visibility test (which is the default behavior when activating this test), only one ray is used per polygon to test its visibility. As shown in the figure above, this single ray is sent from the camera to the center of gravity of the considered polygon. If the ray intersects another polygon before the considered one, the considered polygon is set as non-visible. Intersections are computed between the ray and the axis-aligned bounding box (AABB) of each polygon. In the figure above, the ray associated with the first polygon first intersects the AABB of the second polygon, so the first polygon is considered as occluded. As a result, only the second polygon will be used during the tracking phase. This means that when using the edges, only the blue lines will be taken into account, and when using the keypoints, they will be detected only inside the second polygon (blue area).

Additionally, it is also possible to use a statistical approach during the ray tracing phase in order to improve the visibility results.

tracker.setNbRayCastingAttemptsForVisibility(4);

tracker.setGoodNbRayCastingAttemptsRatio(0.70);

Ogre visibility test on the first polygon, using a statistical approach.

Contrary to the classical version of this test, the statistical approach uses multiple rays per polygon (4 in the example above). Each ray is sent randomly toward the considered polygon. If a sufficient ratio of the rays reach the considered polygon without first intersecting another polygon, the polygon is set as visible. In the example above, three rays out of four return the first polygon as visible. As this ratio of good matches (75%) is more than 70% (the ratio chosen in this example), the first polygon is considered as visible, as well as the second one. As a result, all visible blue lines will be taken into account during the tracking phase of the edges, and the keypoints that are detected inside the green area will also be used. Unfortunately, this approach is polygon-based, so the dashed blue lines, which are not visible, will also be used during the tracking phase. In addition, keypoints that are detected inside the overlapping area won't be well associated and can disturb the algorithm.
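
The decision rule of the statistical approach can be summarized by the small stand-alone sketch below. The visibleFromRays() function is a hypothetical helper, not a ViSP function, and the strict comparison is an assumption based on the wording "more than 70%" used in this section.

```cpp
#include <cassert>

// Decide the visibility of a polygon from the number of rays that reached it
// without hitting another polygon first.
inline bool visibleFromRays(unsigned int unobstructed_rays, unsigned int attempts, double good_ratio)
{
  assert(attempts > 0 && unobstructed_rays <= attempts);
  return static_cast<double>(unobstructed_rays) / attempts > good_ratio;
}
```

With 4 attempts and a ratio of 0.70, three unobstructed rays (75%) are enough to declare the polygon visible, while two (50%) are not.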

Note: Since ViSP 3.0.0 we have introduced vpMbTracker::setOgreShowConfigDialog() method that allows to open the Ogre configuration panel which can be used to select the renderer. To enable this feature, use:

tracker.setOgreShowConfigDialog(true);

Advanced visibility via Scanline Rendering

Contrary to the visibility test using Ogre3D, this method does not require any additional third-party library. As for the advanced visibility using Ogre3D, this test is applied in addition to the test based on normals (i.e., on the faces set as visible during that test), and also in addition to the test with Ogre3D if it has been activated. This test is based on the scanline rendering algorithm and can be enabled via:

tracker.setScanLineVisibilityTest(true);

Scanline visibility test on both polygons.

Even if this approach requires a bit more computing power, it is a pixel-perfect visibility test. According to the camera point of view, polygons are decomposed in order to consider only their visible parts. As a result, if we consider the example above, dashed red lines won't be considered during the tracking phase, and detected keypoints will be correctly matched with the closest polygon (in terms of depth from the camera position).

Clipping settings

In addition to the visibility test described above, it is also possible to use clipping. The algorithm first removes the faces that are not visible according to the visibility test used, then it also removes the faces or parts of the faces that are outside the clipping planes. As illustrated in the following figure, different clipping planes can be enabled.

Camera field of view and clipping planes.

Let us consider two plane categories: the ones belonging to the field of view or FOV (Left, Right, Up and Down), and the Near and Far clipping planes. The FOV planes can be enabled by:

tracker.setClipping(tracker.getClipping() | vpMbtPolygon::FOV_CLIPPING);

which is equivalent to:

tracker.setClipping(vpMbtPolygon::LEFT_CLIPPING
                  | vpMbtPolygon::RIGHT_CLIPPING
                  | vpMbtPolygon::UP_CLIPPING
                  | vpMbtPolygon::DOWN_CLIPPING);

Of course, a user who just wants to activate the Right and Up clipping planes can use:

tracker.setClipping(vpMbtPolygon::RIGHT_CLIPPING | vpMbtPolygon::UP_CLIPPING);

For the Near and Far clipping planes it is a bit different, since those planes require clipping distances. There are two choices: either the user keeps the default distances and activates the planes with:

tracker.setClipping(vpMbtPolygon::NEAR_CLIPPING | vpMbtPolygon::FAR_CLIPPING);

or the user can specify the distances in meters, which will automatically activate the clipping for those planes:

tracker.setNearClippingDistance(0.1);

tracker.setFarClippingDistance(100.0);

It is also possible to enable them in the XML file. This is done with the following lines:

<conf>
  ...
  <face>
    ...
    <near_clipping>0.1</near_clipping>
    <far_clipping>100.0</far_clipping>
    <fov_clipping>0</fov_clipping>
  </face>

Here for simplicity, the user just has the possibility to either activate all the FOV clipping planes or none of them (fov_clipping requires a boolean).

Note: When clipping parameters are not set in the XML file, nor in the code, clipping is not used. Usually clipping is not helpful when the object to track is simple.
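
Since the clipping planes are combined with the bitwise OR operator, they behave like bit flags. The stand-alone sketch below illustrates this mechanism with a hypothetical ClippingFlag enumeration whose names mimic those used above. The numeric values are chosen for illustration only and should not be taken as the actual values of the vpMbtPolygon enumeration.

```cpp
#include <cassert>

// Illustrative flags only: the values are an assumption made for this sketch.
enum ClippingFlag : unsigned int {
  NO_CLIPPING = 0u,
  NEAR_CLIPPING = 1u << 0,
  FAR_CLIPPING = 1u << 1,
  LEFT_CLIPPING = 1u << 2,
  RIGHT_CLIPPING = 1u << 3,
  UP_CLIPPING = 1u << 4,
  DOWN_CLIPPING = 1u << 5,
  FOV_CLIPPING = LEFT_CLIPPING | RIGHT_CLIPPING | UP_CLIPPING | DOWN_CLIPPING
};

// Add planes to the current selection without losing the ones already enabled.
inline unsigned int enableClipping(unsigned int current, unsigned int flags) { return current | flags; }

// Remove planes from the current selection.
inline unsigned int disableClipping(unsigned int current, unsigned int flags) { return current & ~flags; }

// Check whether all the bits of a (non-zero) flag are set.
inline bool isClippingEnabled(unsigned int current, unsigned int flag)
{
  assert(flag != 0u);
  return (current & flag) == flag;
}
```

This is why OR-ing a new flag with the value returned by tracker.getClipping() keeps the planes that were already active, while passing only the new flag to setClipping() replaces the whole selection.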

Advanced

How to detect tracking failures

The first way to detect a tracking failure is to catch potential internal exceptions returned by the tracker:

vpMbGenericTracker tracker;
...
while (! end)
{
  bool tracking_failed = false;
  ...
  try {
    tracker.track(I);
  } catch (const vpException &e) {
    std::cout << "Tracker exception: " << e.getStringMessage() << std::endl;
    tracking_failed = true;
  }
  ...
}

If you are using edges as features, you can exploit the internal tracker state using vpMbTracker::getProjectionError() to get a scalar criterion between 0 and 90 degrees corresponding to the CAD model projection error. This criterion corresponds to the mean angle between the gradient direction of the tracked moving-edge features and the normal of the projected CAD model. Thresholding this scalar makes it possible to detect a tracking failure. Usually we consider that a projection error higher than 25 degrees corresponds to a tracking failure. This threshold needs to be adapted to your setup and illumination conditions.

...
while (! end)
{
  ...
  if (! tracking_failed) {
    double proj_error = 0;
    if (tracker.getTrackerType() & vpMbGenericTracker::EDGE_TRACKER) {
      proj_error = tracker.getProjectionError();
    }
    if (proj_error > 25) {
      std::cout << "Tracker needs to restart (projection error detected: " << proj_error << ")" << std::endl;
      tracking_failed = true;
    }
  }
  ...
}
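
To give an intuition of what this criterion measures, the following stand-alone sketch computes a mean angle, folded into [0, 90] degrees, between image gradient directions and model normals. It is a simplified reading of the description above and not the ViSP implementation: the Dir2 structure and both functions are hypothetical, and any weighting or outlier handling done by the tracker is ignored.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Dir2 {
  double x, y; // 2D direction in the image, not necessarily normalized
};

// Angle in degrees between two undirected 2D directions, in [0, 90].
inline double foldedAngleDeg(const Dir2 &a, const Dir2 &b)
{
  const double na = std::hypot(a.x, a.y);
  const double nb = std::hypot(b.x, b.y);
  assert(na > 0.0 && nb > 0.0);
  double c = std::fabs(a.x * b.x + a.y * b.y) / (na * nb);
  if (c > 1.0) {
    c = 1.0; // guard against rounding errors
  }
  return std::acos(c) * 180.0 / 3.14159265358979323846;
}

// Mean folded angle between measured gradients and projected model normals.
inline double meanProjectionErrorDeg(const std::vector<Dir2> &gradients, const std::vector<Dir2> &normals)
{
  assert(gradients.size() == normals.size() && !gradients.empty());
  double sum = 0.0;
  for (std::size_t i = 0; i < gradients.size(); ++i) {
    sum += foldedAngleDeg(gradients[i], normals[i]);
  }
  return sum / static_cast<double>(gradients.size());
}
```

A well aligned model gives a value close to 0, while gradients that are unrelated to the projected model push the mean towards larger angles, which is what the 25 degrees threshold is meant to catch.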

When edges are not tracked, meaning that your tracker uses KLT keypoints or depth features instead, the vpMbGenericTracker::computeCurrentProjectionError() function may be useful.

...
while (! end)
{
  ...
  if (! tracking_failed) {
    double proj_error = 0;
    if (tracker.getTrackerType() & vpMbGenericTracker::EDGE_TRACKER) {
      proj_error = tracker.getProjectionError();
    }
    else {
      tracker.getPose(cMo);
      tracker.getCameraParameters(cam);
      proj_error = tracker.computeCurrentProjectionError(I, cMo, cam);
    }
    if (proj_error > 25) {
      std::cout << "Tracker needs to restart (projection error detected: " << proj_error << ")" << std::endl;
      tracking_failed = true;
    }
  }
  ...
}

Note: The function vpMbTracker::getProjectionError() computes the projection error only from moving edges that are located on visible faces, while vpMbGenericTracker::computeCurrentProjectionError() is not able to distinguish between visible and non-visible faces. Thus, results may be more precise with vpMbTracker::getProjectionError().

The tracker can also display the gradient and model orientations when computing the projection error. To this end use the following:

vpMbGenericTracker tracker;
tracker.setProjectionErrorDisplay(true);
...
while (! end)
{
...
}

Tracking failure detection is used in tutorial-mb-generic-tracker-live.cpp and tutorial-mb-generic-tracker-rgbd-realsense.cpp examples.

The model-based tracker can also update a covariance matrix corresponding to the estimated pose. But from our experience, analysing the diagonal of the 6 by 6 covariance matrix doesn't allow to detect a tracking failure. If you want to have a trial, we recall the way to get the covariance matrix:

vpMbGenericTracker tracker;
tracker.setCovarianceComputation(true);
...
while (! end)
{
  bool tracking_failed = false;
  ...
  try {
    tracker.track(I);
  } catch (const vpException &e) {
    std::cout << "Tracker exception: " << e.getStringMessage() << std::endl;
    tracking_failed = true;
  }
  ...
  vpMatrix covariance = tracker.getCovarianceMatrix();
}

How to manipulate the model

The following code shows how to access the CAD model in order to:

check if a face is visible,
get the name of the face (only with models in .cao format for the moment),
check if the level of detail is enabled or disabled (only with models in .cao format for the moment),
access the coordinates of the 3D points used to model a face,
compute the coordinates of the 3D points in the image from the pose cMo estimated by the tracker.

vpMbHiddenFaces<vpMbtPolygon> &faces = tracker.getFaces();
std::cout << "Number of faces: " << faces.size() << std::endl;
for (unsigned int i=0; i < faces.size(); i++) {
  std::vector<vpMbtPolygon*> &poly = faces.getPolygon();
  std::cout << "face " << i << " with index: " << poly[i]->getIndex()
      << (poly[i]->getName().empty() ? "" : (" with name: " + poly[i]->getName()))
      << " is " << (poly[i]->isVisible() ? "visible" : "not visible")
      << " and has " << poly[i]->getNbPoint() << " points" 
      << " and LOD is " << (poly[i]->useLod ? "enabled" : "disabled") << std::endl;
      
  for (unsigned int j=0; j<poly[i]->getNbPoint(); j++) {
    vpPoint P = poly[i]->getPoint(j);
    P.project(cMo);
    std::cout << " P obj " << j << ": " << P.get_oX() << " " << P.get_oY() << " " << P.get_oZ() << std::endl;
    std::cout << " P cam" << j << ": " << P.get_X() << " " << P.get_Y() << " " << P.get_Z() << std::endl;
    vpImagePoint iP;
    vpMeterPixelConversion::convertPoint(cam, P.get_x(), P.get_y(), iP);
    std::cout << " iP " << j << ": " << iP.get_u() << " " << iP.get_v() << std::endl;
  }
}

Level of detail (LOD)

The level of detail (LOD) consists in introducing additional constraints to the visibility check to determine if the features of a face have to be tracked or not. Two parameters are used

the line length (in pixel)
the area of the face (in pixel²), that could be closed or not (you can define an open face by adding all the segments without the last one which closes the face)

The tracker allows the level of detail to be enabled or disabled using the vpMbTracker::setLod() function. This example applies LOD settings to all elements:

tracker.setLod(true);
tracker.setMinLineLengthThresh(40.0);
tracker.setMinPolygonAreaThresh(500.0);

This example applies LOD settings to specific elements identified by their name:

tracker.setLod(false);
tracker.setLod(true, "Left line");
tracker.setLod(true, "Front face");
tracker.setMinLineLengthThresh(35.0, "Left line");
tracker.setMinPolygonAreaThresh(120.0, "Front face");

Furthermore, to set a name to a face see How to set a name to a face.
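
The two LOD criteria can be sketched as follows on points already projected in the image. This stand-alone code is only meant to illustrate the idea: the Px structure and the functions are hypothetical, the polygon area is computed with the classical shoelace formula, and the use of "greater than or equal" for the comparison with the thresholds is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Px {
  double u, v; // image coordinates in pixels
};

// Length in pixels of a projected line.
inline double segmentLength(const Px &a, const Px &b) { return std::hypot(b.u - a.u, b.v - a.v); }

// Area in pixel^2 of a projected polygon (shoelace formula).
inline double polygonArea(const std::vector<Px> &pts)
{
  assert(pts.size() >= 3);
  double twice_area = 0.0;
  for (std::size_t i = 0; i < pts.size(); ++i) {
    const Px &p = pts[i];
    const Px &q = pts[(i + 1) % pts.size()];
    twice_area += p.u * q.v - q.u * p.v;
  }
  return std::fabs(twice_area) / 2.0;
}

// A line is kept when it is long enough in the image.
inline bool lineIsTracked(const Px &a, const Px &b, double min_line_length)
{
  return segmentLength(a, b) >= min_line_length;
}

// A face is kept when it covers a large enough area in the image.
inline bool faceIsTracked(const std::vector<Px> &pts, double min_polygon_area)
{
  return polygonArea(pts) >= min_polygon_area;
}
```

With the thresholds used in the first example (40 pixels and 500 pixel²), a face projected as a 20 by 20 pixels square would be discarded, while a 30 by 30 pixels square would be tracked.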

CAD model in cao format

Note: You may be interested to look at Tutorial: Markerless model-based tracker CAD model editor - GSoC 2017 project that will present some useful tools to handle more conveniently the custom .cao model file:

one Blender plugin to export a classical CAD model (Collada, Wavefront, Stl, ...) in the ViSP .cao format
one Blender plugin to import the ViSP .cao format into Blender
a Qt-based application to edit and view a .cao model file to check if the modeling is correct for instance

How to model faces from lines

The first thing to do is to declare the different points. Then you define each segment of the face with the index of its start point and the index of its end point. Finally, you define the face with the indices of the segments that constitute it.

Note: The way you declare the face segments (clockwise or counter clockwise) determines the direction of the normal of the face and thus influences the visibility of the face.

V1
# Left wing model
6                               # Number of points
# 3D points
-4     -3.8     0.7
-6     -8.8     0.2
-12   -21.7    -1.2
-9    -21.7    -1.2
 0.8   -8.8     0.2
 4.6   -3.8     0.7
# 3D lines
6                               # Number of lines
0 1                             # line 0
1 2
2 3
3 4
4 5
5 0                             # line 5
# Faces from 3D lines
1                               # Number of faces defined by lines
6 0 1 2 3 4 5                   # face 0: [number of lines] [index of the lines]...
# Faces from 3D points
0
# 3D cylinders
0
# 3D circles
0
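
To see why the declaration order matters, the stand-alone sketch below computes a polygon normal from its ordered vertices with Newell's method, a generic technique that is not necessarily the one used internally by ViSP. The P3 structure and the newellNormal() function are hypothetical helpers introduced for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct P3 {
  double x, y, z;
};

// Unit normal of a polygon given by its ordered vertices (Newell's method).
// Reversing the order of the vertices flips the normal.
inline P3 newellNormal(const std::vector<P3> &pts)
{
  assert(pts.size() >= 3);
  P3 n{ 0.0, 0.0, 0.0 };
  for (std::size_t i = 0; i < pts.size(); ++i) {
    const P3 &a = pts[i];
    const P3 &b = pts[(i + 1) % pts.size()];
    n.x += (a.y - b.y) * (a.z + b.z);
    n.y += (a.z - b.z) * (a.x + b.x);
    n.z += (a.x - b.x) * (a.y + b.y);
  }
  const double len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
  assert(len > 0.0); // degenerate polygons have no normal
  return { n.x / len, n.y / len, n.z / len };
}
```

Applied to the six points of the left wing in the order 0 to 5, this sketch yields a normal with a positive z component. Listing the same points in the reverse order gives the opposite normal, so the face would be seen from the other side.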

How to model cylinders

The first thing to do is to declare the two points defining the cylinder axis of revolution. Then you declare the cylinder with the index of the points that define the cylinder axis of revolution and with the cylinder radius.

Note: For the level of detail, in the case of a cylinder, the length of the axis of revolution is used to determine the visibility.

Example of a cylinder.

V1
# Cylinder model
2                 # Number of points
# 3D points
16.9 0 0.5        # point 0
-20  0 0.5        # point 1
# 3D lines
0
# Faces from 3D lines
0
# Faces from 3D points
0
# 3D cylinders
1                 # Number of cylinders
0 1 2.4           # cylinder 0: [1st point on revolution axis] [2nd point on revolution axis] [radius]
# 3D circles
0

How to model circles

The first thing to do is to declare three points: one point for the center of the circle and two points on the circle plane (i.e. not necessarily located on the perimeter of the circle but on the plane of the circle). Then you declare your circle with the radius and the indices of the three points.

Note: The way you declare the two points on the circle plane (clockwise or counter clockwise) determines the direction of the normal of the circle and thus influences the visibility of the circle. For the level of detail, in the case of a circle, the area of the bounding box of the circle is used to determine the visibility.

Example of a circle.

V1
# Circle model
3                    # Number of points
# 3D points
-3.4    14.6    1.1  # point 0
-3.4    15.4    1.1
-3.4    14.6    1.8  # point 2
# 3D lines
0
# Faces from 3D lines
0
# Faces from 3D points
0
# 3D cylinders
0
# 3D circles
1                    # Number of circles
0.8 0 2 1            # circle 0: [radius] [circle center] [1st point on circle plane] [2nd point on circle plane]

How to change cao model origin

By default, the cMo pose estimated by the tracker corresponds to the homogeneous transformation between the camera frame and the CAD model object frame. The tracker can consider an optional transformation matrix (currently only for .cao) to transform 3D points of the CAD model expressed in the original object frame to a desired object frame. Let us call this matrix oMod. When this matrix is introduced, the tracker estimates the homogeneous transformation between the camera and the modified CAD model object frame.

To this end, the vpMbGenericTracker::loadModel() function accepts this offset transformation matrix as an optional argument when loading a .cao model:

vpMbGenericTracker tracker;
vpHomogeneousMatrix oMod;
...
tracker.loadModel("teabox.cao", false, oMod);

How to create a hierarchical model

Instead of using one big model file containing all the declarations, it can be useful to define a complex model from several sub-models, each one describing a specific part of the complex model in its own model file. To create a hierarchical model, the first step is to define all the elementary parts and then to group them.

Example of a possible hierarchical modeling of a plane.

For example, if we want to have a model of a plane, we could represent as elementary parts the left and right wings, the tailplane (which is constituted of some other parts) and a cylinder for the plane fuselage. The following lines represent the top model of the plane.

V1
# header
# load the different parts of the plane
load("wings.cao")       # load the left and right wings
load("tailplane.cao")
# 3D points
2                       # Number of points
16.9 0 0.5
-20  0 0.5
# 3D lines
0
# Faces from 3D lines
0
# Faces from 3D points
0
# 3D cylinders
1                       # Number of cylinders
0 1 2.4                 # define the plane fuselage as a cylinder
# 3D circles
0

Note: The path to include another model can be expressed as an absolute or a relative path (relative to the file which includes the model).

How to set a name to a face

To exploit the name of a face in the code, see sections about Level of detail (LOD) and How not to consider specific polygons.

It can be useful to give a name to a face in a model in order to easily modify its LOD parameters, or to decide if you want to use this face or not during the tracking phase. This is done directly in the .cao model file. For instance, the following model shows how to set plane_fuselage as the name of the cylinder used to model the plane fuselage, and right reactor as the name of the corresponding plane reactor modeled as a circle:

V1
# header
# load the different parts of the plane
load("wings.cao")
load("tailplane.cao")
# 3D points
5                                    # Number of points
16.9    0   0.5
-20     0   0.5
-3.4    14.6    1.1
-3.4    15.4    1.1
-3.4    14.6    1.8
# 3D lines
0
# Faces from 3D lines
0
# Faces from 3D points
0
# 3D cylinders
1                                    # Number of cylinders
0 1 2.4     name=plane_fuselage
# 3D circles
1                                    # Number of circles
0.8 2 4 3   name="right reactor"

Note: If the name contains space characters, it must be surrounded by quotes. You can give a name to all the elements except points.

How to load cao model with transformation

In ViSP 3.2.1 we introduce the capability to load a .cao model with 3D translation and rotation. For the translation, values are expressed in meters. For the rotation, the representation is the $\theta_u$ axis-angle implemented in vpThetaUVector class. Values can be expressed in degrees or radians.

Let us model a square made with 4 lego 2x4 bricks as shown in the next image.

Instead of modeling all 4 bricks, we can just model one brick and use translation and rotation to model the others.

The .cao model of the 2x4 brick located in ./lego_parts/brick-2x4.cao is the following:

V1
#################################################################
# CAD model in cao format of a LEGO Element ID: 4211385 2x4 brick
# 2x4 brick size is 0.032m/0.016m/0.0096m
#################################################################
# 3D points
8                    # Number of points (here 8 brick corners)
0.000  0.000  0.0000 # Point 0: front face bottom/left
0.032  0.000  0.0000 # Point 1: front face bottom/right
0.032  0.000  0.0096 # Point 2: front face top/right
0.000  0.000  0.0096 # Point 3: front face top/left
0.000  0.016  0.0000 # Point 4: rear face bottom/left
0.032  0.016  0.0000 # Point 5: rear face bottom/right
0.032  0.016  0.0096 # Point 6: rear face top/right
0.000  0.016  0.0096 # Point 7: rear face top/left
# 3D lines
0                    # No 3D lines
# 3D faces from lines
0                    # No 3D faces from lines
# 3D faces from points
6                    # 6 faces to describe the brick
4 0 1 2 3            # Face 0: front face
4 3 2 6 7            # Face 1: top face
4 7 6 5 4            # Face 2: rear face
4 0 4 5 1            # Face 3: bottom face
4 0 3 7 4            # Face 4: left face
4 1 5 6 2            # Face 5: right face
# 3D cylinders
0
# 3D circle
0                    # No 3D circle

Now to model a square made with 4 bricks, we can reuse the same 2x4 brick model introducing translation and rotation. The corresponding model ./lego-square.cao is the following:

V1
################################################
# Construct a square made with 4 2x4 lego bricks
# 2x4 brick size is 0.032m/0.016m/0.0096m
################################################
# lego 1: the white in the previous image
load("lego_parts/brick-2x4.cao")
# lego 2: the yellow one
load("lego_parts/brick-2x4.cao", t=[0.048; 0.0; 0.0], tu=[0; 0; 90 deg]) # 0.048=0.032+0.016
# lego 3: the gray one
load("lego_parts/brick-2x4.cao", t=[0.016; 0.032; 0.0])
# lego 4: the blue one
load("lego_parts/brick-2x4.cao", t=[0.016; 0.016; 0.0], tu=[0; 0; 1.57 rad])
###############################################
# 3D points
0                    # No 3D points
# 3D lines
0                    # No 3D lines
# 3D faces from lines
0                    # No 3D faces from lines
# 3D faces from points
0                    # No 3D faces from points
# 3D cylinders
0                    # No 3D cylinders
# 3D circle
0                    # No 3D circle
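
To get a feeling of what the t and tu arguments do, the stand-alone sketch below applies a $\theta_u$ rotation (using Rodrigues' rotation formula) followed by a translation to a 3D point. It assumes that a point p of the loaded sub-model is mapped to R p + t, where R is the rotation built from tu; this convention, as well as the rotateThetaU() and applyLoadTransform() helpers, are assumptions made for this example and not ViSP functions.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Rotate p by the axis-angle vector tu (direction = axis, norm = angle in radians)
// using Rodrigues' rotation formula.
inline Vec3 rotateThetaU(const Vec3 &tu, const Vec3 &p)
{
  const double theta = std::sqrt(tu[0] * tu[0] + tu[1] * tu[1] + tu[2] * tu[2]);
  assert(std::isfinite(theta));
  if (theta < 1e-12) {
    return p; // no rotation
  }
  const Vec3 k = { tu[0] / theta, tu[1] / theta, tu[2] / theta };
  const double c = std::cos(theta);
  const double s = std::sin(theta);
  const Vec3 kxp = { k[1] * p[2] - k[2] * p[1], k[2] * p[0] - k[0] * p[2], k[0] * p[1] - k[1] * p[0] };
  const double kdp = k[0] * p[0] + k[1] * p[1] + k[2] * p[2];
  Vec3 r = { 0.0, 0.0, 0.0 };
  for (int i = 0; i < 3; ++i) {
    r[i] = p[i] * c + kxp[i] * s + k[i] * kdp * (1.0 - c);
  }
  return r;
}

// Assumed convention: rotate first, then translate (p' = R p + t), in meters.
inline Vec3 applyLoadTransform(const Vec3 &t, const Vec3 &tu, const Vec3 &p)
{
  const Vec3 r = rotateThetaU(tu, p);
  return { r[0] + t[0], r[1] + t[1], r[2] + t[2] };
}
```

Under this assumption, with the values used for lego 2 above (t=[0.048; 0.0; 0.0] and a rotation of 90 degrees around the z axis), the brick corner (0.032, 0, 0) would be moved to about (0.048, 0.032, 0).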

The corresponding ./lego-square.init file could contain the following coordinates of the points

4
0.000  0.000  0.0096 # Point 1 corresponding to lego 1 front face top/left
0.032  0.000  0.0096 # Point 2 corresponding to lego 2 front face top/left
0.032  0.000  0.0000 # Point 3 corresponding to lego 2 front face bottom/left
0.032  0.016  0.0000 # Point 4 corresponding to lego 2 front face bottom/right

Now to see how to track this object you can jump to Test tracker on lego model with a live camera section.

How to tune the level of detail

As explained in section Level of detail (LOD), the LOD parameters can be set in the source code. They can also be set directly in the configuration file or in the CAD model in cao format.

The following lines show the content of the configuration file:

<?xml version="1.0"?>
<conf>
  <lod>
    <use_lod>1</use_lod>
    <min_line_length_threshold>40</min_line_length_threshold>
    <min_polygon_area_threshold>150</min_polygon_area_threshold>
  </lod>
</conf>

In the CAD model file, you can specify the LOD settings for the desired elements:

V1
# header
# load the different parts of the plane
load("wings.cao")
load("tailplane.cao")
# 3D points
5               # number of points
16.9    0   0.5
-20     0   0.5
-3.4    14.6    1.1
-3.4    15.4    1.1
-3.4    14.6    1.8
# 3D lines
0
# Faces from 3D lines
0
# Faces from 3D points
0
# 3D cylinders
1                               # Number of cylinders
0 1 2.4 name=plane_fuselage useLod=true minLineLengthThreshold=100.0
# 3D circles
1                               # Number of circles
0.8 2 4 3   name="right reactor" useLod=true minPolygonAreaThreshold=40.0

Note: The order in which you call the methods that load the configuration file and the CAD model affects the resulting LOD parameters. Basically, the LOD settings expressed in the configuration file apply to all the elements of the CAD model, while the LOD settings expressed in the CAD model are specific to an element. The natural order is to load the configuration file first and the CAD model afterwards.

CAD model in vrml format

How to set a name to a face

To exploit the name of a face in the code, see sections about Level of detail (LOD) and How not to consider specific polygons.

When using a vrml file, names are associated with shapes. In the example below (the model of a teabox), as only one shape is defined, all its faces will have the same name: "teabox_name".

Note: If you want to set different names for different faces, you have to define them in different shapes.

#VRML V2.0 utf8
DEF fst_0 Group {
children [
# Object "teabox"
DEF teabox_name Shape {
geometry DEF cube IndexedFaceSet {
coord Coordinate { 
point [
0     0      0   ,
0     0     -0.08,
0.165 0     -0.08,
0.165 0      0   ,
0.165 0.068  0   ,
0.165 0.068 -0.08,
0     0.068 -0.08,
0     0.068  0    ]
}
coordIndex [
 0,1,2,3,-1,
 1,6,5,2,-1,
 4,5,6,7,-1,
 0,3,4,7,-1,
 5,4,3,2,-1,
 0,7,6,1,-1]}
}
]
}

How not to consider specific polygons

When using model-based trackers, it is possible to exclude specific faces from edge, keypoint or depth feature tracking. To do so, the faces concerned must be named as described in How to set a name to a face.

If you want to enable (default behavior) or disable the edge tracking on specific face it can be done via:

vpMbGenericTracker::setUseEdgeTracking("name of the face", boolean);

If the boolean is set to false, the tracking of the edges of the faces that have the given name will be disabled. If it is set to true (default behavior), it will be enabled.

As for the edge tracking, the same functionality is also available when using keypoints via:

vpMbGenericTracker::setUseKltTracking("name of the face", boolean);

For the depth feature, this functionality is also available via:

vpMbGenericTracker::setUseDepthDenseTracking("name of the face", boolean);

or

vpMbGenericTracker::setUseDepthNormalTracking("name of the face", boolean);

How to save tracking results without vpDisplay

The classical way to save tracking results is to use vpDisplay::getImage() as in tutorial-export-image.cpp: the display window is exploited to get the image and all the drawings in overlay in an RGBa image, and the resulting image is then saved.

But on embedded systems, it is not always possible to open a display window with vpDisplayX, vpDisplayOpenCV or vpDisplayGDI to view real-time tracking results. That's why we propose to use vpMbGenericTracker::getModelForDisplay() and vpMbGenericTracker::getFeaturesForDisplay() in conjunction with vpImageDraw to draw tracking results directly in an image that could be saved during tracking.

The following code from testGenericTracker.cpp shows how to draw the CAD model in an image named resultsColor or resultsDepth:

        std::map<std::string, std::vector<std::vector<double> > > mapOfModels;
        std::map<std::string, unsigned int> mapOfW;
        mapOfW["Camera1"] = I.getWidth();
        mapOfW["Camera2"] = I_depth.getWidth();
        std::map<std::string, unsigned int> mapOfH;
        mapOfH["Camera1"] = I.getHeight();
        mapOfH["Camera2"] = I_depth.getHeight();
        std::map<std::string, vpHomogeneousMatrix> mapOfcMos;
        mapOfcMos["Camera1"] = cMo;
        mapOfcMos["Camera2"] = depth_M_color*cMo;
        std::map<std::string, vpCameraParameters> mapOfCams;
        mapOfCams["Camera1"] = cam_color;
        mapOfCams["Camera2"] = cam_depth;
        tracker.getModelForDisplay(mapOfModels, mapOfW, mapOfH, mapOfcMos, mapOfCams);
        for (std::map<std::string, std::vector<std::vector<double> > >::const_iterator it = mapOfModels.begin();
             it != mapOfModels.end(); ++it) {
          for (size_t i = 0; i < it->second.size(); i++) {
            // test if it->second[i][0] = 0
            if (std::fabs(it->second[i][0]) <= std::numeric_limits<double>::epsilon()) {
              vpImageDraw::drawLine(it->first == "Camera1" ? resultsColor : resultsDepth, vpImagePoint(it->second[i][1], it->second[i][2]),
                                    vpImagePoint(it->second[i][3], it->second[i][4]), vpColor::red, 3);
            }
          }
        }
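
The loop above only draws the rows whose first value is 0, and uses the next four values as the two end points of a segment. The following is a minimal sketch of that filtering step, written with the standard library only so that it can be tried without ViSP. Segment and extractSegments() are hypothetical names introduced for illustration, and the row layout (type, i1, j1, i2, j2) is an assumption taken from the snippet above, not a full description of what vpMbGenericTracker::getModelForDisplay() may return.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper type: a segment given by its two end points (row i, column j).
struct Segment {
  double i1, j1, i2, j2;
};

// Keep only the rows that describe a line, i.e. rows whose first value is 0,
// mirroring the test done before vpImageDraw::drawLine() in the snippet above.
// Assumed row layout: [type, i1, j1, i2, j2].
inline std::vector<Segment> extractSegments(const std::vector<std::vector<double> > &rows)
{
  std::vector<Segment> segments;
  for (std::size_t k = 0; k < rows.size(); k++) {
    const std::vector<double> &r = rows[k];
    if (r.size() < 5) {
      continue; // Not enough values to describe a segment
    }
    if (std::fabs(r[0]) <= std::numeric_limits<double>::epsilon()) {
      Segment s = { r[1], r[2], r[3], r[4] };
      segments.push_back(s);
    }
  }
  return segments;
}
```

Each returned segment could then be passed to vpImageDraw::drawLine() as two vpImagePoint, exactly as in the loop above.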

There is also the possibility to draw the tracked features in the same images:

        std::map<std::string, std::vector<std::vector<double> > > mapOfFeatures;
        tracker.getFeaturesForDisplay(mapOfFeatures);
        for (std::map<std::string, std::vector<std::vector<double> > >::const_iterator it = mapOfFeatures.begin();
             it != mapOfFeatures.end(); ++it) {
          for (size_t i = 0; i < it->second.size(); i++) {
            if (std::fabs(it->second[i][0]) <= std::numeric_limits<double>::epsilon()) { // test it->second[i][0] = 0 for ME
              vpColor color = vpColor::yellow;
              if (std::fabs(it->second[i][3]) <= std::numeric_limits<double>::epsilon()) { // test it->second[i][3] = 0
                color = vpColor::green;
              } else if (std::fabs(it->second[i][3] - 1) <= std::numeric_limits<double>::epsilon()) { // test it->second[i][3] = 1
                color = vpColor::blue;
              } else if (std::fabs(it->second[i][3] - 2) <= std::numeric_limits<double>::epsilon()) { // test it->second[i][3] = 2
                color = vpColor::purple;
              } else if (std::fabs(it->second[i][3] - 3) <= std::numeric_limits<double>::epsilon()) { // test it->second[i][3] = 3
                color = vpColor::red;
              } else if (std::fabs(it->second[i][3] - 4) <= std::numeric_limits<double>::epsilon()) { // test it->second[i][3] = 4
                color = vpColor::cyan;
              }
              vpImageDraw::drawCross(it->first == "Camera1" ? resultsColor : resultsDepth, vpImagePoint(it->second[i][1], it->second[i][2]),
                                     3, color, 1);
            } else if (std::fabs(it->second[i][0] - 1) <= std::numeric_limits<double>::epsilon()) { // test it->second[i][0] = 1 for KLT
              vpImageDraw::drawCross(it->first == "Camera1" ? resultsColor : resultsDepth, vpImagePoint(it->second[i][1], it->second[i][2]),
                                     10, vpColor::red, 1);
            }
          }
        }
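
The chain of tests on it->second[i][3] above maps a moving-edge state value to a drawing color. As a sketch only, the same mapping can be expressed with a small lookup table. The helper name meStateColorName() is hypothetical, and it returns color names as strings so that it can be compiled without ViSP; the meaning of each state value is not covered here, only the color association used above.

```cpp
#include <cmath>
#include <limits>
#include <string>

// Hypothetical helper: name of the color used above for a moving-edge state value
// (the value stored at index 3 of a feature row whose index 0 is 0).
inline std::string meStateColorName(double state)
{
  static const char *const names[] = { "green", "blue", "purple", "red", "cyan" };
  for (int k = 0; k < 5; k++) {
    // Same tolerance test as in the snippet above: state == k
    if (std::fabs(state - k) <= std::numeric_limits<double>::epsilon()) {
      return names[k];
    }
  }
  return "yellow"; // Default color for any other value
}
```

With this helper, a state of 3 gives "red" and any value outside 0 to 4 falls back to "yellow", as in the original if/else chain.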

The images can then be saved in a file:

        char buffer[256];
        std::ostringstream oss;
        oss << "results/image_%04d.png";
        sprintf(buffer, oss.str().c_str(), cpt_frame);
        results.insert(resultsColor, vpImagePoint());
        results.insert(resultsDepth, vpImagePoint(0, resultsColor.getWidth()));
        vpImageIo::write(results, buffer);
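
The file name above is built with sprintf(), which does not check the size of the destination buffer. A possible variant, given here as a sketch, uses std::snprintf() and returns a std::string. The helper name resultFileName() is hypothetical, and the frame counter is assumed to be an unsigned integer.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: build "results/image_NNNN.png" for a given frame index.
// std::snprintf() never writes more than sizeof(buffer) bytes, unlike sprintf().
inline std::string resultFileName(unsigned int frame)
{
  char buffer[256];
  std::snprintf(buffer, sizeof(buffer), "results/image_%04u.png", frame);
  return std::string(buffer);
}
```

The returned string could then be passed to vpImageIo::write() in place of buffer.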

Use case

Hereafter we provide the information to test the tracker with different objects.

Test tracker on teabox model

Enter model-based tracker tutorial build folder:

$ cd $VISP_WS/visp-build/tutorial/tracking/model-based/generic

There you will find the tutorial-mb-generic-tracker-full binary, which corresponds to the build of the tutorial-mb-generic-tracker-full.cpp example. This example is an extension of tutorial-mb-generic-tracker.cpp that was explained in the Getting started section.

To see the available options, run:

$ ./tutorial-mb-generic-tracker-full --help

By default, all the parameters are set to work with the teabox example. Just run the binary without any option:

$ ./tutorial-mb-generic-tracker-full

You may also obtain the same results using:

$ ./tutorial-mb-generic-tracker-full --video model/teabox/teabox.mpg --model model/teabox/teabox.cao

Test tracker on CubeSAT satellite model

In http://visp-doc.inria.fr/download/mbt-model/cubesat.zip you will find the model data set (.obj, .cao, .init, .xml, .ppm) and a video to test the CubeSAT object tracking. After unzipping it in a folder (let's say $VISP_WS), you may run the tracker with something similar to:

$ cd $VISP_WS
$ wget http://visp-doc.inria.fr/download/mbt-model/cubesat.zip
$ unzip cubesat.zip
$ cd $VISP_WS/visp-build/tutorial/tracking/model-based/generic
$ ./tutorial-mb-generic-tracker-full --video $VISP_WS/cubesat/video/00%2d.png --model $VISP_WS/cubesat/cubesat1b.cao

You should be able to obtain this kind of result:

Test tracker on mmicro model

In http://visp-doc.inria.fr/download/mbt-model/mmicro.zip you will find the model data set (.cao, .wrl, .init, .xml, .ppm) and a video to track the mmicro object. After unzipping it in a folder (let's say $VISP_WS), you may run the tracker with something similar to:

$ cd $VISP_WS
$ wget http://visp-doc.inria.fr/download/mbt-model/mmicro.zip
$ unzip mmicro.zip
$ cd $VISP_WS/visp-build/tutorial/tracking/model-based/generic
$ ./tutorial-mb-generic-tracker-full --video $VISP_WS/mmicro/video/mmicro00%2d.png --model $VISP_WS/mmicro/mmicro.cao

You should be able to obtain this kind of result:

Test tracker on lego model with a live camera

All the previous examples (teabox, CubeSAT, mmicro) were working with videos. We provide tutorial-mb-generic-tracker-live.cpp, which lets you use a camera to test the tracker live.

This example tries to use the first available grabber among those guarded by the macros VISP_HAVE_V4L2, VISP_HAVE_DC1394, VISP_HAVE_CMU1394, VISP_HAVE_FLYCAPTURE, VISP_HAVE_REALSENSE2 and VISP_HAVE_OPENCV.

By default, if you are on an Ubuntu-like system that has the libv4l-dev package installed, you should be able to grab images from a webcam without modifying the code.

To select another grabber that corresponds to your camera, you have to uncomment some lines at the beginning of tutorial-mb-generic-tracker-live.cpp:

//#undef VISP_HAVE_V4L2
//#undef VISP_HAVE_DC1394
//#undef VISP_HAVE_CMU1394
//#undef VISP_HAVE_FLYCAPTURE
//#undef VISP_HAVE_REALSENSE2
//#undef VISP_HAVE_OPENCV

For example, to force the usage of the vpRealSense2 class, which grabs images from an Intel RealSense device such as the D435 or SR300, modify the code as follows:

$ cd $VISP_WS/visp/tutorial/tracking/model-based/generic
$ gedit tutorial-mb-generic-tracker-live.cpp
#undef VISP_HAVE_V4L2
#undef VISP_HAVE_DC1394
#undef VISP_HAVE_CMU1394
#undef VISP_HAVE_FLYCAPTURE
//#undef VISP_HAVE_REALSENSE2

Once modified, enter the build folder and build the tutorial:

$ cd $VISP_WS/visp-build/tutorial/tracking/model-based/generic
$ make tutorial-mb-generic-tracker-live

Let us now consider the object made with 4 lego 2x4 bricks described in How to load cao model with transformation.

Once built, to get the usage, run:

$ ./tutorial-mb-generic-tracker-live --help

To test the tracker on the lego-square object, run:

$ ./tutorial-mb-generic-tracker-live --model model/lego-square/lego-square.cao

You should be able to obtain this kind of result:

Now, if the tracker is working, you can learn the object by running:

$ ./tutorial-mb-generic-tracker-live --model model/lego-square/lego-square.cao --learn

Once the tracker is initialized by 4 user clicks, use the left click to learn on one or two images and then the right click to quit. In the terminal you should see output like:

...
Data learned
Data learned
Save learning from 2 images in file: learning/data-learned.bin

You can now use this learning to automate tracker initialization with:

$ ./tutorial-mb-generic-tracker-live --model model/lego-square/lego-square.cao --auto_init

Known issues

Model-based trackers examples are not working with Ogre visibility check

If you run the mbtEdgeTracking.cpp, mbtKltTracking.cpp or mbtEdgeKltTracking.cpp examples with the Ogre visibility check enabled (using the "-o" option), you may encounter the following issue:

C:\> mbtEdgeTracking.exe -c -o
...
OGRE EXCEPTION(6:FileNotFoundException): Cannot locate resource VTFInstancing.cg in resource group General
...
    Initializing OIS ***

followed by a runtime error as in the next image:

img-win8.1-msvc-mbtracker-ogre-issue.jpg

This may mean that the Ogre version is not compatible with DirectX 11. This can be checked by adding the "-w" option to the command line:

C:\> mbtEdgeTracking.exe -c -o -w

Now the binary should open the Ogre configuration window, where you have to select "OpenGL Rendering Subsystem" instead of "Direct3D11 Rendering Subsystem". Then press OK to continue and start the tracking of the cube.

img-win8.1-msvc-mbtracker-ogre-opengl.jpg

Model-based trackers tutorials are not working with Ogre visibility check

This issue is similar to Model-based trackers examples are not working with Ogre visibility check. It may occur with tutorial-mb-edge-tracker.cpp, tutorial-mb-klt-tracker.cpp and tutorial-mb-hybrid-tracker.cpp. To make the tutorials work:

- Modify the code like:
  tracker.setOgreVisibilityTest(true);
  tracker.setOgreShowConfigDialog(true);
- Build the modified tutorial.
- Run the binary. It should now open the Ogre configuration window, where you have to select an Ogre renderer that works on your computer. Then press OK to continue and start the tracking of the object.

Next tutorial

If you have a webcam, you are now ready to experiment with the generic model-based tracker on a cube that has an AprilTag on one face, following Tutorial: Markerless generic model-based tracking using AprilTag for initialization (use case).

There is also Tutorial: Object detection and localization to learn how to initialize the tracker without user click, by learning the object to track using keypoints when the object is textured. See Tutorial: Markerless generic model-based tracking using a stereo camera if you want to know how to extend the tracker to use a stereo camera, or Tutorial: Markerless generic model-based tracking using a RGB-D camera if you want to extend the tracking by using depth as visual features. Finally, there is Tutorial: Template tracking.
