Pose estimation for augmented reality
Pose from markerless model-based tracking

Introduction

Various authors have proposed formulations of the pose estimation problem that require neither markers nor a keypoint matching process.

ViSP provides such an algorithm. It tracks an object using its CAD model. The objects considered should be modeled by lines, circles or cylinders. The model of the object can be defined in vrml format or in cao format. The markerless model-based tracker considered here can handle moving-edges behind the contour of the model.

The video below shows the result of a tea box model-based tracking.

Source code

The following source code, also available in pose-mbt-visp.cpp, computes the pose of the camera with respect to the object that is tracked.

#include <iostream>
#include <string>

#include <visp/vpDisplayGDI.h>
#include <visp/vpDisplayOpenCV.h>
#include <visp/vpDisplayX.h>
#include <visp/vpImageIo.h>
#include <visp/vpIoTools.h>
#include <visp/vpMbEdgeTracker.h>
#include <visp/vpVideoReader.h>

int main(int argc, char** argv)
{
#if (defined(VISP_HAVE_OPENCV) && (VISP_HAVE_OPENCV_VERSION >= 0x020100)) || defined(VISP_HAVE_FFMPEG)
  try {
    std::string videoname = "teabox.mpg";

    for (int i=0; i<argc; i++) {
      // Read the value only if "--name" is followed by another argument
      if (std::string(argv[i]) == "--name" && i+1 < argc)
        videoname = std::string(argv[i+1]);
      else if (std::string(argv[i]) == "--help") {
        std::cout << "\nUsage: " << argv[0] << " [--name <video name>] [--help]\n" << std::endl;
        return 0;
      }
    }

    // The config files share the video name without its extension
    std::string parentname = vpIoTools::getParent(videoname);
    std::string objectname = vpIoTools::getNameWE(videoname);
    if(! parentname.empty())
      objectname = parentname + "/" + objectname;

    std::cout << "Video name: " << videoname << std::endl;
    std::cout << "Tracker requested config files: " << objectname
              << ".[init,"
#ifdef VISP_HAVE_XML2
              << "xml,"
#endif
              << "cao or wrl]" << std::endl;
    std::cout << "Tracker optional config files: " << objectname << ".[ppm]" << std::endl;

    vpImage<unsigned char> I;
    vpCameraParameters cam;
    vpHomogeneousMatrix cTw;

    // Open the video and read the first image in I
    vpVideoReader g;
    g.setFileName(videoname);
    g.open(I);

#if defined(VISP_HAVE_X11)
    vpDisplayX display(I,100,100,"Model-based edge tracker");
#elif defined(VISP_HAVE_GDI)
    vpDisplayGDI display(I,100,100,"Model-based edge tracker");
#elif defined(VISP_HAVE_OPENCV)
    vpDisplayOpenCV display(I,100,100,"Model-based edge tracker");
#else
    std::cout << "No image viewer is available..." << std::endl;
#endif

    vpMbEdgeTracker tracker;
    bool usexml = false;
#ifdef VISP_HAVE_XML2
    // First way: read the tracker settings from an XML file
    if(vpIoTools::checkFilename(objectname + ".xml")) {
      tracker.loadConfigFile(objectname + ".xml");
      usexml = true;
    }
#endif
    // Second way: set the moving-edges and camera parameters in the code
    if (! usexml) {
      vpMe me;
      me.setMaskSize(5);
      me.setMaskNumber(180);
      me.setRange(8);
      me.setThreshold(10000);
      me.setMu1(0.5);
      me.setMu2(0.5);
      me.setSampleStep(4);
      me.setNbTotalSample(250);
      tracker.setMovingEdge(me);
      cam.initPersProjWithoutDistortion(839, 839, 325, 243);
      tracker.setCameraParameters(cam);
    }

    // Load the CAD model, preferring the cao format over vrml
    if(vpIoTools::checkFilename(objectname + ".cao"))
      tracker.loadModel(objectname + ".cao");
    else if(vpIoTools::checkFilename(objectname + ".wrl"))
      tracker.loadModel(objectname + ".wrl");
    tracker.setDisplayFeatures(true);
    tracker.initClick(I, objectname + ".init", true);

    while(! g.end()){
      g.acquire(I);
      vpDisplay::display(I);
      tracker.track(I);
      tracker.getPose(cTw);
      tracker.getCameraParameters(cam);
      tracker.display(I, cTw, cam, vpColor::red, 2);
      vpDisplay::displayFrame(I, cTw, cam, 0.025, vpColor::none, 3);
      vpDisplay::displayText(I, 10, 10, "A click to exit...", vpColor::red);
      vpDisplay::flush(I);
      // Non-blocking click detection
      if (vpDisplay::getClick(I, false))
        break;
    }
    vpDisplay::getClick(I);
  }
  catch(const vpException &e) {
    std::cout << "Catch an exception: " << e << std::endl;
  }
#else
  (void)argc;
  (void)argv;
  std::cout << "Install OpenCV or ffmpeg and rebuild ViSP to use this example." << std::endl;
#endif
}

Source code explained

Hereafter is the description of the most important lines in this example.

#include <visp/vpMbEdgeTracker.h>

Here we include the header of the vpMbEdgeTracker class, which tracks an object from its CAD model using moving-edges. The tracker uses image I and the intrinsic camera parameters cam as input.

vpImage<unsigned char> I;
vpCameraParameters cam;

As output, it will estimate cTw, the pose of the object in the camera frame.

vpHomogeneousMatrix cTw;
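
To make the role of cTw more concrete, here is a minimal sketch of how a homogeneous matrix maps a 3D point expressed in the object frame to the camera frame. It uses plain arrays instead of vpHomogeneousMatrix; the names Vec3, Mat4 and transformPoint are assumptions made for this illustration and are not part of ViSP.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for illustration; ViSP itself uses vpHomogeneousMatrix.
using Vec3 = std::array<double, 3>;
using Mat4 = std::array<std::array<double, 4>, 4>;

// Returns cP = cTw * wP, where wP is a 3D point in the object frame and
// cTw = [R t; 0 0 0 1] holds a rotation R and a translation t.
Vec3 transformPoint(const Mat4 &cTw, const Vec3 &wP)
{
  // The last row of a homogeneous transformation must be [0 0 0 1].
  assert(cTw[3][0] == 0.0 && cTw[3][1] == 0.0 && cTw[3][2] == 0.0 && cTw[3][3] == 1.0);
  Vec3 cP{};
  for (std::size_t i = 0; i < 3; ++i) {
    cP[i] = cTw[i][0] * wP[0] + cTw[i][1] * wP[1] + cTw[i][2] * wP[2] + cTw[i][3];
  }
  return cP;
}
```

With an identity rotation and a translation of 0.5 m along the z-axis, for instance, every vertex of the tea box ends up 0.5 m further along the camera z-axis.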

Once the first image of the video teabox.mpg is loaded in I, a window is created and initialized with image I. Then we create an instance of the tracker.

vpMbEdgeTracker tracker;

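There are then two different ways to set the tracker parameters. If ViSP was built with XML support (VISP_HAVE_XML2 defined) and the file teabox.xml exists, the settings are read from this file with loadConfigFile(). Otherwise, the moving-edges settings and the camera parameters are set directly in the source code with setMovingEdge() and setCameraParameters().

Now we are ready to load the CAD model of the object. ViSP supports CAD models in cao format or in vrml format. The cao format is specific to ViSP. Unlike the vrml format, which requires the Coin third-party library, it does not require any additional third-party library.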

First we check if the file exists, then we load the cad model in cao format with:

if(vpIoTools::checkFilename(objectname + ".cao"))
tracker.loadModel(objectname + ".cao");

The file teabox.cao first describes the vertices of the box, then the edges that correspond to the faces. A more complete description of this file is provided in CAD model in cao format. The next figure gives the index of the vertices that are defined in teabox.cao.

If the CAD model in cao format doesn't exist, we then check whether it exists in vrml format before loading it:

else if(vpIoTools::checkFilename(objectname + ".wrl"))
tracker.loadModel(objectname + ".wrl");

As with the cao format, teabox.wrl first describes the vertices of the box, then the edges that correspond to the faces. A more complete description of this file is provided in CAD model in vrml format.

img-teabox-cao.jpg
Index of the vertices used to model the tea box in cao format.

Once the model of the object to track is loaded, the next line enables the display of additional overlay drawings in the image window, such as the positions of the moving edges:

tracker.setDisplayFeatures(true);

Now we have to initialize the tracker. With the next line we choose to use a user interaction.

tracker.initClick(I, objectname + ".init", true);

The user has to click in the image on four vertices whose 3D coordinates are defined in the "teabox.init" file. The following image shows where the user has to click.

img-teabox-click.jpg
Image "teabox.ppm" used to help the user to initialize the tracker.

Matched 2D and 3D coordinates are then used to compute an initial pose that initializes the tracker. Note also that the third optional argument "true" enables the display of an image that may help the user during initialization. The name of this image is the same as the "*.init" file, except for the extension, which should be ".ppm". In our case it will be "teabox.ppm".

The content of the teabox.init file, which defines the 3D coordinates of some points of the model used during user initialization, is provided hereafter. Note that all the characters after the character '#' are considered as comments.

4 # Number of points
0 0 0 # Point 0
0.165 0 0 # Point 3
0.165 0 -0.08 # Point 2
0.165 0.068 -0.08 # Point 5

The meaning of each line of this file is the following: the first line gives the number of points, and each of the following lines gives the X, Y and Z coordinates of one point of the model.

Here the user has to click on vertices 0, 3, 2 and 5 in the window that displays image I. From the 3D coordinates defined in teabox.init and the corresponding 2D coordinates of the vertices obtained by user interaction, a pose is computed that is then used to initialize the tracker.
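
The layout of the "*.init" file can be illustrated with a small reader. The following is only a sketch written for this tutorial, assuming the layout shown above (a point count, then one X Y Z triple per point, with '#' comments); parseInitPoints and Point3 are hypothetical names and this is not how initClick() is implemented in ViSP.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

using Point3 = std::array<double, 3>;

// Parses the text of a "*.init" file laid out as teabox.init.
// Hypothetical helper, not a ViSP function.
std::vector<Point3> parseInitPoints(const std::string &text)
{
  // Drop everything from '#' to the end of each line.
  std::istringstream lines(text);
  std::ostringstream cleaned;
  std::string line;
  while (std::getline(lines, line)) {
    cleaned << line.substr(0, line.find('#')) << '\n';
  }

  std::istringstream in(cleaned.str());
  std::size_t count = 0;
  in >> count;
  std::vector<Point3> points;
  for (std::size_t i = 0; i < count; ++i) {
    Point3 p{};
    in >> p[0] >> p[1] >> p[2];
    points.push_back(p);
  }
  // A short or malformed file leaves the stream in a failed state.
  assert(!in.fail());
  return points;
}
```

Applied to the content of teabox.init, the function returns the four points in the order in which the user has to click on them.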

Next, in the while loop that runs until the end of the video, after acquiring and displaying the next image, we track the object in the new image I.

tracker.track(I);

The result of the tracking is a pose cTw that can be obtained by:

tracker.getPose(cTw);

The next lines first retrieve the camera parameters used by the tracker, then display the visible part of the CAD model using red lines with a thickness of 2, and finally display the object frame at the estimated pose cTw. Each axis of the frame is 0.025 meters long. Using vpColor::none indicates that the x-axis is displayed in red, the y-axis in green and the z-axis in blue. The thickness of the axes is 3.

tracker.getCameraParameters(cam);
tracker.display(I, cTw, cam, vpColor::red, 2);
vpDisplay::displayFrame(I, cTw, cam, 0.025, vpColor::none, 3);
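
Drawing the model and the frame in the image relies on projecting 3D points, once expressed in the camera frame, to pixel coordinates. The sketch below illustrates the perspective projection without distortion, with the four parameters given to initPersProjWithoutDistortion() in the example. The PinholeCamera structure and the projectToPixel function are assumptions made for this illustration; ViSP provides its own classes for this.

```cpp
#include <array>
#include <cassert>

// Hypothetical pinhole camera without distortion; the fields mirror the four
// values passed to cam.initPersProjWithoutDistortion(px, py, u0, v0).
struct PinholeCamera {
  double px, py; // ratio between the focal length and the pixel size
  double u0, v0; // principal point, in pixels
};

// Projects a 3D point (X, Y, Z), expressed in the camera frame, to pixel
// coordinates (u, v).
std::array<double, 2> projectToPixel(const PinholeCamera &cam, double X, double Y, double Z)
{
  assert(Z > 0.0); // the point must be in front of the camera
  const double x = X / Z; // normalized image-plane coordinates
  const double y = Y / Z;
  return {{cam.u0 + cam.px * x, cam.v0 + cam.py * y}};
}
```

With the parameters of the example (839, 839, 325, 243), a point on the optical axis projects to the principal point (325, 243), whatever its depth.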

Object modeling

CAD model in cao format

The cao format is specific to ViSP. It describes the CAD model of an object using a text file with extension .cao. The content of the file teabox.cao used in this example is given here:

V1
# 3D Points
8 # Number of points
0 0 0 # Point 0: X Y Z
0 0 -0.08
0.165 0 -0.08
0.165 0 0
0.165 0.068 0
0.165 0.068 -0.08
0 0.068 -0.08
0 0.068 0 # Point 7
# 3D Lines
0 # Number of lines
# Faces from 3D lines
0 # Number of faces
# Faces from 3D points
6 # Number of faces
4 0 1 2 3 # Face 0: [number of points] [index of the 3D points]...
4 1 6 5 2
4 4 5 6 7
4 0 3 4 7
4 5 4 3 2
4 0 7 6 1 # Face 5
# 3D cylinders
0 # Number of cylinders
# 3D circles
0 # Number of circles

This file describes the model of the tea box corresponding to the next image:

img-teabox-cao.jpg
Index of the vertices used to model the tea box in cao format.

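We chose to describe the faces of the box from the 3D points that correspond to the vertices. The file starts with the header V1, followed by the number of 3D points and the X Y Z coordinates of each point. Then come the sections for the 3D lines, the faces defined from 3D lines, the faces defined from 3D points, the 3D cylinders and the 3D circles, each starting with its number of elements. Each face defined from 3D points is given by its number of points followed by the indexes of these points. Note that the characters after the '#' are considered as comments.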
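
To illustrate how the sections of teabox.cao follow each other, here is a minimal reader restricted to the case used in this example: no 3D lines, no faces from 3D lines, and faces defined from 3D points. It is a sketch under these assumptions, not the parser used by loadModel(); CaoModel and parseCao are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the parts of a .cao file used by teabox.cao.
struct CaoModel {
  std::vector<std::array<double, 3>> points;
  std::vector<std::vector<std::size_t>> faces; // point indexes of each face
};

// Minimal reader for a file laid out like teabox.cao. The trailing sections
// (3D cylinders, 3D circles) are ignored.
CaoModel parseCao(const std::string &text)
{
  // Strip '#' comments line by line.
  std::istringstream lines(text);
  std::ostringstream cleaned;
  std::string line;
  while (std::getline(lines, line)) {
    cleaned << line.substr(0, line.find('#')) << '\n';
  }

  std::istringstream in(cleaned.str());
  std::string header;
  in >> header;
  assert(header == "V1");

  CaoModel model;
  std::size_t nbPoints = 0;
  in >> nbPoints;
  for (std::size_t i = 0; i < nbPoints; ++i) {
    std::array<double, 3> p{};
    in >> p[0] >> p[1] >> p[2];
    model.points.push_back(p);
  }

  std::size_t nbLines = 0, nbLineFaces = 0;
  in >> nbLines >> nbLineFaces;
  assert(nbLines == 0 && nbLineFaces == 0); // restriction of this sketch

  std::size_t nbFaces = 0;
  in >> nbFaces;
  for (std::size_t f = 0; f < nbFaces; ++f) {
    std::size_t nbIndexes = 0;
    in >> nbIndexes;
    std::vector<std::size_t> face(nbIndexes);
    for (std::size_t &index : face) {
      in >> index;
      assert(index < model.points.size()); // every index must name a point
    }
    model.faces.push_back(face);
  }
  assert(!in.fail());
  return model;
}
```

Applied to the content of teabox.cao, the reader returns 8 points and 6 faces of 4 points each.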

CAD model in vrml format

ViSP supports the vrml format only if the Coin third-party library is installed. This format describes the CAD model of an object using a text file with extension .wrl. The content of the teabox.wrl file used in this example is given hereafter. It should be compared with the content of teabox.cao described in CAD model in cao format.

#VRML V2.0 utf8

DEF fst_0 Group {
  children [

    # Object "teabox"
    Shape {

      geometry DEF cube IndexedFaceSet {

        coord Coordinate {
          point [
            0 0 0 ,
            0 0 -0.08,
            0.165 0 -0.08,
            0.165 0 0 ,
            0.165 0.068 0 ,
            0.165 0.068 -0.08,
            0 0.068 -0.08,
            0 0.068 0 ]
        }

        coordIndex [
          0,1,2,3,-1,
          1,6,5,2,-1,
          4,5,6,7,-1,
          0,3,4,7,-1,
          5,4,3,2,-1,
          0,7,6,1,-1]}
    }

  ]
}

This file describes the model of the tea box corresponding to the next image:

img-teabox-cao.jpg
Index of the vertices used to model the tea box in vrml format.

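In this file the faces of the box are defined from the vertices: the point array lists the X Y Z coordinates of the eight vertices, and the coordIndex array lists, for each face, the indexes of its vertices followed by -1, which marks the end of the face.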
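
The way the coordIndex array encodes the faces can be illustrated by the following sketch, which splits the flat list of indexes into one list per face. The function splitCoordIndex is a hypothetical helper written for this illustration; it is not part of ViSP or Coin.

```cpp
#include <cassert>
#include <vector>

// Splits a VRML IndexedFaceSet coordIndex array into faces: each face is a
// run of vertex indexes terminated by -1.
std::vector<std::vector<int>> splitCoordIndex(const std::vector<int> &coordIndex)
{
  std::vector<std::vector<int>> faces;
  std::vector<int> current;
  for (int index : coordIndex) {
    if (index == -1) {
      if (!current.empty()) faces.push_back(current);
      current.clear();
    } else {
      assert(index >= 0); // -1 is the only valid negative value
      current.push_back(index);
    }
  }
  // The -1 after the last face may be omitted.
  if (!current.empty()) faces.push_back(current);
  return faces;
}
```

Applied to the coordIndex array of teabox.wrl, the function returns six faces of four vertices each, with the same indexes as the faces of teabox.cao.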