Visual Servoing Platform  version 3.3.0 under development (2020-02-17)
Tutorial: Homography estimation from points

Introduction

This tutorial shows how to estimate an homography from points. Two cases are considered in the next sections:

Homography estimation

Note that all the material (source code) described in this section is part of ViSP source code and could be downloaded using the following command:

$ svn export https://github.com/lagadic/visp.git/trunk/tutorial/computer-vision

Let us consider the following source code also available in tutorial-homography-from-points.cpp. To resume, we do the following:

  • define four 3D points oP in an object frame,
  • define an homogeneous transformation aMo from frame a to to the object frame o,
  • define a similar homogeneous transformation bMo from frame b to to the object frame o,
  • compute the coordinates of the four 3D points in the image plane a and b. These are the matched coordinates (xa,ya) and (xb,yb) that are then used to estimate the homography using either vpHomography::DLT() or vpHomography::HLM().
#include <visp3/vision/vpHomography.h>
#include <visp3/core/vpMeterPixelConversion.h>
int main()
{
double L = 0.1;
std::vector<vpPoint> oP;
oP.push_back(vpPoint(-L, -L, 0));
oP.push_back(vpPoint(2 * L, -L, 0));
oP.push_back(vpPoint(L, 3 * L, 0));
oP.push_back(vpPoint(-L, 4 * L, 0));
vpHomogeneousMatrix bMo(0.1, 0, 1, 0, vpMath::rad(15), 0);
vpHomogeneousMatrix aMb(0.2, -0.1, 0.1, vpMath::rad(-3), vpMath::rad(20), vpMath::rad(5));
vpHomogeneousMatrix aMo = aMb * bMo;
std::vector<vpPoint> aP(4), bP(4);
std::vector<double> xa(4), ya(4), xb(4), yb(4);
for (unsigned int i = 0; i < 4; i++) {
oP[i].project(aMo);
xa[i] = oP[i].get_x();
ya[i] = oP[i].get_y();
oP[i].project(bMo);
xb[i] = oP[i].get_x();
yb[i] = oP[i].get_y();
}
vpHomography::DLT(xb, yb, xa, ya, aHb, true);
std::cout << "Estimated homography using DLT:\n" << aHb / aHb[2][2] << std::endl;
vpHomography::HLM(xb, yb, xa, ya, true, aHb);
std::cout << "Estimated homography using HLM:\n" << aHb / aHb[2][2] << std::endl;
vpColVector n;
aHb.computeDisplacement(aRb, atb, n);
std::cout << "\nEstimated displacement:" << std::endl;
std::cout << " atb: " << atb.t() << std::endl;
atub.buildFrom(aRb);
std::cout << " athetaub: ";
for (unsigned int i = 0; i < 3; i++)
std::cout << vpMath::deg(atub[i]) << " ";
std::cout << std::endl;
std::cout << " n: " << n.t() << std::endl;
vpImagePoint iPa, iPb;
vpMeterPixelConversion::convertPoint(cam, xb[3], yb[3], iPb);
vpMeterPixelConversion::convertPoint(cam, xa[3], ya[3], iPa);
std::cout << "Ground truth: Point 3 in pixels in frame b: " << iPb << std::endl;
std::cout << "Ground truth: Point 3 in pixels in frame a: " << iPa << std::endl;
// Project the position in pixel of point 3 from the homography
std::cout << "Estimation from homography: Point 3 in pixels in frame a: " << vpHomography::project(cam, aHb, iPb)
<< std::endl;
}

Now we give a line by line explanation of the code:

First we have to include the header of the vpHomography class.

#include <visp3/vision/vpHomography.h>

In the main() function we first define the 3D coordinates of 4 points that are localized in the plane Z=0:

double L = 0.1;
std::vector<vpPoint> oP;
oP.push_back(vpPoint(-L, -L, 0));
oP.push_back(vpPoint(2 * L, -L, 0));
oP.push_back(vpPoint(L, 3 * L, 0));
oP.push_back(vpPoint(-L, 4 * L, 0));

Then we define the homogeneous transformations between frames a, b and object frame o:

vpHomogeneousMatrix bMo(0.1, 0, 1, 0, vpMath::rad(15), 0);
vpHomogeneousMatrix aMb(0.2, -0.1, 0.1, vpMath::rad(-3), vpMath::rad(20), vpMath::rad(5));
vpHomogeneousMatrix aMo = aMb * bMo;

From these transformations we compute the coordinates of the points in the image plane a and b. For each point we have its coordinates (xa,ya) in frame a and (xb,yb) in frame b:

std::vector<vpPoint> aP(4), bP(4);
std::vector<double> xa(4), ya(4), xb(4), yb(4);
for (unsigned int i = 0; i < 4; i++) {
oP[i].project(aMo);
xa[i] = oP[i].get_x();
ya[i] = oP[i].get_y();
oP[i].project(bMo);
xb[i] = oP[i].get_x();
yb[i] = oP[i].get_y();
}

We have now matched couples of coordinates of four points that are used to estimate an homography between image plane a and b. Two methods are available, either using the DLT (Direct Linear Transform) algorithm or the HLM algorithm.

vpHomography::DLT(xb, yb, xa, ya, aHb, true);
std::cout << "Estimated homography using DLT:\n" << aHb / aHb[2][2] << std::endl;
vpHomography::HLM(xb, yb, xa, ya, true, aHb);
std::cout << "Estimated homography using HLM:\n" << aHb / aHb[2][2] << std::endl;
Note
Note that vpHomography::HLM() allows to consider points that are not coplanar.

Once the homography is estimated, the vpHomography class allows to extract the 3D homogeneous transformation between frames a and b:

vpColVector n;
aHb.computeDisplacement(aRb, atb, n);

Just for fun we print the values to this transformation using:

std::cout << "\nEstimated displacement:" << std::endl;
std::cout << " atb: " << atb.t() << std::endl;
atub.buildFrom(aRb);
std::cout << " athetaub: ";
for (unsigned int i = 0; i < 3; i++)
std::cout << vpMath::deg(atub[i]) << " ";
std::cout << std::endl;
std::cout << " n: " << n.t() << std::endl;

This code lead to the following output:

Estimated displacement:
atb: 0.2016519874 -0.1008259937 0.1008259937
athetaub: -3 20 5
n: 0.2588190451 -1.124100812e-14 0.9659258263

where we can see that the values for atb and athetaub are the one specified at the beginning of the source code during aMb initialization.

After we show how to retrieve the coordinates in pixels of a point (here point [3]) in the corresponding images using camera parameters:

vpImagePoint iPa, iPb;
vpMeterPixelConversion::convertPoint(cam, xb[3], yb[3], iPb);
vpMeterPixelConversion::convertPoint(cam, xa[3], ya[3], iPa);
std::cout << "Ground truth: Point 3 in pixels in frame b: " << iPb << std::endl;
std::cout << "Ground truth: Point 3 in pixels in frame a: " << iPa << std::endl;

At the end, we show how to project a point with pixel coordinates from image b to image a using the homography:

// Project the position in pixel of point 3 from the homography
std::cout << "Estimation from homography: Point 3 in pixels in frame a: " << vpHomography::project(cam, aHb, iPb)
<< std::endl;

This last part of the code produce the following output:

Ground truth: Point 3 in pixels in frame b: 377.9450564, 193.9928711
Ground truth: Point 3 in pixels in frame a: 353.8501593, 486.1851856
Estimation from homography: Point 3 in pixels in frame a: 353.8501593, 486.1851856

Ransac or robust homography estimation

Note that all the material (source code) described in this section is part of ViSP source code and could be downloaded using the following command:

$ svn export https://github.com/lagadic/visp.git/trunk/tutorial/detection/matching

This section follows the Tutorial: Keypoints matching. It explains how to exploit couples of matched points obtained using SURF detector in order to estimate an homography that allows to reject mismatched couples of points. The homography is then used to track a postcard from its initial position in the reference image.

Let us consider the following source code also available in tutorial-matching-keypoint-homography.cpp.

#include <visp3/core/vpPixelMeterConversion.h>
#include <visp3/gui/vpDisplayOpenCV.h>
#include <visp3/io/vpVideoReader.h>
#include <visp3/vision/vpHomography.h>
#include <visp3/vision/vpKeyPoint.h>
int main(int argc, const char **argv)
{
#if (VISP_HAVE_OPENCV_VERSION >= 0x020101)
int method = 0;
if (argc > 1)
method = atoi(argv[1]);
if (method == 0)
std::cout << "Uses Ransac to estimate the homography" << std::endl;
else
std::cout << "Uses a robust scheme to estimate the homography" << std::endl;
vpVideoReader reader;
reader.setFileName("video-postcard.mpeg");
reader.acquire(I);
const std::string detectorName = "ORB";
const std::string extractorName = "ORB";
// Hamming distance must be used with ORB
const std::string matcherName = "BruteForce-Hamming";
vpKeyPoint keypoint(detectorName, extractorName, matcherName, filterType);
keypoint.buildReference(I);
Idisp.resize(I.getHeight(), 2 * I.getWidth());
Idisp.insert(I, vpImagePoint(0, 0));
Idisp.insert(I, vpImagePoint(0, I.getWidth()));
vpDisplayOpenCV d(Idisp, 0, 0, "Homography from matched keypoints");
vpImagePoint corner_ref[4];
corner_ref[0].set_ij(115, 64);
corner_ref[1].set_ij(83, 253);
corner_ref[2].set_ij(282, 307);
corner_ref[3].set_ij(330, 72);
for (unsigned int i = 0; i < 4; i++) {
vpDisplay::displayCross(Idisp, corner_ref[i], 12, vpColor::red);
}
vpCameraParameters cam(840, 840, I.getWidth() / 2, I.getHeight() / 2);
vpHomography curHref;
while (!reader.end()) {
reader.acquire(I);
Idisp.insert(I, vpImagePoint(0, I.getWidth()));
unsigned int nbMatch = keypoint.matchPoint(I);
std::cout << "Nb matches: " << nbMatch << std::endl;
std::vector<vpImagePoint> iPref(nbMatch),
iPcur(nbMatch); // Coordinates in pixels (for display only)
std::vector<double> mPref_x(nbMatch), mPref_y(nbMatch);
std::vector<double> mPcur_x(nbMatch), mPcur_y(nbMatch);
std::vector<bool> inliers(nbMatch);
for (unsigned int i = 0; i < nbMatch; i++) {
keypoint.getMatchedPoints(i, iPref[i], iPcur[i]);
vpPixelMeterConversion::convertPoint(cam, iPref[i], mPref_x[i], mPref_y[i]);
vpPixelMeterConversion::convertPoint(cam, iPcur[i], mPcur_x[i], mPcur_y[i]);
}
try {
double residual;
if (method == 0)
vpHomography::ransac(mPref_x, mPref_y, mPcur_x, mPcur_y, curHref, inliers, residual,
(unsigned int)(mPref_x.size() * 0.25), 2.0 / cam.get_px(), true);
else
vpHomography::robust(mPref_x, mPref_y, mPcur_x, mPcur_y, curHref, inliers, residual, 0.4, 4, true);
} catch (...) {
std::cout << "Cannot compute homography from matches..." << std::endl;
}
vpImagePoint corner_cur[4];
for (int i = 0; i < 4; i++) {
corner_cur[i] = vpHomography::project(cam, curHref, corner_ref[i]);
}
vpImagePoint offset(0, I.getWidth());
for (int i = 0; i < 4; i++) {
vpDisplay::displayLine(Idisp, corner_cur[i] + offset, corner_cur[(i + 1) % 4] + offset, vpColor::blue, 3);
}
for (unsigned int i = 0; i < nbMatch; i++) {
if (inliers[i] == true)
vpDisplay::displayLine(Idisp, iPref[i], iPcur[i] + offset, vpColor::green);
else
vpDisplay::displayLine(Idisp, iPref[i], iPcur[i] + offset, vpColor::red);
}
if (vpDisplay::getClick(Idisp, false))
break;
}
#else
(void)argc;
(void)argv;
#endif
return 0;
}

The command line allows to use either Ransac algorithm of a robust M-estimator approach:

% ./tutorial-matching-keypoint-homography 0 // to run Ransac
% ./tutorial-matching-keypoint-homography 1 // to run the robust M-estimator

Here after is the resulting video. The left image represents the reference image. The right images correspond to the successive images of the input video. All the green lines extremities represent the points that are well matched and used in the homography estimation process. All the red lines represent couple of matched points that are rejected by the robust estimator.

Now, let us explain the new lines that were introduced to estimate the homography.

First we detect the command line arguments to be able later to user either Ransac or the robust M-estimator:

int method = 0;
if (argc > 1)
method = atoi(argv[1]);
if (method == 0)
std::cout << "Uses Ransac to estimate the homography" << std::endl;
else
std::cout << "Uses a robust scheme to estimate the homography" << std::endl;

We also initialize the coordinates of the pixels in the reference image that correspond to the postcard corners. These coordinates were obtained after a user initial click. To simplify the code, we set directly the coordinates of the points:

vpImagePoint corner_ref[4];
corner_ref[0].set_ij(115, 64);
corner_ref[1].set_ij(83, 253);
corner_ref[2].set_ij(282, 307);
corner_ref[3].set_ij(330, 72);

Using these coordinates, we display red lines around the postcard:

for (unsigned int i = 0; i < 4; i++) {
vpDisplay::displayCross(Idisp, corner_ref[i], 12, vpColor::red);
}

We need also to define roughly the parameters of our camera:

vpCameraParameters cam(840, 840, I.getWidth() / 2, I.getHeight() / 2);

For each new image, once points are matched using:

unsigned int nbMatch = keypoint.matchPoint(I);
std::cout << "Nb matches: " << nbMatch << std::endl;

We allocate new containers useful for the homography estimation. The coordinates of the points in the reference image will be stored in (mPref_x, mPref_y), while the corresponding coordinates in the current image will be stored in (mPcur_x, mPcur_y). We also allocate inliers a vector of boolean that will indicate if a couple of point is an inlier or an outlier:

std::vector<double> mPref_x(nbMatch), mPref_y(nbMatch);
std::vector<double> mPcur_x(nbMatch), mPcur_y(nbMatch);
std::vector<bool> inliers(nbMatch);

To estimate the homography we need first to convert the points from pixel to meters:

vpPixelMeterConversion::convertPoint(cam, iPref[i], mPref_x[i], mPref_y[i]);
vpPixelMeterConversion::convertPoint(cam, iPcur[i], mPcur_x[i], mPcur_y[i]);

We can now estimate the homography using either Ransac or the robust M-estimator approach:

try {
double residual;
if (method == 0)
vpHomography::ransac(mPref_x, mPref_y, mPcur_x, mPcur_y, curHref, inliers, residual,
(unsigned int)(mPref_x.size() * 0.25), 2.0 / cam.get_px(), true);
else
vpHomography::robust(mPref_x, mPref_y, mPcur_x, mPcur_y, curHref, inliers, residual, 0.4, 4, true);
} catch (...) {
std::cout << "Cannot compute homography from matches..." << std::endl;
}

For Ransac we consider that at least 50 percent of the points should be inliers (mPref_x.size()/2) to reach a consensus and that a couple of point is stated as an inlier if the reprojection error is lower than 2 pixels (2.0/cam.get_px()).

Then using the homography, we project the coordinates of the postcard corners in the current image:

vpImagePoint corner_cur[4];
for (int i = 0; i < 4; i++) {
corner_cur[i] = vpHomography::project(cam, curHref, corner_ref[i]);
}

We use these coordinates to draw blue lines arround the postcard:

vpImagePoint offset(0, I.getWidth());
for (int i = 0; i < 4; i++) {
vpDisplay::displayLine(Idisp, corner_cur[i] + offset, corner_cur[(i + 1) % 4] + offset, vpColor::blue, 3);
}

Since the homography estimation updates the status of the couples of matched points as inliers or outliers, between the matched points we are able to draw green lines when they are inliers, or red lines when they are outliers.

for (unsigned int i = 0; i < nbMatch; i++) {
if (inliers[i] == true)
vpDisplay::displayLine(Idisp, iPref[i], iPcur[i] + offset, vpColor::green);
else
vpDisplay::displayLine(Idisp, iPref[i], iPcur[i] + offset, vpColor::red);
}