Visual Servoing Platform  version 3.6.1 under development (2024-08-14)
Tutorial: Homography estimation from points

# Introduction

This tutorial shows how to estimate a homography from points. Two cases are considered in the following sections: first, homography estimation from four matched points; then, robust homography estimation (Ransac or M-estimator) from keypoints matched between two images.

# Homography estimation

Note that all the material (source code) described in this tutorial is part of the ViSP source code (in the `tutorial/computer-vision` folder) and can be found at https://github.com/lagadic/visp/tree/master/tutorial/computer-vision.

Let us consider the following source code, also available in tutorial-homography-from-points.cpp. To summarize, we do the following:

• define four 3D points `oP` in an object frame,
• define a homogeneous transformation `aMo` from frame `a` to the object frame `o`,
• define a similar homogeneous transformation `bMo` from frame `b` to the object frame `o`,
• compute the coordinates of the four 3D points in the image planes `a` and `b`. These are the matched coordinates (xa,ya) and (xb,yb) that are then used to estimate the homography using either vpHomography::DLT() or vpHomography::HLM().
#include <visp3/core/vpConfig.h>
#include <visp3/vision/vpHomography.h>
#include <visp3/core/vpMeterPixelConversion.h>

int main()
{
#ifdef ENABLE_VISP_NAMESPACE
  using namespace VISP_NAMESPACE_NAME;
#endif
  double L = 0.1;
  std::vector<vpPoint> oP;
  oP.push_back(vpPoint(-L, -L, 0));
  oP.push_back(vpPoint(2 * L, -L, 0));
  oP.push_back(vpPoint(L, 3 * L, 0));
  oP.push_back(vpPoint(-L, 4 * L, 0));

  vpHomogeneousMatrix bMo(0.1, 0, 1, 0, vpMath::rad(15), 0);
  vpHomogeneousMatrix aMb(0.2, -0.1, 0.1, vpMath::rad(-3), vpMath::rad(20), vpMath::rad(5));
  vpHomogeneousMatrix aMo = aMb * bMo;

  std::vector<vpPoint> aP(4), bP(4);
  std::vector<double> xa(4), ya(4), xb(4), yb(4);
  for (unsigned int i = 0; i < 4; i++) {
    oP[i].project(aMo);
    xa[i] = oP[i].get_x();
    ya[i] = oP[i].get_y();
    oP[i].project(bMo);
    xb[i] = oP[i].get_x();
    yb[i] = oP[i].get_y();
  }

  vpHomography aHb;
  vpHomography::DLT(xb, yb, xa, ya, aHb, true);
  std::cout << "Estimated homography using DLT:\n" << aHb / aHb[2][2] << std::endl;
  vpHomography::HLM(xb, yb, xa, ya, true, aHb);
  std::cout << "Estimated homography using HLM:\n" << aHb / aHb[2][2] << std::endl;

  vpRotationMatrix aRb;
  vpTranslationVector atb;
  vpColVector n;
  aHb.computeDisplacement(aRb, atb, n);
  std::cout << "\nEstimated displacement:" << std::endl;
  std::cout << " atb: " << atb.t() << std::endl;
  vpThetaUVector atub;
  atub.build(aRb);
  std::cout << " athetaub: ";
  for (unsigned int i = 0; i < 3; i++)
    std::cout << vpMath::deg(atub[i]) << " ";
  std::cout << std::endl;
  std::cout << " n: " << n.t() << std::endl;

  vpCameraParameters cam(600, 600, 192, 144); // px, py, u0, v0
  vpImagePoint iPa, iPb;
  vpMeterPixelConversion::convertPoint(cam, xb[3], yb[3], iPb);
  vpMeterPixelConversion::convertPoint(cam, xa[3], ya[3], iPa);
  std::cout << "Ground truth: Point 3 in pixels in frame b: " << iPb << std::endl;
  std::cout << "Ground truth: Point 3 in pixels in frame a: " << iPa << std::endl;
  // Project the position in pixel of point 3 from the homography
  std::cout << "Estimation from homography: Point 3 in pixels in frame a: " << vpHomography::project(cam, aHb, iPb)
            << std::endl;
}

Now we give a line-by-line explanation of the code:

First we have to include the header of the vpHomography class.

#include <visp3/vision/vpHomography.h>

In the main() function we first define the 3D coordinates of four points located in the plane Z=0:

double L = 0.1;
std::vector<vpPoint> oP;
oP.push_back(vpPoint(-L, -L, 0));
oP.push_back(vpPoint(2 * L, -L, 0));
oP.push_back(vpPoint(L, 3 * L, 0));
oP.push_back(vpPoint(-L, 4 * L, 0));

Then we define the homogeneous transformations between frames `a`, `b` and the object frame `o`:

vpHomogeneousMatrix bMo(0.1, 0, 1, 0, vpMath::rad(15), 0);
vpHomogeneousMatrix aMb(0.2, -0.1, 0.1, vpMath::rad(-3), vpMath::rad(20), vpMath::rad(5));
vpHomogeneousMatrix aMo = aMb * bMo;

From these transformations we compute the coordinates of the points in the image planes `a` and `b`. For each point we have its coordinates (xa,ya) in frame `a` and (xb,yb) in frame `b`:

std::vector<vpPoint> aP(4), bP(4);
std::vector<double> xa(4), ya(4), xb(4), yb(4);
for (unsigned int i = 0; i < 4; i++) {
oP[i].project(aMo);
xa[i] = oP[i].get_x();
ya[i] = oP[i].get_y();
oP[i].project(bMo);
xb[i] = oP[i].get_x();
yb[i] = oP[i].get_y();
}

We now have matched pairs of coordinates of four points that are used to estimate a homography between image planes `a` and `b`. Two methods are available: the DLT (Direct Linear Transform) algorithm and the HLM algorithm.

vpHomography::DLT(xb, yb, xa, ya, aHb, true);
std::cout << "Estimated homography using DLT:\n" << aHb / aHb[2][2] << std::endl;
vpHomography::HLM(xb, yb, xa, ya, true, aHb);
std::cout << "Estimated homography using HLM:\n" << aHb / aHb[2][2] << std::endl;
Note
vpHomography::HLM() also makes it possible to handle points that are not coplanar, by setting its `isplanar` argument to false.

Once the homography is estimated, the vpHomography class allows the extraction of the 3D homogeneous transformation between frames `a` and `b`:

aHb.computeDisplacement(aRb, atb, n);

Just for fun, we print the values of this transformation using:

std::cout << "\nEstimated displacement:" << std::endl;
std::cout << " atb: " << atb.t() << std::endl;
atub.build(aRb);
std::cout << " athetaub: ";
for (unsigned int i = 0; i < 3; i++)
std::cout << vpMath::deg(atub[i]) << " ";
std::cout << std::endl;
std::cout << " n: " << n.t() << std::endl;

This code leads to the following output:

Estimated displacement:
atb: 0.2016519874 -0.1008259937 0.1008259937
athetaub: -3 20 5
n: 0.2588190451 -1.124100812e-14 0.9659258263

where we can see that the values for `atb` (up to a scale factor) and `athetaub` are the ones specified at the beginning of the source code during `aMb` initialization.

Next, we show how to retrieve the coordinates in pixels of a point (here point 3) in the corresponding images using the camera parameters:

vpImagePoint iPa, iPb;
vpMeterPixelConversion::convertPoint(cam, xb[3], yb[3], iPb);
vpMeterPixelConversion::convertPoint(cam, xa[3], ya[3], iPa);
std::cout << "Ground truth: Point 3 in pixels in frame b: " << iPb << std::endl;
std::cout << "Ground truth: Point 3 in pixels in frame a: " << iPa << std::endl;

At the end, we show how to project a point with pixel coordinates from image `b` to image `a` using the homography:

// Project the position in pixel of point 3 from the homography
std::cout << "Estimation from homography: Point 3 in pixels in frame a: " << vpHomography::project(cam, aHb, iPb)
<< std::endl;

This last part of the code produces the following output:

Ground truth: Point 3 in pixels in frame b: 377.9450564, 193.9928711
Ground truth: Point 3 in pixels in frame a: 353.8501593, 486.1851856
Estimation from homography: Point 3 in pixels in frame a: 353.8501593, 486.1851856

# Ransac or robust homography estimation

Note that all the material (source code) described in this section is part of the ViSP source code (in the `tutorial/detection/matching` folder) and can be found at https://github.com/lagadic/visp/tree/master/tutorial/detection/matching.

This section follows the Tutorial: Keypoints matching. It explains how to exploit pairs of matched points obtained with a keypoint detector (ORB in the code below) in order to estimate a homography that makes it possible to reject mismatched pairs of points. The homography is then used to track a postcard from its initial position in the reference image.

Let us consider the following source code also available in tutorial-matching-keypoint-homography.cpp.

#include <visp3/core/vpConfig.h>
#include <visp3/core/vpPixelMeterConversion.h>
#include <visp3/gui/vpDisplayOpenCV.h>
#include <visp3/io/vpVideoReader.h>
#include <visp3/vision/vpHomography.h>
#include <visp3/vision/vpKeyPoint.h>

int main(int argc, const char **argv)
{
#if defined(HAVE_OPENCV_HIGHGUI) && defined(HAVE_OPENCV_IMGPROC) && defined(HAVE_OPENCV_FEATURES2D)
#ifdef ENABLE_VISP_NAMESPACE
  using namespace VISP_NAMESPACE_NAME;
#endif
  int method = 0;
  if (argc > 1)
    method = atoi(argv[1]);
  if (method == 0)
    std::cout << "Uses Ransac to estimate the homography" << std::endl;
  else
    std::cout << "Uses a robust scheme to estimate the homography" << std::endl;

  vpImage<unsigned char> I, Idisp;

  vpVideoReader reader;
  reader.setFileName("video-postcard.mpeg"); // input video; adapt the name to your sequence
  reader.acquire(I); // the first image of the video is used as reference image

  const std::string detectorName = "ORB";
  const std::string extractorName = "ORB";
  // Hamming distance must be used with ORB
  const std::string matcherName = "BruteForce-Hamming";
  vpKeyPoint::vpFilterMatchingType filterType = vpKeyPoint::ratioDistanceThreshold;
  vpKeyPoint keypoint(detectorName, extractorName, matcherName, filterType);
  keypoint.buildReference(I);

  Idisp.resize(I.getHeight(), 2 * I.getWidth());
  Idisp.insert(I, vpImagePoint(0, 0));
  Idisp.insert(I, vpImagePoint(0, I.getWidth()));

  vpDisplayOpenCV d(Idisp, 0, 0, "Homography from matched keypoints");
  vpDisplay::display(Idisp);

  vpImagePoint corner_ref[4];
  corner_ref[0].set_ij(115, 64);
  corner_ref[1].set_ij(83, 253);
  corner_ref[2].set_ij(282, 307);
  corner_ref[3].set_ij(330, 72);
  for (unsigned int i = 0; i < 4; i++) {
    vpDisplay::displayCross(Idisp, corner_ref[i], 12, vpColor::red);
  }
  vpDisplay::flush(Idisp);

  vpCameraParameters cam(840, 840, I.getWidth() / 2, I.getHeight() / 2);

  vpHomography curHref;

  while (!reader.end()) {
    reader.acquire(I);
    Idisp.insert(I, vpImagePoint(0, I.getWidth()));
    vpDisplay::display(Idisp);

    unsigned int nbMatch = keypoint.matchPoint(I);
    std::cout << "Nb matches: " << nbMatch << std::endl;
    std::vector<vpImagePoint> iPref(nbMatch),
      iPcur(nbMatch); // Coordinates in pixels (for display only)
    std::vector<double> mPref_x(nbMatch), mPref_y(nbMatch);
    std::vector<double> mPcur_x(nbMatch), mPcur_y(nbMatch);
    std::vector<bool> inliers(nbMatch);
    for (unsigned int i = 0; i < nbMatch; i++) {
      keypoint.getMatchedPoints(i, iPref[i], iPcur[i]);
      vpPixelMeterConversion::convertPoint(cam, iPref[i], mPref_x[i], mPref_y[i]);
      vpPixelMeterConversion::convertPoint(cam, iPcur[i], mPcur_x[i], mPcur_y[i]);
    }
    try {
      double residual;
      if (method == 0)
        vpHomography::ransac(mPref_x, mPref_y, mPcur_x, mPcur_y, curHref, inliers, residual,
                             (unsigned int)(mPref_x.size() * 0.25), 2.0 / cam.get_px(), true);
      else
        vpHomography::robust(mPref_x, mPref_y, mPcur_x, mPcur_y, curHref, inliers, residual, 0.4, 4, true);
    }
    catch (...) {
      std::cout << "Cannot compute homography from matches..." << std::endl;
    }

    vpImagePoint corner_cur[4];
    for (int i = 0; i < 4; i++) {
      corner_cur[i] = vpHomography::project(cam, curHref, corner_ref[i]);
    }
    vpImagePoint offset(0, I.getWidth());
    for (int i = 0; i < 4; i++) {
      vpDisplay::displayLine(Idisp, corner_cur[i] + offset, corner_cur[(i + 1) % 4] + offset, vpColor::blue, 3);
    }
    for (unsigned int i = 0; i < nbMatch; i++) {
      if (inliers[i] == true)
        vpDisplay::displayLine(Idisp, iPref[i], iPcur[i] + offset, vpColor::green);
      else
        vpDisplay::displayLine(Idisp, iPref[i], iPcur[i] + offset, vpColor::red);
    }
    vpDisplay::flush(Idisp);
    if (vpDisplay::getClick(Idisp, false))
      break;
  }
#else
  (void)argc;
  (void)argv;
#endif
  return EXIT_SUCCESS;
}

The command line allows using either the Ransac algorithm or a robust M-estimator approach:

% ./tutorial-matching-keypoint-homography 0 // to run Ransac
% ./tutorial-matching-keypoint-homography 1 // to run the robust M-estimator

Hereafter is the resulting video. The left image is the reference image. The right images correspond to the successive images of the input video. The endpoints of the green lines are the points that are correctly matched and used in the homography estimation process, while the red lines join matched points that are rejected by the robust estimator.

Now, let us explain the new lines that were introduced to estimate the homography.

First we parse the command line arguments in order to use either Ransac or the robust M-estimator later:

int method = 0;
if (argc > 1)
method = atoi(argv[1]);
if (method == 0)
std::cout << "Uses Ransac to estimate the homography" << std::endl;
else
std::cout << "Uses a robust scheme to estimate the homography" << std::endl;

We also initialize the coordinates of the pixels in the reference image that correspond to the postcard corners. These coordinates were originally obtained from an initial user click. To simplify the code, we set the coordinates of the points directly:

vpImagePoint corner_ref[4];
corner_ref[0].set_ij(115, 64);
corner_ref[1].set_ij(83, 253);
corner_ref[2].set_ij(282, 307);
corner_ref[3].set_ij(330, 72);

Using these coordinates, we display red crosses at the postcard corners:

for (unsigned int i = 0; i < 4; i++) {
vpDisplay::displayCross(Idisp, corner_ref[i], 12, vpColor::red);
}

We also need to define, at least roughly, the parameters of our camera:

vpCameraParameters cam(840, 840, I.getWidth() / 2, I.getHeight() / 2);

For each new image, once points are matched using:

unsigned int nbMatch = keypoint.matchPoint(I);
std::cout << "Nb matches: " << nbMatch << std::endl;

We allocate new containers used in the homography estimation. The coordinates of the points in the reference image will be stored in (mPref_x, mPref_y), while the corresponding coordinates in the current image will be stored in (mPcur_x, mPcur_y). We also allocate `inliers`, a vector of booleans that indicates whether each pair of points is an inlier or an outlier:

std::vector<double> mPref_x(nbMatch), mPref_y(nbMatch);
std::vector<double> mPcur_x(nbMatch), mPcur_y(nbMatch);
std::vector<bool> inliers(nbMatch);

To estimate the homography we first need to convert the points from pixel coordinates to meters:

vpPixelMeterConversion::convertPoint(cam, iPref[i], mPref_x[i], mPref_y[i]);
vpPixelMeterConversion::convertPoint(cam, iPcur[i], mPcur_x[i], mPcur_y[i]);

We can now estimate the homography using either Ransac or the robust M-estimator approach:

try {
double residual;
if (method == 0)
vpHomography::ransac(mPref_x, mPref_y, mPcur_x, mPcur_y, curHref, inliers, residual,
(unsigned int)(mPref_x.size() * 0.25), 2.0 / cam.get_px(), true);
else
vpHomography::robust(mPref_x, mPref_y, mPcur_x, mPcur_y, curHref, inliers, residual, 0.4, 4, true);
}
catch (...) {
std::cout << "Cannot compute homography from matches..." << std::endl;
}

For Ransac we consider that at least 25 percent of the points should be inliers (`mPref_x.size() * 0.25`) to reach a consensus, and that a pair of points is considered an inlier if the reprojection error is lower than 2 pixels (`2.0 / cam.get_px()`).

Then, using the homography, we project the coordinates of the postcard corners into the current image:

vpImagePoint corner_cur[4];
for (int i = 0; i < 4; i++) {
corner_cur[i] = vpHomography::project(cam, curHref, corner_ref[i]);
}

We use these coordinates to draw blue lines around the postcard:

vpImagePoint offset(0, I.getWidth());
for (int i = 0; i < 4; i++) {
vpDisplay::displayLine(Idisp, corner_cur[i] + offset, corner_cur[(i + 1) % 4] + offset, vpColor::blue, 3);
}

Since the homography estimation updates the status of each pair of matched points as inlier or outlier, we can draw a green line between the matched points when they are inliers, or a red one when they are outliers:

for (unsigned int i = 0; i < nbMatch; i++) {
if (inliers[i] == true)
vpDisplay::displayLine(Idisp, iPref[i], iPcur[i] + offset, vpColor::green);
else
vpDisplay::displayLine(Idisp, iPref[i], iPcur[i] + offset, vpColor::red);
}