1. Introduction
This tutorial shows how to detect one or more AprilTag or ArUco markers with ViSP. To this end, we provide the vpDetectorAprilTag class, a wrapper over the AprilTag third-party library. Note that there is no need to install this third party, since the AprilTag source code is embedded in ViSP.
The vpDetectorAprilTag class inherits from vpDetectorBase, a generic class dedicated to detection. For each detected tag, it allows retrieving characteristics such as the tag id and, in the image, the polygon that contains the tag (its 4 corner coordinates), the bounding box, and the center of gravity of the tag.
Moreover, vpDetectorAprilTag class allows estimating the 3D pose of the tag. To this end, the camera parameters as well as the size of the tag are required.
- Note
- The vpDetectorAprilTag class can also decode tags from the following ArUco families (similar to OpenCV):
-
The internal fiducial marker detection method is still based on the AprilTag [25] method. This means that detecting ArUco markers benefits from the precision of the AprilTag corner location extraction, at the cost of a more expensive computation time and a narrower detection range compared to the ArUco method.
-
You can use vpDetectorAprilTag::vpAprilTagFamily::TAG_CUSTOM48h12 (you will need to enable the WITH_APRILTAG_BIG_FAMILY CMake variable) for recursive tags, that is a tag inside another one. This allows you, for instance, to detect a marker at a great distance and servo your robot until close range.
- Warning
- The size of the tag corresponds to the size of the black square containing the tag. To be usable with ViSP, tags must be printed with a white border. This white border should be as wide as the black border, as in the next image:
In the next sections you will find examples that show how to detect tags in a single image or in images acquired from a camera connected to your computer.
Note that all the material (source code and image) described in this tutorial is part of the ViSP source code (in the tutorial/detection/tag folder) and can be found at https://github.com/lagadic/visp/tree/master/tutorial/detection/tag.
2. Print an AprilTag marker
2.1. Existing web app
To quickly generate an AprilTag or an ArUco marker you may use https://chev.me/arucogen/.
2.2. Ready to print 36h11 tag
We also provide a ready-to-print 36h11 tag that is a 9.5 by 9.5 cm square [download].
2.3. Download pre-generated tag families
If you prefer, you can also directly download some pre-generated tag families from the apriltag-imgs GitHub repository. You will then need to scale the images as described below.
Another resource is the AprilTag website for:
In each archive you will find a PNG image of each tag, a PNG mosaic containing every tag, and a ready-to-print PostScript file with one tag per page. If you want to print an individual tag, you can manually scale the corresponding PNG image in one of two ways:
- on Unix with ImageMagick, e.g.:
$ convert tag36_11_00000.png -scale 5000% tag36_11_00000_big.png
- or open the image with Gimp:
- then from the pulldown menu, select Image > Scale Image
- set the unit and the size
- set the Interpolation mode to None
- click on the Scale button
- From the pulldown menu, select File > Export As
- Save the image as a new PNG image, e.g., /tmp/tag36_11_00000-rescaled.png
- Send the PNG file to your printer
2.4. Create your own tag image
We also provide the tutorial-create-tag-image binary, corresponding to the tutorial-create-tag-image.cpp source code, that allows creating an image of a given tag family.
$ cd ${VISP_WS}/visp-build/tutorial/detection/tag
$ ./tutorial-create-tag-image --output tag-36h11.png --tag-family 0 --tag-id 100 --tag-size 400 --display
Create marker
Family: 36h11
Id : 100
Size : 400 x 400 pixels
Saved in: tag-36h11.png
Click in the image to quit...
will create the following tag-36h11.png image, whose size is 400 x 400 pixels and which contains a 36h11 tag with id 100 surrounded by a white border.
36h11 tag with Id 100 generated by ViSP tutorial-create-tag-image.cpp code.
3. AprilTag detection and pose estimation
3.1. Process single image
3.1.1. Use case
Once built, in the "${VISP_WS}/visp-build/tutorial/detection/tag/" folder you will find the tutorial-apriltag-detector binary corresponding to the tutorial-apriltag-detector.cpp source code.
This tutorial detects a given tag family on a single image.
The default behavior is to detect 36h11 markers in the AprilTag.jpg image, but the "--tag-family <family>" option allows considering other tag families and the "--input <filename>" option allows processing another image. To see all the options, just run:
$ ./tutorial-apriltag-detector --help
To detect the multiple 36h11 tags in the provided AprilTag.jpg image, just run:
$ ./tutorial-apriltag-detector
You will get the following result:
After a user click in the image, you will get the following image where the frames correspond to the 3D pose of each tag.
3.1.2. Code explained
Now we explain the main lines of the source code available in tutorial-apriltag-detector.cpp.
#include <visp3/core/vpConfig.h>
#include <visp3/detection/vpDetectorAprilTag.h>
#include <visp3/core/vpXmlParserCamera.h>
#include <visp3/gui/vpDisplayFactory.h>
#include <visp3/io/vpImageIo.h>
void usage(const char **argv, int error);
void usage(const char **argv, int error)
{
std::cout << "Synopsis" << std::endl
<< " " << argv[0]
<< " [--input <filename>]"
<< " [--tag-size <size>]"
<< " [--tag-family <family>]"
<< " [--tag-decision-margin-threshold <threshold>]"
<< " [--tag-hamming-distance-threshold <threshold>]"
<< " [--tag-quad-decimate <factor>]"
<< " [--tag-n-threads <number>]"
<< " [--tag-z-aligned]"
<< " [--tag-pose-method <method>]"
#if defined(VISP_HAVE_PUGIXML)
<< " [--intrinsic <xmlfile>]"
<< " [--camera <name>]"
#endif
#if defined(VISP_HAVE_DISPLAY)
<< " [--display-tag]"
<< " [--color <id>]"
<< " [--thickness <thickness>"
#endif
<< " [--help, -h]" << std::endl
<< std::endl;
std::cout << "Description" << std::endl
<< " Detect AprilTags in an image and compute their corresponding pose." << std::endl
<< std::endl
<< " --input <filename>" << std::endl
<< " Image filename to process." << std::endl
<< " Default: AprilTag.jpg" << std::endl
<< std::endl
<< " --tag-size <size>" << std::endl
<< " Apriltag size in [m]." << std::endl
<< " Default: 0.03" << std::endl
<< std::endl
<< " --tag-family <family>" << std::endl
<< " Apriltag family. Supported values are:" << std::endl
<< " 0: TAG_36h11" << std::endl
<< " 1: TAG_36h10 (DEPRECATED)" << std::endl
<< " 2: TAG_36ARTOOLKIT (DEPRECATED)" << std::endl
<< " 3: TAG_25h9" << std::endl
<< " 4: TAG_25h7 (DEPRECATED)" << std::endl
<< " 5: TAG_16h5" << std::endl
<< " 6: TAG_CIRCLE21h7" << std::endl
<< " 7: TAG_CIRCLE49h12" << std::endl
<< " 8: TAG_CUSTOM48h12" << std::endl
<< " 9: TAG_STANDARD41h12" << std::endl
<< " 10: TAG_STANDARD52h13" << std::endl
<< " 11: TAG_ARUCO_4x4_50" << std::endl
<< " 12: TAG_ARUCO_4x4_100" << std::endl
<< " 13: TAG_ARUCO_4x4_250" << std::endl
<< " 14: TAG_ARUCO_4x4_1000" << std::endl
<< " 15: TAG_ARUCO_5x5_50" << std::endl
<< " 16: TAG_ARUCO_5x5_100" << std::endl
<< " 17: TAG_ARUCO_5x5_250" << std::endl
<< " 18: TAG_ARUCO_5x5_1000" << std::endl
<< " 19: TAG_ARUCO_6x6_50" << std::endl
<< " 20: TAG_ARUCO_6x6_100" << std::endl
<< " 21: TAG_ARUCO_6x6_250" << std::endl
<< " 22: TAG_ARUCO_6x6_1000" << std::endl
<< " 23: TAG_ARUCO_7x7_50" << std::endl
<< " 24: TAG_ARUCO_7x7_100" << std::endl
<< " 25: TAG_ARUCO_7x7_250" << std::endl
<< " 26: TAG_ARUCO_7x7_1000" << std::endl
<< " 27: TAG_ARUCO_MIP_36h12" << std::endl
<< " Default: 0 (36h11)" << std::endl
<< std::endl
<< " --tag-decision-margin-threshold <threshold>" << std::endl
<< " Threshold used to discard low-confident detections. A typical value is " << std::endl
<< " around 100. The higher this value, the more false positives will be filtered" << std::endl
<< " out. When this value is set to -1, false positives are not filtered out." << std::endl
<< " Default: 50" << std::endl
<< std::endl
<< " --tag-hamming-distance-threshold <threshold>" << std::endl
<< " Threshold used to discard low-confident detections with corrected bits." << std::endl
<< " A typical value is between 0 and 3. The lower this value, the more false" << std::endl
<< " positives will be filtered out." << std::endl
<< " Default: 0" << std::endl
<< std::endl
<< " --tag-quad-decimate <factor>" << std::endl
<< " Decimation factor used to detect a tag. " << std::endl
<< " Default: 1" << std::endl
<< std::endl
<< " --tag-n-threads <number>" << std::endl
<< " Number of threads used to detect a tag." << std::endl
<< " Default: 1" << std::endl
<< std::endl
<< " --tag-z-aligned" << std::endl
<< " When enabled, tag z-axis and camera z-axis are aligned." << std::endl
<< " Default: false" << std::endl
<< std::endl
<< " --tag-pose-method <method>" << std::endl
<< " Algorithm used to compute the tag pose from its 4 corners." << std::endl
<< " Possible values are:" << std::endl
<< " 0: HOMOGRAPHY" << std::endl
<< " 1: HOMOGRAPHY_VIRTUAL_VS" << std::endl
<< " 2: DEMENTHON_VIRTUAL_VS" << std::endl
<< " 3: LAGRANGE_VIRTUAL_VS" << std::endl
<< " 4: BEST_RESIDUAL_VIRTUAL_VS" << std::endl
<< " 5: HOMOGRAPHY_ORTHOGONAL_ITERATION" << std::endl
<< " Default: 1 (HOMOGRAPHY_VIRTUAL_VS)" << std::endl
<< std::endl
#if defined(VISP_HAVE_PUGIXML)
<< " --intrinsic <xmlfile>" << std::endl
<< " Camera intrinsic parameters file in xml format." << std::endl
<< " Default: empty" << std::endl
<< std::endl
<< " --camera <name>" << std::endl
<< " Camera name in the intrinsic parameters file in xml format." << std::endl
<< " Default: empty" << std::endl
<< std::endl
#endif
#if defined(VISP_HAVE_DISPLAY)
<< " --display-tag" << std::endl
<< " Flag used to enable displaying the edges of a tag." << std::endl
<< " Default: disabled" << std::endl
<< std::endl
<< " --color <id>" << std::endl
<< " Color id used to display the frame over each tag." << std::endl
<< " Possible values are:" << std::endl
<< " -1: R-G-B colors for X, Y, Z axis respectively" << std::endl
<< " 0: all axis in black" << std::endl
<< " 1: all axis in white" << std::endl
<< " ..." << std::endl
<< " Default: -1" << std::endl
<< std::endl
<< " --thickness <thickness>" << std::endl
<< " Thickness of the drawings in overlay." << std::endl
<< " Default: 2" << std::endl
#endif
<< std::endl
<< " --help, -h" << std::endl
<< " Print this helper message." << std::endl
<< std::endl;
if (error) {
std::cout << "Error" << std::endl
<< " "
<< "Unsupported parameter " << argv[error] << std::endl;
}
}
int main(int argc, const char **argv)
{
#if defined(VISP_HAVE_APRILTAG) && defined(VISP_HAVE_DISPLAY)
#ifdef ENABLE_VISP_NAMESPACE
#endif
std::string opt_input_filename = "AprilTag.jpg";
double opt_tag_size = 0.053;
float opt_tag_quad_decimate = 1.0;
float opt_tag_decision_margin_threshold = 50;
int opt_tag_hamming_distance_threshold = 2;
int opt_tag_nThreads = 1;
std::string opt_intrinsic_file = "";
std::string opt_camera_name = "";
bool opt_display_tag = false;
int opt_color_id = -1;
unsigned int opt_thickness = 2;
bool opt_tag_z_align_frame = false;
for (int i = 1; i < argc; ++i) {
if (std::string(argv[i]) == "--input" && i + 1 < argc) {
opt_input_filename = std::string(argv[++i]);
}
else if (std::string(argv[i]) == "--tag-size" && i + 1 < argc) {
opt_tag_size = atof(argv[++i]);
}
else if (std::string(argv[i]) == "--tag-family" && i + 1 < argc) {
}
else if (std::string(argv[i]) == "--tag-quad-decimate" && i + 1 < argc) {
opt_tag_quad_decimate = static_cast<float>(atof(argv[++i]));
}
else if (std::string(argv[i]) == "--tag-n-threads" && i + 1 < argc) {
opt_tag_nThreads = atoi(argv[++i]);
}
else if (std::string(argv[i]) == "--tag-z-aligned") {
opt_tag_z_align_frame = true;
}
else if (std::string(argv[i]) == "--tag-pose-method" && i + 1 < argc) {
}
else if (std::string(argv[i]) == "--tag-decision-margin-threshold" && i + 1 < argc) {
opt_tag_decision_margin_threshold = static_cast<float>(atof(argv[++i]));
}
else if (std::string(argv[i]) == "--tag-hamming-distance-threshold" && i + 1 < argc) {
opt_tag_hamming_distance_threshold = atoi(argv[++i]);
}
#if defined(VISP_HAVE_PUGIXML)
else if (std::string(argv[i]) == "--intrinsic" && i + 1 < argc) {
opt_intrinsic_file = std::string(argv[++i]);
}
else if (std::string(argv[i]) == "--camera-name" && i + 1 < argc) {
opt_camera_name = std::string(argv[++i]);
}
#endif
#if defined(VISP_HAVE_DISPLAY)
else if (std::string(argv[i]) == "--display-tag") {
opt_display_tag = true;
}
else if (std::string(argv[i]) == "--color" && i + 1 < argc) {
opt_color_id = atoi(argv[++i]);
}
else if (std::string(argv[i]) == "--thickness" && i + 1 < argc) {
opt_thickness = static_cast<unsigned int>(atoi(argv[++i]));
}
#endif
else if (std::string(argv[i]) == "--help" || std::string(argv[i]) == "-h") {
usage(argv, 0);
return EXIT_SUCCESS;
}
else {
usage(argv, i);
return EXIT_FAILURE;
}
}
std::cout << "Input data" << std::endl;
std::cout << " Image : " << opt_input_filename << std::endl;
cam.initPersProjWithoutDistortion(615.1674805, 615.1675415, 312.1889954, 243.4373779);
#if defined(VISP_HAVE_PUGIXML)
if (!opt_intrinsic_file.empty() && !opt_camera_name.empty()) {
std::cout << " Intrinsics : " << opt_intrinsic_file << std::endl << std::endl;
}
else {
std::cout << " Intrinsics : default" << std::endl << std::endl;
}
#else
std::cout << " Intrinsics : default" << std::endl << std::endl;
#endif
std::cout << cam << std::endl;
std::cout << "Tag detector settings" << std::endl;
std::cout << " Tag size [m] : " << opt_tag_size << std::endl;
std::cout << " Tag family : " << opt_tag_family << std::endl;
std::cout << " Quad decimate : " << opt_tag_quad_decimate << std::endl;
std::cout << " Decision margin threshold : " << opt_tag_decision_margin_threshold << std::endl;
std::cout << " Hamming distance threshold: " << opt_tag_hamming_distance_threshold << std::endl;
std::cout << " Num threads : " << opt_tag_nThreads << std::endl;
std::cout << " Z aligned : " << opt_tag_z_align_frame << std::endl;
std::cout << " Pose estimation : " << opt_tag_pose_estimation_method << std::endl;
#if (VISP_CXX_STANDARD >= VISP_CXX_STANDARD_11)
std::shared_ptr<vpDisplay> display, display2;
#else
#endif
try {
#if (VISP_CXX_STANDARD >= VISP_CXX_STANDARD_11)
#else
#endif
detector.setAprilTagQuadDecimate(opt_tag_quad_decimate);
detector.setAprilTagPoseEstimationMethod(opt_tag_pose_estimation_method);
detector.setAprilTagNbThreads(opt_tag_nThreads);
detector.setZAlignedWithCameraAxis(opt_tag_z_align_frame);
detector.setAprilTagDecisionMarginThreshold(opt_tag_decision_margin_threshold);
detector.setAprilTagHammingDistanceThreshold(opt_tag_hamming_distance_threshold);
std::vector<vpHomogeneousMatrix> cMo_vec;
detector.detect(I, opt_tag_size, cam, cMo_vec);
std::stringstream ss;
ss << "Detection time: " << t << " ms for " << detector.getNbObjects() << " tags";
std::vector<int> tag_ids = detector.getTagsId();
std::vector<float> tag_decision_margins = detector.getTagsDecisionMargin();
std::vector<int> tag_hamming_distances = detector.getTagsHammingDistance();
std::cout << "\nDetected tags" << std::endl;
for (size_t i = 0; i < detector.getNbObjects(); i++) {
std::vector<vpImagePoint> p = detector.getPolygon(i);
vpRect bbox = detector.getBBox(i);
std::string message = detector.getMessage(i);
ss.str("");
ss << message << " with decision margin: " << tag_decision_margins[i] << " and hamming distance: " << tag_hamming_distances[i];
std::cout << "  " << ss.str() << std::endl;
ss.str("");
ss << "Tag id: " << tag_ids[i] << " - " << tag_decision_margins[i];
for (size_t j = 0; j < p.size(); j++) {
std::ostringstream number;
}
}
for (size_t i = 0; i < cMo_vec.size(); i++) {
}
#if (VISP_CXX_STANDARD >= VISP_CXX_STANDARD_11)
#else
#endif
std::vector<std::vector<vpImagePoint> > tagsCorners = detector.getTagsCorners();
detector.displayFrames(I_color, cMo_vec, cam, opt_tag_size / 2, vpColor::none, 3);
}
std::cerr << "Catch an exception: " << e.getMessage() << std::endl;
}
#if (VISP_CXX_STANDARD < VISP_CXX_STANDARD_11)
if (display != nullptr) {
delete display;
}
if (display2 != nullptr) {
delete display2;
}
#endif
#else
(void)argc;
(void)argv;
#endif
return EXIT_SUCCESS;
}
First we have to include the header corresponding to the vpDetectorAprilTag class, which allows detecting one or multiple tags.
#include <visp3/detection/vpDetectorAprilTag.h>
Then, in the main() function, before going further we need to check if ViSP was built with the AprilTag third party. We also check if ViSP is able to display images using either X11, the Graphical Device Interface (GDI) under Windows, or OpenCV.
#if defined(VISP_HAVE_APRILTAG) && defined(VISP_HAVE_DISPLAY)
After reading the input image AprilTag.jpg and creating a display device to visualize the image, a vpDetectorAprilTag detector is constructed with the requested tag family.
Then we apply some settings. In particular, the vpDetectorAprilTag::setAprilTagQuadDecimate() function can be used to decimate the input image in order to speed up the detection.
detector.setAprilTagQuadDecimate(opt_tag_quad_decimate);
detector.setAprilTagPoseEstimationMethod(opt_tag_pose_estimation_method);
detector.setAprilTagNbThreads(opt_tag_nThreads);
detector.setZAlignedWithCameraAxis(opt_tag_z_align_frame);
detector.setAprilTagDecisionMarginThreshold(opt_tag_decision_margin_threshold);
detector.setAprilTagHammingDistanceThreshold(opt_tag_hamming_distance_threshold);
We are now ready to detect any 36h11 tags in the image. The vpDetectorAprilTag::detect(const vpImage<unsigned char> &) function detects any tags in the image, but since here we also want to estimate the 3D pose of the tags, we rather call vpDetectorAprilTag::detect(const vpImage<unsigned char> &, const double, const vpCameraParameters &, std::vector<vpHomogeneousMatrix> &), which returns the pose of each tag as a vector of vpHomogeneousMatrix in the cMo_vec variable.
std::vector<vpHomogeneousMatrix> cMo_vec;
detector.detect(I, opt_tag_size, cam, cMo_vec);
If one or more tags are detected, we can retrieve the number of detected tags in order to create a for loop over the tags.
std::cout << "\nDetected tags" << std::endl;
for (size_t i = 0; i < detector.getNbObjects(); i++) {
For each tag, we can then get the location of the 4 points that define the polygon containing the tag, and the corresponding bounding box.
std::vector<vpImagePoint> p = detector.getPolygon(i);
vpRect bbox = detector.getBBox(i);
And finally, we are also able to get the tag family and id by calling vpDetectorAprilTag::getMessage(), which returns a string of the form "<tag_family> id: <tag_id>", like "36h11 id: 19".
std::string message = detector.getMessage(i);
Parsing the message then makes it easy to retrieve each tag id:
std::size_t tag_id_pos = message.find("id: ");
if (tag_id_pos != std::string::npos) {
int tag_id = atoi(message.substr(tag_id_pos + 4).c_str());
ss.str("");
ss << "Tag id: " << tag_id;
}
There is also a way to get the tag ids directly with:
std::vector<int> tag_ids = detector.getTagsId();
This id can then be displayed
ss.str("");
ss << "Tag id: " << tag_ids[i] << " - " << tag_decision_margins[i];
To get a detection quality indicator, you can use the decision margin value. The higher the value, the greater the confidence in the detection.
std::vector<float> tag_decision_margins = detector.getTagsDecisionMargin();
Next in the code, we display the 3D pose of each tag as an RGB frame.
for (size_t i = 0; i < cMo_vec.size(); i++) {
}
- Note
- To get the absolute pose (not a pose defined up to a scale factor), you have to provide the real size of the marker (the length of a marker side).
- To calibrate your camera, you can follow this tutorial: Tutorial: Camera intrinsic calibration.
3.2. Process live camera images
This other example, available in tutorial-apriltag-detector-live.cpp, shows how to couple the AprilTag detector to an image grabber in order to detect tags on each new image acquired by a camera connected to your computer.
#include <iostream>
#include <visp3/core/vpConfig.h>
#if defined(VISP_HAVE_OPENCV) && (((VISP_HAVE_OPENCV_VERSION < 0x030000) && defined(HAVE_OPENCV_HIGHGUI)) || \
((VISP_HAVE_OPENCV_VERSION >= 0x030000) && defined(HAVE_OPENCV_VIDEOIO)))
#undef VISP_HAVE_V4L2
#undef VISP_HAVE_DC1394
#undef VISP_HAVE_CMU1394
#undef VISP_HAVE_FLYCAPTURE
#undef VISP_HAVE_REALSENSE2
#else
#undef HAVE_OPENCV_HIGHGUI
#undef HAVE_OPENCV_VIDEOIO
#endif
#if defined(VISP_HAVE_APRILTAG) && \
(defined(VISP_HAVE_V4L2) || defined(VISP_HAVE_DC1394) || defined(VISP_HAVE_CMU1394) || \
defined(VISP_HAVE_FLYCAPTURE) || defined(VISP_HAVE_REALSENSE2) || defined(VISP_HAVE_OPENCV) && \
(((VISP_HAVE_OPENCV_VERSION < 0x030000) && defined(HAVE_OPENCV_HIGHGUI)) || \
((VISP_HAVE_OPENCV_VERSION >= 0x030000) && defined(HAVE_OPENCV_VIDEOIO))))
#ifdef VISP_HAVE_MODULE_SENSOR
#include <visp3/sensor/vp1394CMUGrabber.h>
#include <visp3/sensor/vp1394TwoGrabber.h>
#include <visp3/sensor/vpFlyCaptureGrabber.h>
#include <visp3/sensor/vpRealSense2.h>
#include <visp3/sensor/vpV4l2Grabber.h>
#endif
#include <visp3/detection/vpDetectorAprilTag.h>
#include <visp3/core/vpXmlParserCamera.h>
#include <visp3/gui/vpDisplayFactory.h>
#if defined(VISP_HAVE_OPENCV) && (VISP_HAVE_OPENCV_VERSION < 0x030000) && defined(HAVE_OPENCV_HIGHGUI)
#include <opencv2/highgui/highgui.hpp>
#elif defined(VISP_HAVE_OPENCV) && (VISP_HAVE_OPENCV_VERSION >= 0x030000) && defined(HAVE_OPENCV_VIDEOIO)
#include <opencv2/videoio/videoio.hpp>
#endif
void usage(const char **argv, int error);
void usage(const char **argv, int error)
{
std::cout << "Synopsis" << std::endl
<< " " << argv[0]
<< " [--camera-device <id>]"
<< " [--tag-size <size>]"
<< " [--tag-family <family>]"
<< " [--tag-decision-margin-threshold <threshold>]"
<< " [--tag-hamming-distance-threshold <threshold>]"
<< " [--tag-z-aligned]"
<< " [--tag-quad-decimate <factor>]"
<< " [--tag-n-threads <number>]"
<< " [--tag-pose-method <method>]"
#if defined(VISP_HAVE_PUGIXML)
<< " [--intrinsic <xmlfile>]"
<< " [--camera-name <name>]"
#endif
#if defined(VISP_HAVE_DISPLAY)
<< " [--display-tag]"
<< " [--display-off]"
<< " [--color <id>]"
<< " [--thickness <thickness>"
#endif
<< " [--help, -h]" << std::endl
<< std::endl;
std::cout << "Description" << std::endl
<< " Compute the pose of an Apriltag in images acquired with a camera." << std::endl
<< std::endl
<< " --camera-device <id>" << std::endl
<< " Camera device id." << std::endl
<< " Default: 0" << std::endl
<< std::endl
<< " --tag-size <size>" << std::endl
<< " Apriltag size in [m]." << std::endl
<< " Default: 0.03" << std::endl
<< std::endl
<< " --tag-family <family>" << std::endl
<< " Apriltag family. Supported values are:" << std::endl
<< " 0: TAG_36h11" << std::endl
<< " 1: TAG_36h10 (DEPRECATED)" << std::endl
<< " 2: TAG_36ARTOOLKIT (DEPRECATED)" << std::endl
<< " 3: TAG_25h9" << std::endl
<< " 4: TAG_25h7 (DEPRECATED)" << std::endl
<< " 5: TAG_16h5" << std::endl
<< " 6: TAG_CIRCLE21h7" << std::endl
<< " 7: TAG_CIRCLE49h12" << std::endl
<< " 8: TAG_CUSTOM48h12" << std::endl
<< " 9: TAG_STANDARD41h12" << std::endl
<< " 10: TAG_STANDARD52h13" << std::endl
<< " 11: TAG_ARUCO_4x4_50" << std::endl
<< " 12: TAG_ARUCO_4x4_100" << std::endl
<< " 13: TAG_ARUCO_4x4_250" << std::endl
<< " 14: TAG_ARUCO_4x4_1000" << std::endl
<< " 15: TAG_ARUCO_5x5_50" << std::endl
<< " 16: TAG_ARUCO_5x5_100" << std::endl
<< " 17: TAG_ARUCO_5x5_250" << std::endl
<< " 18: TAG_ARUCO_5x5_1000" << std::endl
<< " 19: TAG_ARUCO_6x6_50" << std::endl
<< " 20: TAG_ARUCO_6x6_100" << std::endl
<< " 21: TAG_ARUCO_6x6_250" << std::endl
<< " 22: TAG_ARUCO_6x6_1000" << std::endl
<< " 23: TAG_ARUCO_7x7_50" << std::endl
<< " 24: TAG_ARUCO_7x7_100" << std::endl
<< " 25: TAG_ARUCO_7x7_250" << std::endl
<< " 26: TAG_ARUCO_7x7_1000" << std::endl
<< " 27: TAG_ARUCO_MIP_36h12" << std::endl
<< " Default: 0 (36h11)" << std::endl
<< std::endl
<< " --tag-decision-margin-threshold <threshold>" << std::endl
<< " Threshold used to discard low-confident detections. A typical value is " << std::endl
<< " around 100. The higher this value, the more false positives will be filtered" << std::endl
<< " out. When this value is set to -1, false positives are not filtered out." << std::endl
<< " Default: 50" << std::endl
<< std::endl
<< " --tag-hamming-distance-threshold <threshold>" << std::endl
<< " Threshold used to discard low-confident detections with corrected bits." << std::endl
<< " A typical value is between 0 and 3. The lower this value, the more false" << std::endl
<< " positives will be filtered out." << std::endl
<< " Default: 0" << std::endl
<< std::endl
<< " --tag-quad-decimate <factor>" << std::endl
<< " Decimation factor used to detect a tag. " << std::endl
<< " Default: 1" << std::endl
<< std::endl
<< " --tag-n-threads <number>" << std::endl
<< " Number of threads used to detect a tag." << std::endl
<< " Default: 1" << std::endl
<< std::endl
<< " --tag-z-aligned" << std::endl
<< " When enabled, tag z-axis and camera z-axis are aligned." << std::endl
<< " Default: false" << std::endl
<< std::endl
<< " --tag-pose-method <method>" << std::endl
<< " Algorithm used to compute the tag pose from its 4 corners." << std::endl
<< " Possible values are:" << std::endl
<< " 0: HOMOGRAPHY" << std::endl
<< " 1: HOMOGRAPHY_VIRTUAL_VS" << std::endl
<< " 2: DEMENTHON_VIRTUAL_VS" << std::endl
<< " 3: LAGRANGE_VIRTUAL_VS" << std::endl
<< " 4: BEST_RESIDUAL_VIRTUAL_VS" << std::endl
<< " 5: HOMOGRAPHY_ORTHOGONAL_ITERATION" << std::endl
<< " Default: 1 (HOMOGRAPHY_VIRTUAL_VS)" << std::endl
<< std::endl
#if defined(VISP_HAVE_PUGIXML)
<< " --intrinsic <xmlfile>" << std::endl
<< " Camera intrinsic parameters file in xml format." << std::endl
<< " Default: empty" << std::endl
<< std::endl
<< " --camera-name <name>" << std::endl
<< " Camera name in the intrinsic parameters file in xml format." << std::endl
<< " Default: empty" << std::endl
<< std::endl
#endif
#if defined(VISP_HAVE_DISPLAY)
<< " --display-tag" << std::endl
<< " Flag used to enable displaying the edges of a tag." << std::endl
<< " Default: disabled" << std::endl
<< std::endl
<< " --display-off" << std::endl
<< " Flag used to turn display off." << std::endl
<< " Default: enabled" << std::endl
<< std::endl
<< " --color <id>" << std::endl
<< " Color id used to display the frame over each tag." << std::endl
<< " Possible values are:" << std::endl
<< " -1: R-G-B colors for X, Y, Z axis respectively" << std::endl
<< " 0: all axis in black" << std::endl
<< " 1: all axis in white" << std::endl
<< " ..." << std::endl
<< " Default: -1" << std::endl
<< std::endl
<< " --thickness <thickness>" << std::endl
<< " Thickness of the drawings in overlay." << std::endl
<< " Default: 2" << std::endl
<< std::endl
#endif
<< " --verbose, -v" << std::endl
<< " Enable verbosity." << std::endl
<< std::endl
<< " --help, -h" << std::endl
<< " Print this helper message." << std::endl
<< std::endl;
if (error) {
std::cout << "Error" << std::endl
<< " "
<< "Unsupported parameter " << argv[error] << std::endl;
}
}
int main(int argc, const char **argv)
{
#ifdef ENABLE_VISP_NAMESPACE
#endif
int opt_device = 0;
double opt_tag_size = 0.053;
float opt_tag_quad_decimate = 1.0;
float opt_tag_decision_margin_threshold = 50;
int opt_tag_hamming_distance_threshold = 2;
int opt_tag_nThreads = 1;
std::string intrinsic_file = "";
std::string camera_name = "";
bool opt_display_tag = false;
int opt_color_id = -1;
unsigned int thickness = 2;
bool opt_tag_z_align_frame = false;
bool opt_verbose = false;
#if !(defined(VISP_HAVE_DISPLAY))
bool display_off = true;
std::cout << "Warning: There is no 3rd party (X11, GDI or openCV) to display images..." << std::endl;
#else
bool display_off = false;
#endif
for (int i = 1; i < argc; ++i) {
if (std::string(argv[i]) == "--camera-device" && i + 1 < argc) {
opt_device = atoi(argv[++i]);
}
else if (std::string(argv[i]) == "--tag-size" && i + 1 < argc) {
opt_tag_size = atof(argv[++i]);
}
else if (std::string(argv[i]) == "--tag-family" && i + 1 < argc) {
}
else if (std::string(argv[i]) == "--tag-quad-decimate" && i + 1 < argc) {
opt_tag_quad_decimate = static_cast<float>(atof(argv[++i]));
}
else if (std::string(argv[i]) == "--tag-n-threads" && i + 1 < argc) {
opt_tag_nThreads = atoi(argv[++i]);
}
else if (std::string(argv[i]) == "--tag-z-aligned") {
opt_tag_z_align_frame = true;
}
else if (std::string(argv[i]) == "--tag-pose-method" && i + 1 < argc) {
}
else if (std::string(argv[i]) == "--tag-decision-margin-threshold" && i + 1 < argc) {
opt_tag_decision_margin_threshold = static_cast<float>(atof(argv[++i]));
}
else if (std::string(argv[i]) == "--tag-hamming-distance-threshold" && i + 1 < argc) {
opt_tag_hamming_distance_threshold = atoi(argv[++i]);
}
#if defined(VISP_HAVE_PUGIXML)
else if (std::string(argv[i]) == "--intrinsic" && i + 1 < argc) {
intrinsic_file = std::string(argv[++i]);
}
else if (std::string(argv[i]) == "--camera-name" && i + 1 < argc) {
}
#endif
#if defined(VISP_HAVE_DISPLAY)
else if (std::string(argv[i]) == "--display-tag") {
opt_display_tag = true;
}
else if (std::string(argv[i]) == "--display-off") {
display_off = true;
}
else if (std::string(argv[i]) == "--color" && i + 1 < argc) {
opt_color_id = atoi(argv[++i]);
}
else if (std::string(argv[i]) == "--thickness" && i + 1 < argc) {
thickness = static_cast<unsigned int>(atoi(argv[++i]));
}
#endif
else if (std::string(argv[i]) == "--verbose" || std::string(argv[i]) == "-v") {
opt_verbose = true;
}
else if (std::string(argv[i]) == "--help" || std::string(argv[i]) == "-h") {
usage(argv, 0);
return EXIT_SUCCESS;
}
else {
usage(argv, i);
return EXIT_FAILURE;
}
}
#if (VISP_CXX_STANDARD >= VISP_CXX_STANDARD_11)
std::shared_ptr<vpDisplay> display;
#else
#endif
try {
cam.initPersProjWithoutDistortion(615.1674805, 615.1675415, 312.1889954, 243.4373779);
#if defined(VISP_HAVE_PUGIXML)
// If an intrinsic XML file is provided, load the camera parameters from it (elided in this snippet)
#endif
#if defined(VISP_HAVE_V4L2)
std::ostringstream device;
device << "/dev/video" << opt_device;
std::cout << "Use Video 4 Linux grabber on device " << device.str() << std::endl;
#elif defined(VISP_HAVE_DC1394)
(void)opt_device;
std::cout << "Use DC1394 grabber" << std::endl;
#elif defined(VISP_HAVE_CMU1394)
(void)opt_device;
std::cout << "Use CMU1394 grabber" << std::endl;
#elif defined(VISP_HAVE_FLYCAPTURE)
(void)opt_device;
std::cout << "Use FlyCapture grabber" << std::endl;
#elif defined(VISP_HAVE_REALSENSE2)
(void)opt_device;
std::cout << "Use Realsense 2 grabber" << std::endl;
config.disable_stream(RS2_STREAM_DEPTH);
config.disable_stream(RS2_STREAM_INFRARED);
config.enable_stream(RS2_STREAM_COLOR, 640, 480, RS2_FORMAT_RGBA8, 30);
std::cout << "Read camera parameters from Realsense device" << std::endl;
#elif defined(VISP_HAVE_OPENCV) && \
(((VISP_HAVE_OPENCV_VERSION < 0x030000) && defined(HAVE_OPENCV_HIGHGUI)) || \
((VISP_HAVE_OPENCV_VERSION >= 0x030000) && defined(HAVE_OPENCV_VIDEOIO)))
std::cout << "Use OpenCV grabber on device " << opt_device << std::endl;
cv::VideoCapture g(opt_device);
if (!g.isOpened()) {
std::cout << "Failed to open the camera" << std::endl;
return EXIT_FAILURE;
}
cv::Mat frame;
g >> frame;
#endif
std::cout << cam << std::endl;
std::cout << "Tag detector settings" << std::endl;
std::cout << " Tag size [m] : " << opt_tag_size << std::endl;
std::cout << " Tag family : " << opt_tag_family << std::endl;
std::cout << " Quad decimate : " << opt_tag_quad_decimate << std::endl;
std::cout << " Decision margin threshold : " << opt_tag_decision_margin_threshold << std::endl;
std::cout << " Hamming distance threshold: " << opt_tag_hamming_distance_threshold << std::endl;
std::cout << " Num threads : " << opt_tag_nThreads << std::endl;
std::cout << " Z aligned : " << opt_tag_z_align_frame << std::endl;
std::cout << " Pose estimation : " << opt_tag_pose_estimation_method << std::endl;
if (!display_off) {
#if (VISP_CXX_STANDARD >= VISP_CXX_STANDARD_11)
display = vpDisplayFactory::createDisplay(I);
#else
display = vpDisplayFactory::allocateDisplay(I);
#endif
}
detector.setAprilTagQuadDecimate(opt_tag_quad_decimate);
detector.setAprilTagPoseEstimationMethod(opt_tag_pose_estimation_method);
detector.setAprilTagNbThreads(opt_tag_nThreads);
detector.setZAlignedWithCameraAxis(opt_tag_z_align_frame);
detector.setAprilTagDecisionMarginThreshold(opt_tag_decision_margin_threshold);
detector.setAprilTagHammingDistanceThreshold(opt_tag_hamming_distance_threshold);
std::vector<double> time_vec;
bool quit = false;
while (!quit) {
#if defined(VISP_HAVE_V4L2) || defined(VISP_HAVE_DC1394) || defined(VISP_HAVE_CMU1394) || \
defined(VISP_HAVE_FLYCAPTURE) || defined(VISP_HAVE_REALSENSE2)
g.acquire(I);
#elif defined(VISP_HAVE_OPENCV) && \
(((VISP_HAVE_OPENCV_VERSION < 0x030000) && defined(HAVE_OPENCV_HIGHGUI)) || \
((VISP_HAVE_OPENCV_VERSION >= 0x030000) && defined(HAVE_OPENCV_VIDEOIO)))
g >> frame;
vpImageConvert::convert(frame, I);
#endif
if (opt_verbose) {
std::cout << "-- Process new image --" << std::endl;
}
std::vector<vpHomogeneousMatrix> cMo_vec;
double t = vpTime::measureTimeMs();
detector.detect(I, opt_tag_size, cam, cMo_vec);
t = vpTime::measureTimeMs() - t;
time_vec.push_back(t);
std::stringstream ss;
ss << "Detection time: " << t << " ms for " << detector.getNbObjects() << " tags";
for (size_t i = 0; i < cMo_vec.size(); ++i) {
vpDisplay::displayFrame(I, cMo_vec[i], cam, opt_tag_size / 2, vpColor::none, 3);
}
std::vector< std::vector<vpImagePoint> > tags_corners = detector.getTagsCorners();
detector.displayTags(I, tags_corners);
std::vector<int> tags_id = detector.getTagsId();
for (size_t i = 0; i < tags_id.size(); ++i) {
std::stringstream ss;
ss << "id=" << tags_id[i];
}
quit = true;
}
if (opt_verbose) {
std::vector<float> tag_decision_margins = detector.getTagsDecisionMargin();
std::vector<int> tag_hamming_distances = detector.getTagsHammingDistance();
for (size_t i = 0; i < tags_id.size(); ++i) {
std::string message = detector.getMessage(i);
std::stringstream ss;
ss << "Found " << message << std::endl
<< "- with decision margin: " << tag_decision_margins[i]
<< " and hamming distance: " << tag_hamming_distances[i] << std::endl
<< "- and cMo:\n" << cMo_vec[i] << std::endl;
std::cout << ss.str() << std::endl;
}
}
}
std::cout << "Benchmark computation time" << std::endl;
}
catch (const vpException &e) {
std::cerr << "Catch an exception: " << e.getMessage() << std::endl;
}
#if (VISP_CXX_STANDARD < VISP_CXX_STANDARD_11)
if (display != NULL) {
delete display;
}
#endif
return EXIT_SUCCESS;
}
#else
int main()
{
#ifndef VISP_HAVE_APRILTAG
std::cout << "Enable Apriltag support, configure and build ViSP to run this tutorial" << std::endl;
#else
std::cout << "Install a 3rd party dedicated to frame grabbing (dc1394, cmu1394, v4l2, OpenCV, FlyCapture, "
<< "Realsense2), configure and build ViSP again to use this example"
<< std::endl;
#endif
return EXIT_SUCCESS;
}
#endif
The usage of this example is similar to the previous one:
- with option --tag-family you select the kind of tag that you want to detect.
- if more than one camera is connected to your computer, with option --input you can select which camera to use. The first camera that is found has number 0.
To detect 36h11 tags on images acquired by a second camera connected to your computer use:
$ ./tutorial-apriltag-detector-live --tag-family 0 --input 1
The source code of this example is very similar to the previous one, except that here we use camera framegrabber devices (see Tutorial: Image frame grabbing). Two different grabbers may be used:
- If ViSP was built with Video For Linux (V4L2) support, available for example on Fedora or Ubuntu distributions, the VISP_HAVE_V4L2 macro is defined. In that case, images coming from a USB camera are acquired using the vpV4l2Grabber class.
- If ViSP wasn't built with V4L2 support but with OpenCV, we use the cv::VideoCapture class to grab the images. Notice that when images are acquired with OpenCV there is an additional conversion from cv::Mat to vpImage.
#if defined(VISP_HAVE_V4L2)
std::ostringstream device;
device << "/dev/video" << opt_device;
std::cout << "Use Video 4 Linux grabber on device " << device.str() << std::endl;
#elif defined(VISP_HAVE_DC1394)
(void)opt_device;
std::cout << "Use DC1394 grabber" << std::endl;
#elif defined(VISP_HAVE_CMU1394)
(void)opt_device;
std::cout << "Use CMU1394 grabber" << std::endl;
#elif defined(VISP_HAVE_FLYCAPTURE)
(void)opt_device;
std::cout << "Use FlyCapture grabber" << std::endl;
#elif defined(VISP_HAVE_REALSENSE2)
(void)opt_device;
std::cout << "Use Realsense 2 grabber" << std::endl;
config.disable_stream(RS2_STREAM_DEPTH);
config.disable_stream(RS2_STREAM_INFRARED);
config.enable_stream(RS2_STREAM_COLOR, 640, 480, RS2_FORMAT_RGBA8, 30);
std::cout << "Read camera parameters from Realsense device" << std::endl;
#elif defined(VISP_HAVE_OPENCV) && \
(((VISP_HAVE_OPENCV_VERSION < 0x030000) && defined(HAVE_OPENCV_HIGHGUI)) || \
((VISP_HAVE_OPENCV_VERSION >= 0x030000) && defined(HAVE_OPENCV_VIDEOIO)))
std::cout << "Use OpenCV grabber on device " << opt_device << std::endl;
cv::VideoCapture g(opt_device);
if (!g.isOpened()) {
std::cout << "Failed to open the camera" << std::endl;
return EXIT_FAILURE;
}
cv::Mat frame;
g >> frame;
#endif
Then, in the while loop, at each iteration we acquire a new image:
#if defined(VISP_HAVE_V4L2) || defined(VISP_HAVE_DC1394) || defined(VISP_HAVE_CMU1394) || \
defined(VISP_HAVE_FLYCAPTURE) || defined(VISP_HAVE_REALSENSE2)
g.acquire(I);
#elif defined(VISP_HAVE_OPENCV) && \
(((VISP_HAVE_OPENCV_VERSION < 0x030000) && defined(HAVE_OPENCV_HIGHGUI)) || \
((VISP_HAVE_OPENCV_VERSION >= 0x030000) && defined(HAVE_OPENCV_VIDEOIO)))
g >> frame;
vpImageConvert::convert(frame, I);
#endif
This new image is then given as input to the AprilTag detector.
4. Tips & Tricks
4.1. Filter out false positive detections
The algorithm behind the tag detection is able to compute a hamming distance and a decision margin factor for each tag.
- The hamming distance corresponds to the number of corrected bits associated to each detected tag. Retaining only tags for which the hamming distance is equal to 0 is a first way to filter out tags for which we are not confident in the detection. To get the hamming distance for all the detected tags, you may use vpDetectorAprilTag::getTagsHammingDistance(). To specify the hamming distance threshold used to filter out tags for which the hamming distance is higher than the threshold, use vpDetectorAprilTag::setAprilTagHammingDistanceThreshold(). By default the hamming distance threshold is set to 2, meaning that we accept tags with up to 2 corrected bits, which is rather permissive.
- The decision margin factor is a measure of the quality of the detection. Each detected tag has its own decision margin factor. To get this value for each detected tag, use vpDetectorAprilTag::getTagsDecisionMargin(). You can filter out tags that have a decision margin factor lower than a decision margin threshold using vpDetectorAprilTag::setAprilTagDecisionMarginThreshold(). The higher the decision margin threshold, the more false positive detections you will filter out. By default the decision margin threshold is set to -1, meaning that the decision margin is not used to filter out false positives.
When running the tutorial-apriltag-detector binary corresponding to tutorial-apriltag-detector.cpp, you will get by default:
$ cd ${VISP_WS}/tutorial/detection/tag
$ ./tutorial-apriltag-detector
Input data
Image : AprilTag.jpg
Intrinsics : default
Camera parameters for perspective projection without distortion:
px = 615.167 py = 615.168
u0 = 312.189 v0 = 243.437
Tag detector settings
Tag size [m] : 0.053
Tag family : 36h11
Quad decimate : 1
Decision margin threshold : -1
Hamming distance threshold: 0
Num threads : 1
Z aligned : 0
Pose estimation : HOMOGRAPHY_VIRTUAL_VS
Detected tags
36h11 id: 8 with decision margin: 75.8086 and hamming distance: 0
36h11 id: 9 with decision margin: 85.3353 and hamming distance: 0
36h11 id: 10 with decision margin: 100.975 and hamming distance: 0
36h11 id: 11 with decision margin: 73.9264 and hamming distance: 0
36h11 id: 12 with decision margin: 100.111 and hamming distance: 0
36h11 id: 13 with decision margin: 108.233 and hamming distance: 0
36h11 id: 14 with decision margin: 100.964 and hamming distance: 0
36h11 id: 15 with decision margin: 96.4312 and hamming distance: 0
36h11 id: 16 with decision margin: 99.0806 and hamming distance: 0
36h11 id: 17 with decision margin: 118.783 and hamming distance: 0
36h11 id: 18 with decision margin: 115.397 and hamming distance: 0
36h11 id: 19 with decision margin: 81.0268 and hamming distance: 0
Now if you set the decision margin threshold to 100, you can see that fewer tags are detected. Only those with a decision margin greater than 100 remain:
$ ./tutorial-apriltag-detector --tag-decision-margin-threshold 100
Input data
Image : AprilTag.jpg
Intrinsics : default
Camera parameters for perspective projection without distortion:
px = 615.167 py = 615.168
u0 = 312.189 v0 = 243.437
Tag detector settings
Tag size [m] : 0.053
Tag family : 36h11
Quad decimate : 1
Decision margin threshold : 100
Hamming distance threshold: 0
Num threads : 1
Z aligned : 0
Pose estimation : HOMOGRAPHY_VIRTUAL_VS
Detected tags
36h11 id: 10 with decision margin: 100.975 and hamming distance: 0
36h11 id: 12 with decision margin: 100.111 and hamming distance: 0
36h11 id: 13 with decision margin: 108.233 and hamming distance: 0
36h11 id: 14 with decision margin: 100.964 and hamming distance: 0
36h11 id: 17 with decision margin: 118.783 and hamming distance: 0
36h11 id: 18 with decision margin: 115.397 and hamming distance: 0
4.2. Marker detection debugging
You can use the vpDetectorAprilTag::setAprilTagDebugOption() method to enable the writing of some debugging images in the same folder as the executable. Some overviews of the generated debugging images:
Top left: adaptive thresholding, Top middle: segmentation process, Top right: contours clustering, Bottom left: quadrectangles extraction, Bottom center: Tag code decoding, Bottom right: output detections.
This can help you understand the AprilTag detection process and diagnose why a marker is not detected or why a false detection occurs.
4.3. Improve pose accuracy using a RGB-D camera
When the tag is small in the image or when the tag corner locations are poorly extracted, you may observe z-axis flipping if you carefully analyze the pose of the tag. The following images illustrate this behavior.
These are 3 successive images acquired by the Realsense D435 color camera with the tag pose in overlay. In the middle image, you can see the z-axis flipping phenomenon, where the z-axis in blue is not oriented as expected.
The pose is computed from the 4 tag corner locations, assuming a planar object, and this behavior is inherent to the planar pose estimation ambiguity, see Dementhon.
To lift this ambiguity, we propose to use the depth map of an RGB-D sensor in order to bring additional 3D information. The pose estimation can then be seen as a 3D-3D optimization process implemented in vpPose::computePlanarObjectPoseFromRGBD(). As shown in the following images, using this method with a Realsense D435 sensor makes it possible to overcome this ambiguity.
These are the same 3 successive images acquired by the Realsense color camera with the resulting pose in overlay, but here the depth map aligned with the color image is given as input to vpPose::computePlanarObjectPoseFromRGBD(). As you can see in the 3 images, the z-axis flipping phenomenon no longer occurs.
An example that shows how to use this function is given in tutorial-apriltag-detector-live-rgbd-realsense.cpp. In this example we are using a Realsense D435 or equivalent RGB-D sensor, but it could be adapted to any other RGB-D sensor as long as you can align the depth map and the color image.
std::vector<std::vector<vpImagePoint> > tags_corners = detector.getPolygon();
std::vector<int> tags_id = detector.getTagsId();
std::map<int, double> tags_size;
tags_size[-1] = opt_tag_size;
std::vector<std::vector<vpPoint> > tags_points3d = detector.getTagsPoints3D(tags_id, tags_size);
for (size_t i = 0; i < tags_corners.size(); i++) {
vpHomogeneousMatrix cMo;
double confidence_index;
// depthMap is the depth image aligned with the color image (acquisition elided in this snippet)
if (vpPose::computePlanarObjectPoseFromRGBD(depthMap, tags_corners[i], cam, tags_points3d[i], cMo,
&confidence_index)) {
if (confidence_index > 0.5) {
// high confidence in the depth-based pose (display code elided)
}
else if (confidence_index > 0.25) {
// medium confidence
}
else {
// low confidence: the depth-based pose should not be trusted
}
std::stringstream ss;
ss << "Tag id " << tags_id[i] << " confidence: " << confidence_index;
if (opt_verbose) {
std::cout << ss.str() << std::endl;
std::cout << "cMo[" << i << "]: \n" << cMo_vec[i] << std::endl;
std::cout << "cMo[" << i << "] using depth: \n" << cMo << std::endl;
}
}
else {
// pose estimation from depth failed (logging elided)
}
}
Another example using a Structure Core RGB-D sensor is given in tutorial-apriltag-detector-live-rgbd-structure-core.cpp.
6. Next tutorial
You are now ready to see the Tutorial: Bar code detection, which illustrates how to detect QR codes in an image.