Tutorial: Eye-to-hand IBVS with Panda 7-dof robot from Franka Emika
1. Introduction
This tutorial explains how to achieve an eye-to-hand image-based visual servoing (IBVS) with Franka Emika's Panda 7-dof robot, with a fixed Realsense camera mounted on a tripod and observing the robot's end-effector, to which we've attached an Apriltag.
Panda robot in eye-to-hand configuration.
The proposed image-based visual servoing will consider the four 2D image point visual features corresponding to the corners of the Apriltag.
Let us note:
${}^e{\bf J}_e$ the robot Jacobian expressed in the end-effector frame
${}^c{\bf V}_e$ the 6-by-6 velocity twist matrix corresponding to the camera to end-effector frames transformation, that allows to transform a velocity expressed at point e in the end-effector frame into a velocity expressed at point c in the camera frame
${}^c{\bf V}_f$ the 6-by-6 velocity twist matrix corresponding to the camera to robot reference frames transformation, that allows to transform a velocity expressed at point f in the robot reference frame into a velocity expressed at point c in the camera frame
${}^f{\bf V}_e$ the 6-by-6 velocity twist matrix corresponding to the robot reference to end-effector frames transformation, that allows to transform a velocity expressed at point e in the end-effector frame into a velocity expressed at point f in the robot reference frame
${\bf s}$ the current image point visual features vector
${\bf s}^*$ the desired image point visual features vector
${\bf L}$ the interaction matrix corresponding to the image point visual features, that can be either the current one ${\bf L_s}$ updated at each iteration of the servo loop, or the desired one ${\bf L_{s^*}}$
$\lambda$ the gain of the controller
$\dot{\bf q}$ the joint velocities to apply to the robot to achieve the eye-to-hand visual servoing.
In this tutorial we will consider two different eye-to-hand control laws:
The "Lcur_cVe_eJe" control law, based on the current interaction matrix:
$\dot{\bf q} = \lambda \, ({\bf L_s} \, {}^c{\bf V}_e \, {}^e{\bf J}_e)^{+} \, ({\bf s} - {\bf s}^*)$
Here, to build ${}^c{\bf V}_e$ there is the need to compute ${}^c{\bf M}_e = {}^c{\bf M}_o \, ({}^e{\bf M}_o)^{-1}$, where ${}^c{\bf M}_o$ is the pose of the object in the camera frame and the constant transformation ${}^e{\bf M}_o$ is obtained by extrinsic calibration.
The "Ldes_cVf_fVe_eJe" control law, based on the desired interaction matrix:
$\dot{\bf q} = \lambda \, ({\bf L_{s^*}} \, {}^c{\bf V}_f \, {}^f{\bf V}_e \, {}^e{\bf J}_e)^{+} \, ({\bf s} - {\bf s}^*)$
This control law is recommended since it doesn't need to compute the pose ${}^c{\bf M}_o$ of the object. Here we will build ${}^c{\bf V}_f$ thanks to the constant transformation ${}^f{\bf M}_c$ obtained by extrinsic calibration, while ${}^f{\bf V}_e$ is built thanks to ${}^f{\bf M}_e$ given by the robot odometry through the robot SDK.
We suppose here that you have:
a Panda robot in its research version from Franka Emika that will be controlled through the vpRobotFranka class.
an Intel Realsense D435 camera mounted on a tripod. Images will be acquired thanks to the vpRealSense2 class. Note that this tutorial should also work with any other Intel Realsense camera.
Even though the Realsense camera can retrieve intrinsic factory parameters, we recommend calibrating your camera to obtain more accurate results. To this end follow Tutorial: Camera intrinsic calibration.
Results are saved in "${VISP_WS}/visp-build/example/servo-franka/data-calib/franka_camera_visp.xml"
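As an illustration of how these intrinsics are used later through the "--intrinsic" and "--camera-name" options, here is a minimal sketch that reads them back with vpXmlParserCamera; the 640 x 480 image resolution is an assumption to adapt to your camera configuration:
#include <iostream>
#include <visp3/core/vpCameraParameters.h>
#include <visp3/core/vpXmlParserCamera.h>

vpCameraParameters cam;
vpXmlParserCamera parser;
// Read the intrinsics calibrated above, for the camera named "Camera",
// using the perspective projection model with distortion
if (parser.parse(cam, "data-calib/franka_camera_visp.xml", "Camera",
                 vpCameraParameters::perspectiveProjWithDistortion, 640, 480) != vpXmlParserCamera::SEQUENCE_OK) {
  std::cout << "Unable to read camera parameters from XML file" << std::endl;
}
std::cout << cam << std::endl;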
2.3. Calibrate camera extrinsics
To achieve eye-to-hand visual servoing, depending on the visual-servoing scheme there is the need to know either the transformation between the robot reference and the camera frames, or the transformation between the robot end-effector and the object (in our case the Apriltag) frames.
The first step is to acquire images of your object, the Apriltag, for different robot end-effector positions. Usually we acquire between 6 and 10 images. By default the robot controller IP is 192.168.1.1. If your Franka has another IP (let's say 192.168.30.10), use the "--ip" option like:
$ cd ${VISP_WS}/visp-build/example/servo-franka/data-calib/
It records the following outputs in "${VISP_WS}/visp-build/example/servo-franka/data-calib/" folder:
franka_camera.xml: XML file that contains the intrinsic camera parameters extracted from camera firmware
couples of franka_image-<number>.png + franka_pose_rPe-<number>.yaml with number starting from 1. franka_pose_rPe-<number>.yaml contains the pose of the end-effector expressed in the robot reference frame, while franka_image-<number>.png is the image captured at the corresponding robot position.
Given the camera intrinsic parameters estimated previously and the set of images, now compute the tag poses by running (adapt the tag size parameter to your use case):
$ cd ${VISP_WS}/visp-build/example/servo-franka/data-calib/
Extrinsic parameters will be saved in "${VISP_WS}/visp-build/example/servo-franka/data-calib/" folder where you will find
franka_rPc.yaml file that contains the transformation between the robot reference and the camera frames
franka_ePo.yaml file that contains the transformation between the end-effector and the Apriltag frames.
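As a side note, a pose stored in one of these YAML files can typically be read back as a vpPoseVector and converted to an homogeneous matrix; a minimal sketch, assuming the pose layout produced by the calibration tools:
#include <iostream>
#include <visp3/core/vpHomogeneousMatrix.h>
#include <visp3/core/vpPoseVector.h>

vpPoseVector r_P_c; // Translation + axis-angle rotation
if (vpPoseVector::loadYAML("data-calib/franka_rPc.yaml", r_P_c)) {
  vpHomogeneousMatrix r_M_c(r_P_c); // Camera frame expressed in the robot reference frame
  std::cout << "r_M_c:\n" << r_M_c << std::endl;
}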
2.4. Learn desired visual features
To ensure that the visual servo loop converges to a desired position within the robot's joint limits, the desired visual features are learned. More precisely, for each tag corner we will save its normalized coordinates $(x, y)$ and its depth $Z$. Since we are using a 0.048 m wide Apriltag, we will run the servoFrankaIBVS-EyeToHand-Lcur_cVe_eJe binary corresponding to the servoFrankaIBVS-EyeToHand-Lcur_cVe_eJe.cpp source code.
Note
You can also use the servoFrankaIBVS-EyeToHand-Ldes_cVf_fVe_eJe binary corresponding to the servoFrankaIBVS-EyeToHand-Ldes_cVf_fVe_eJe.cpp source code with the same command line options.
Warning
In our case, we are using an Apriltag 0.048 m wide. Adapt the value of the "--tag-size" parameter to the dimensions of your tag.
$ cd ${VISP_WS}/visp-build/example/servo-franka/
$ ./servoFrankaIBVS-EyeToHand-Lcur_cVe_eJe \
--learn-desired-features \
--intrinsic data-calib/franka_camera_visp.xml \
--camera-name Camera \
--tag-size 0.048
It will produce the following output:
Parameters:
Apriltag
Size [m] : 0.048
Z aligned : false
Camera intrinsics
Factory parameters : no
Param file name [.xml]: data-calib/franka_camera.xml
Camera name : Camera
Found camera with name: "Camera"
Camera parameters used to compute the pose:
Camera parameters for perspective projection with distortion:
px = 601.0670294 py = 602.4186662
u0 = 321.3921431 v0 = 236.8025954
kud = 0.05663697965
kdu = -0.05584121348
Move the robot arm to a desired position in which the Apriltag is visible, like in the following image.
Save the corresponding visual features with a left-click in the image, and then use a right-click to quit the app.
Desired visual features saved in: learned_desired_features.txt
The content of the "learned_desired_features.txt" file looks like the following, in which we have the $(x, y, Z)$ values of each tag corner.
$ cat learned_desired_features.txt
0.243153 -0.152012 0.155126
-0.0636572 -0.146852 0.158983
-0.0669343 0.158418 0.153581
0.250862 0.161218 0.149723
2.5. Summary of prerequisite data
At this point, in the "${VISP_WS}/visp-build/example/servo-franka/" folder you should have the following data files:
data-calib/franka_camera_visp.xml that contains the camera intrinsic parameters
data-calib/franka_rPc.yaml that contains the robot reference to camera transformation
data-calib/franka_ePo.yaml that contains the end-effector to Apriltag transformation
learned_desired_features.txt that contains the desired visual features
When running the servoFrankaIBVS-EyeToHand-Lcur_cVe_eJe binary without the --learn-desired-features option, visual servoing will start and converge to the desired features like in the following video:
Here you can notice the straight line trajectories of the image points.
The following video shows the corresponding trajectory of the robot:
Code explanation
If you take a look at the source code, you will see that we create a desired feature point vector p_d that is updated by reading the values of the four points from the "learned_desired_features.txt" file:
std::vector<vpFeaturePoint> p_d; // Desired visual features
// Sanity options check
if (desired_features_filename.empty() || (!vpIoTools::checkFilename(desired_features_filename))) {
std::cout << "Cannot start eye-to-hand visual-servoing. Desired features are not available." << std::endl;
std::cout << "use --learn-desired-features flag to learn the desired features." << std::endl;
return EXIT_FAILURE;
}
else {
if (!read_desired_features(desired_features_filename, p_d)) {
std::cout << "Error: Unable to read desired features from: " << desired_features_filename << std::endl;
return EXIT_FAILURE;
}
}
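For reference, a possible implementation of such a read_desired_features() helper is sketched below, assuming one line per tag corner holding the $x$, $y$ and $Z$ values described in section 2.4; only the vpFeaturePoint API is taken from ViSP, the rest is an assumption:
#include <fstream>
#include <string>
#include <vector>
#include <visp3/visual_features/vpFeaturePoint.h>

bool read_desired_features(const std::string &filename, std::vector<vpFeaturePoint> &p_d)
{
  std::ifstream file(filename);
  if (!file.is_open()) {
    return false;
  }
  p_d.clear();
  double x, y, Z;
  while (file >> x >> y >> Z) { // One line per tag corner: x, y, Z
    vpFeaturePoint p;
    p.buildFrom(x, y, Z); // Desired normalized coordinates and depth
    p_d.push_back(p);
  }
  return (p_d.size() == 4); // We expect the 4 Apriltag corners
}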
Then we create the servo task by adding the four current and desired visual feature points
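A sketch of what this task creation could look like is given below; the vpServo calls (setServo(), setInteractionMatrixType(), setLambda(), addFeature()) are standard ViSP API, while the variable names and the gain value are assumptions:
#include <visp3/vs/vpServo.h>

vpServo task;
// Eye-to-hand control law built from the current interaction matrix, cVe and eJe
task.setServo(vpServo::EYETOHAND_L_cVe_eJe);
task.setInteractionMatrixType(vpServo::CURRENT);
task.setLambda(0.5); // Controller gain lambda (assumed value)
for (size_t i = 0; i < p.size(); ++i) {
  task.addFeature(p[i], p_d[i]); // Current and desired visual features of each tag corner
}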
Thus, the current interaction matrix ${\bf L_s}$ will be computed at each iteration of the servo while loop. To this end we need to update $x$, $y$ and $Z$ for each of the 4 points. For updating the $Z$ coordinate of each point there is the need to compute the pose ${}^c{\bf M}_o$ of the object.
bool ret = detector.detect(I, opt_tag_size, cam, c_M_o_vec);
Here the tag pose ${}^c{\bf M}_o$ is available in c_M_o_vec[0] as a homogeneous matrix:
p[i].set_Z(c_P[2]); // Update Z value of each visual feature thanks to the tag pose
}
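For completeness, the whole per-iteration feature update could look like the following sketch; here the std::vector<vpPoint> named point, holding the 3D corner coordinates expressed in the tag frame, is an assumption, while getPolygon(), convertPoint(), changeFrame() and buildFrom() are standard ViSP calls:
std::vector<vpImagePoint> corners = detector.getPolygon(0); // 2D corners of the first detected tag
for (size_t i = 0; i < p.size(); ++i) {
  double x = 0, y = 0;
  vpPixelMeterConversion::convertPoint(cam, corners[i], x, y); // Pixel -> normalized coordinates
  point[i].changeFrame(c_M_o_vec[0]); // Express the 3D corner in the camera frame
  p[i].buildFrom(x, y, point[i].get_Z()); // Update x, y and Z of the current visual feature
}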
This pose is here also used to compute ${}^c{\bf M}_e = {}^c{\bf M}_o \, ({}^e{\bf M}_o)^{-1}$, where the constant homogeneous matrix ${}^e{\bf M}_o$ is given by extrinsic calibration and read thanks to the "--eMo" command line option. From ${}^c{\bf M}_e$ the velocity twist matrix ${}^c{\bf V}_e$ used in the control law is then updated at each iteration of the servo loop.
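The corresponding update of the twist matrix and of the robot Jacobian in the servo loop could be sketched as follows; variable names e_M_o, robot and task are assumptions, while set_cVe(), set_eJe(), computeControlLaw() and setVelocity() belong to the ViSP API:
vpHomogeneousMatrix c_M_e = c_M_o_vec[0] * e_M_o.inverse(); // cMe = cMo * (eMo)^-1
task.set_cVe(vpVelocityTwistMatrix(c_M_e)); // Update cVe from the current tag pose

vpMatrix e_J_e;
robot.get_eJe(e_J_e); // Robot Jacobian expressed in the end-effector frame
task.set_eJe(e_J_e);

vpColVector q_dot = task.computeControlLaw(); // Joint velocities to apply
robot.setVelocity(vpRobot::JOINT_STATE, q_dot);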
Visual servoing will start and converge to the desired features like in the following video:
Here you can also notice the straight line trajectories of the image points.
3.3. Using Ldes_cVf_fVe_eJe controller
To use instead the "Ldes_cVf_fVe_eJe" controller implemented in the servoFrankaIBVS-EyeToHand-Ldes_cVf_fVe_eJe binary, you may run:
$ cd ${VISP_WS}/visp-build/example/servo-franka/
$ ./servoFrankaIBVS-EyeToHand-Ldes_cVf_fVe_eJe \
--ip 192.168.30.10 \
--intrinsic data-calib/franka_camera_visp.xml \
--rMc data-calib/franka_rPc.yaml \
--camera-name Camera \
--tag-size 0.048 \
--no-convergence-threshold
As explained in 1. Introduction, with this controller you don't need to estimate the pose ${}^c{\bf M}_o$ of the object. That's why in the source code we use the interaction matrix computed at the desired position ${\bf L_{s^*}}$, as sketched below.
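Here is a minimal sketch of the corresponding task configuration; the vpServo and vpRobotFranka calls are standard ViSP API, while the variable names (task, robot, and f_M_c read from franka_rPc.yaml) are assumptions:
// Control law built from the desired interaction matrix, cVf, fVe and eJe
task.setServo(vpServo::EYETOHAND_L_cVf_fVe_eJe);
task.setInteractionMatrixType(vpServo::DESIRED);

// Constant extrinsics: cVf is built from fMc obtained by calibration
task.set_cVf(vpVelocityTwistMatrix(f_M_c.inverse()));

// fVe is updated in the servo loop from the end-effector pose given by the robot odometry
vpPoseVector f_P_e;
robot.getPosition(vpRobot::END_EFFECTOR_FRAME, f_P_e);
task.set_fVe(vpVelocityTwistMatrix(vpHomogeneousMatrix(f_P_e)));

vpMatrix e_J_e;
robot.get_eJe(e_J_e);
task.set_eJe(e_J_e);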
Visual servoing will start and converge to the desired features like in the following video. As the interaction matrix is computed with the desired visual features, you may notice that the trajectories of the points in the image are no longer straight lines.
Note that with this controller, there's no guarantee that the 4 points used as visual features will remain visible in the image.