2D image point to 3D

Question

Hey!

I have a setup with a bottom camera (a simple webcam pointed at the ground). Along with the camera is an IMU, which means that I have the attitude of the camera. I also have an estimate of the distance from the camera to the ground.

Since I'm detecting some targets on the ground (I'm able to detect them and get their position in the image using OpenCV), I would like to extract their position in the world frame. However, I'm lost on how to do it.

Is there a ROS package that implements this? How should I do it?

Thanks in advance!

Originally posted by nwanda on ROS Answers with karma: 47 on 2015-07-20

Post score: 2

Airuno2L · Accepted Answer · 2015-07-21 07:21:08Z

It sounds like you have the 3D point that describes the location of the camera, and since you have the IMU with it you know its orientation as well. Since you also know the estimated location of the ground plane, this can be treated as a ray-plane intersection problem, which is a common problem in computer graphics that solves for the x,y,z intersection point of a ray and a plane.

The line that starts from the camera and points through the ground is the ray, and the ground of course is the plane. Since the object you're selecting/detecting from the camera image is not always in the center pixel, you'll need to add a pan and tilt angle to the ray depending on which pixel the center of the object corresponds to. The image_geometry package has tools to help do just that.

As mig mentioned, you'll want to use tf to help keep track of all the transformation frames, and a urdf can make this even easier.

Once you know the ray, google "line-plane intersection". There is even a Wikipedia article about it to get started. You might even get lucky searching for "how to do line-plane intersection in c++" or python or however you want to do it, and you might find some ready to use code.
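For reference, here is a minimal sketch of the ray-plane intersection itself, using plain NumPy. The function name `intersect_ray_plane` and the example numbers are made up for illustration; they do not come from any ROS package.

```python
import numpy as np

def intersect_ray_plane(ray_origin, ray_dir, plane_point, plane_normal, eps=1e-9):
    """Return the 3D point where a ray hits a plane, or None if it never does.

    Hypothetical helper for illustration only.
    """
    ray_origin = np.asarray(ray_origin, dtype=float)
    ray_dir = np.asarray(ray_dir, dtype=float)
    plane_point = np.asarray(plane_point, dtype=float)
    plane_normal = np.asarray(plane_normal, dtype=float)

    denom = np.dot(plane_normal, ray_dir)
    if abs(denom) < eps:
        return None  # ray is parallel to the plane
    # Distance along the ray (in units of ray_dir) to the plane
    t = np.dot(plane_normal, plane_point - ray_origin) / denom
    if t < 0:
        return None  # plane lies behind the ray origin
    return ray_origin + t * ray_dir

# Assumed example: camera 2 m above the ground plane z = 0,
# ray pointing down and slightly forward along x.
hit = intersect_ray_plane([0, 0, 2], [0.5, 0, -1], [0, 0, 0], [0, 0, 1])
print(hit)
```

The ray direction does not need to be normalized here, since `t` scales with whatever length `ray_dir` has.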

Originally posted by Airuno2L with karma: 3460 on 2015-07-21

This answer was ACCEPTED on the original site

Post score: 3

Original comments

Comment by mgruhler on 2015-07-21:
good points there!

Comment by nwanda on 2015-07-22:
That's actually what I had in mind, using the plane z=0 (the ground). However I'm unsure how to obtain the ray, since I only have a 3x3 matrix of intrinsic parameters (camera matrix). And how should I get the scale factor? I'm a little bit lost.
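A rough sketch of how a ray can be obtained from the 3x3 camera matrix alone: back-project the pixel with the inverse of K. This assumes the pixel is already undistorted (lens distortion is ignored) and that the camera frame follows the usual optical convention (z forward, x right, y down). The values in `K` and the helper name `pixel_to_ray` are invented for the example. The scale factor is not needed at this stage; it drops out once the ray is intersected with the ground plane.

```python
import numpy as np

def pixel_to_ray(K, u, v):
    """Unit direction, in the camera optical frame, of the ray through pixel (u, v).

    Hypothetical helper; assumes an undistorted (rectified) pixel.
    """
    K = np.asarray(K, dtype=float)
    # Solve K * ray = [u, v, 1]^T instead of forming the inverse explicitly
    ray = np.linalg.solve(K, np.array([u, v, 1.0]))
    return ray / np.linalg.norm(ray)

# Made-up intrinsics: fx = fy = 500 px, principal point at (320, 240)
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

center_ray = pixel_to_ray(K, 320, 240)  # pixel at the principal point
offset_ray = pixel_to_ray(K, 420, 240)  # 100 px to the right of it
print(center_ray, offset_ray)
```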

Comment by nwanda on 2015-07-22:
After reading about image_geometry, I think I know how to proceed. Using projectPixelTo3dRay I'm able to obtain the ray, and after the intersection with the ground plane I should be getting an x,y,z for the target. Right? Btw, which library is usually used for linear algebra computation in ROS?
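The steps described here could be sketched roughly as follows, without any ROS dependency. This assumes the rotation `R_world_cam` (camera optical frame to world frame) is already available, for example from the IMU attitude combined with the fixed camera-IMU mounting, and that the ground is the plane z = 0. The back-projection with K stands in for projectPixelTo3dRay; all names and numbers are illustrative.

```python
import numpy as np

def pixel_to_ground(K, R_world_cam, cam_pos_world, u, v):
    """Project pixel (u, v) onto the ground plane z = 0 of the world frame.

    Hypothetical helper; assumes an undistorted pixel and a known camera pose.
    """
    # Ray through the pixel, expressed in the camera optical frame
    ray_cam = np.linalg.solve(np.asarray(K, dtype=float), np.array([u, v, 1.0]))
    # Same ray expressed in the world frame
    ray_world = np.asarray(R_world_cam, dtype=float) @ ray_cam
    cam_pos_world = np.asarray(cam_pos_world, dtype=float)
    if ray_world[2] >= 0:
        return None  # ray points at or above the horizon, never reaches the ground
    # Scale factor that brings the ray down to z = 0
    t = -cam_pos_world[2] / ray_world[2]
    return cam_pos_world + t * ray_world

# Made-up intrinsics and pose: camera 2 m above the origin, looking straight down
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R_world_cam = np.array([[1.0, 0.0, 0.0],
                        [0.0, -1.0, 0.0],
                        [0.0, 0.0, -1.0]])  # optical z axis maps to world -z

target = pixel_to_ground(K, R_world_cam, [0.0, 0.0, 2.0], 420, 240)
print(target)
```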

Comment by Airuno2L on 2015-07-22:
That's exactly right. The tf library has some built in tools for linear algebra. Here is a small tutorial that shows python use. TF actually uses transformations.py

Comment by Airuno2L on 2015-07-22:
I'm not sure what people do when using C++. Looking at that tutorial, tf::Transform is just a btTransform from Bullet, but there is nothing stopping you from using other ways. Eigen is good too.

Comment by nwanda on 2015-07-22:
Thanks for all the help! I will look into the Eigen library.

Matthias · Answer · 2015-07-21 01:07:08Z

You're saying you have the position of the camera in 3D already, right?

Then, you would have to write this yourself, but this is fairly easy with the tf library (best read through the documentation and the tutorials). However, you need to have your camera and IMU set up correctly in the urdf. With the position of the target in the camera frame (i.e. x,y,z with respect to the camera), you can call transformPoint (see documentation here) to transform the point from the camera frame into any other frame you have available (e.g. the base_link frame of your robot or any map or world frame you have set up).
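To illustrate what such a frame transformation does mathematically, here is a small NumPy sketch with a 4x4 homogeneous transform. It is not the tf API: `make_transform` and `transform_point` are hypothetical helpers, and the camera pose is made up. In a real setup tf would supply the transform between the frames.

```python
import numpy as np

def make_transform(R, t):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = np.asarray(R, dtype=float)
    T[:3, 3] = np.asarray(t, dtype=float)
    return T

def transform_point(T_target_source, p_source):
    """Express a point given in the source frame in the target frame."""
    p = np.append(np.asarray(p_source, dtype=float), 1.0)  # homogeneous coordinates
    return (T_target_source @ p)[:3]

# Assumed pose: camera at (1, 2, 2) in the world frame, looking straight down
R_world_cam = np.array([[1.0, 0.0, 0.0],
                        [0.0, -1.0, 0.0],
                        [0.0, 0.0, -1.0]])
T_world_cam = make_transform(R_world_cam, [1.0, 2.0, 2.0])

# Target 2 m in front of the camera and 0.4 m to its right (camera frame)
p_world = transform_point(T_world_cam, [0.4, 0.0, 2.0])
print(p_world)
```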

Originally posted by mgruhler with karma: 12390 on 2015-07-21

This answer was NOT ACCEPTED on the original site

Post score: 1

Original comments

Comment by nwanda on 2015-07-21:
Do I need to use the urdf model? Is it not enough to define a static tf between the camera and IMU? How should I get the x,y,z with respect to the camera? I only have the x,y in the image, and camera_calibration only outputs the intrinsic parameters (from what I can tell).

Comment by mgruhler on 2015-07-21:
Static tf is fine. I just assumed you'd have a robot... You say you have an estimate of the distance from the camera to the ground --> z. Otherwise, from a monocular camera you cannot tell the distance to an object (unless you know the exact parameters of the object and estimate the distance from them).

Comment by Jägermeister on 2019-02-08:
I couldn't find any relevant tutorial to this question over there, could you pinpoint exactly which one is referring to 2d-3d coordinate conversion?
