
I have a robot with an Intel RealSense D435 (a depth camera) and want to estimate the position of a detected object. However, the camera is rotated at a 30° angle. When I detect an object, I can calculate the distance from the camera to the object with this math:

def findPixelCoords(self, x: float, y: float, d: Image) -> Point:
    coords: Point = Point()

    invFx = 1 / self.Fx
    invFy = 1 / self.Fy

    # depth image is in millimetres; convert to metres
    coords.x = d[y][x] * 0.001
    # 0.037 m is the offset between the depth module and the centre of the camera
    coords.y = ((x - self.Cx) * coords.x * invFx) + 0.037
    coords.z = (y - self.Cy) * coords.x * invFy

    return coords
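For reference, the same back-projection can be written as a standalone function. This is a sketch: the intrinsics (`fx`, `fy`, `cx`, `cy`) and the flat depth image below are made-up illustration values, not the D435's actual calibration.

```python
import numpy as np

def find_pixel_coords(x, y, depth, fx, fy, cx, cy, baseline=0.037):
    """Back-project pixel (x, y) using a depth image in millimetres.
    Axes follow the question: X forward, Y horizontal, Z vertical."""
    X = depth[y][x] * 0.001            # depth in metres
    Y = (x - cx) * X / fx + baseline   # shift by the depth-module offset
    Z = (y - cy) * X / fy
    return np.array([X, Y, Z])

# hypothetical intrinsics and a flat 1.5 m scene, for illustration only
fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0
depth = [[1500] * 640 for _ in range(480)]
p = find_pixel_coords(320, 240, depth, fx, fy, cx, cy)
# at the principal point: X = 1.5 m, Y = baseline, Z = 0
```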

Where x is the depth/forward axis, y the horizontal axis, and z the height. The problem is that the camera reports coordinates as if it were level and the world were rotated, like here: (see image)

So we want to rotate the x and z coordinates (not the y value, because the rotation is about the y axis) to get the real-world x and z coordinates, with this math:

newX = oldX * cos(theta) + oldZ * sin(theta)
newZ = -oldX * sin(theta) + oldZ * cos(theta)

Where oldX and oldZ are the original x and z values and theta is the camera rotation in radians. We get this: (see image)
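The two formulas above are exactly a 2×2 rotation matrix applied to the (x, z) pair; writing them that way generalises more cleanly. A minimal sketch with NumPy:

```python
import numpy as np

def rotate_xz(x, z, theta):
    """Rotate (x, z) by theta radians, matching
    newX =  x*cos(theta) + z*sin(theta)
    newZ = -x*sin(theta) + z*cos(theta)."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[ c, s],
                  [-s, c]])
    return R @ np.array([x, z])

theta = np.radians(30)              # the camera's 30-degree tilt
new_x, new_z = rotate_xz(2.0, 0.0, theta)
```

A point 2 m straight ahead on the optical axis (z = 0 in the camera frame) maps to `2*cos(theta)` forward and `-2*sin(theta)` vertically, which matches the hand-written formulas term for term.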

With A being the camera (at the origin), B the unrotated position of the object, and B' the rotated position. The thing is, the x value of B' is correct, but because z (which in this Cartesian plane is the y value) depends on the x value, the math says a close object is high up and a far object is very low, even though the heights of the camera and the object are constant. So we have this situation: (see image)

Where we know all the values except newZ. I tried Pythagoras, but the value it gives is absurd: the height (in the case I was testing) is about 37 cm, and it gives me about 60 cm. I also can't just say the z value is 37 cm, because the object can be at various different heights.
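As a sanity check on the rotation itself: if the rotation is applied about the camera's own origin, the recovered height should come out constant for floor points at any distance. The sketch below uses made-up numbers (camera mounted 0.5 m up, tilted 30° down) and simulates what the camera would measure for floor points, then applies the question's formulas. If real data does not behave like this, the sign of theta or the axis convention is a likely culprit.

```python
import math

theta = math.radians(30)   # downward tilt (assumed sign convention)
cam_h = 0.5                # hypothetical mounting height in metres

def to_world(old_x, old_z, theta):
    """The question's rotation: camera frame -> world frame."""
    new_x =  old_x * math.cos(theta) + old_z * math.sin(theta)
    new_z = -old_x * math.sin(theta) + old_z * math.cos(theta)
    return new_x, new_z

# simulate floor points at several ground distances from the camera base
for g in (0.5, 1.0, 3.0):
    # camera-frame coords of a floor point (inverse of the rotation above)
    old_x = g * math.cos(theta) + cam_h * math.sin(theta)
    old_z = g * math.sin(theta) - cam_h * math.cos(theta)
    new_x, new_z = to_world(old_x, old_z, theta)
    # new_z evaluates to -cam_h for every distance g,
    # so height above ground = cam_h + new_z = 0, as expected
```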

Comments:
  • In general, you should look for a textbook or course on computer vision. Everyone uses matrices to express all the transformations, because that is infinitely more convenient than handling the coordinates individually and "writing it out" as you do. Commented Nov 3, 2024 at 22:52
  • I'm not sure I understand the geometric model. Your first hand drawing seems to suggest that the object is always on the central axis of the camera (which is x axis in your setting) and parallel to the non-rotated x and z axes, see this picture which attempts to draw the same thing in the "absolute" reference frame, not that of the camera. But that can't be the case, because in that drawing (and mine) the center of rotation is not the position of the camera, but the center of the object, and then what are we computing? Please enlighten us. Commented Nov 4, 2024 at 2:19
