
Triangulation (computer vision)

In computer vision, triangulation refers to the process of determining a point in 3D space given its projections onto two or more images. To solve this problem it is necessary to know the parameters of the camera projection function from 3D to 2D for the cameras involved, in the simplest case represented by the camera matrices. Triangulation is sometimes also referred to as reconstruction.

The triangulation problem is in theory trivial. Each point in an image corresponds to a line in 3D space, such that all points on that line are projected to the same image point. If a pair of corresponding points in two or more images can be found, they must be the projections of a common 3D point x. The set of lines generated by the image points must intersect at x, and the coordinates of x can be computed algebraically in a variety of ways, as presented below.

In practice, however, the coordinates of image points cannot be measured with arbitrary accuracy. Various types of noise, such as geometric noise from lens distortion or interest-point detection error, lead to inaccuracies in the measured image coordinates. As a consequence, the lines generated by the corresponding image points do not always intersect in 3D space. The problem, then, is to find the 3D point which optimally fits the measured image points. The literature contains multiple proposals for how to define optimality and how to find the optimal 3D point.
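One common algebraic way to compute the coordinates of x is the linear (DLT) method: each image point contributes two linear constraints on the homogeneous 3D point, and the stacked system is solved by SVD. The sketch below is a minimal illustration using numpy; the camera matrices, point, and function name are hypothetical choices, not taken from the text.

```python
import numpy as np

def triangulate_dlt(P1, P2, y1, y2):
    """Linear (DLT) triangulation from two views.

    Each image point y = (u, v) with camera matrix P gives two linear
    constraints on the homogeneous 3D point X:
        u * (P[2] @ X) - P[0] @ X = 0
        v * (P[2] @ X) - P[1] @ X = 0
    The least-squares solution of the stacked 4x4 system is the right
    singular vector of A with the smallest singular value.
    """
    A = np.array([
        y1[0] * P1[2] - P1[0],
        y1[1] * P1[2] - P1[1],
        y2[0] * P2[2] - P2[0],
        y2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]          # dehomogenize

# Two hypothetical calibrated cameras: one at the origin, one translated
# by one unit along the x-axis, both with identity rotation.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

# Project a known scene point to generate noise-free image points.
X_true = np.array([0.5, 0.2, 4.0])
y1 = P1 @ np.append(X_true, 1.0); y1 = y1[:2] / y1[2]
y2 = P2 @ np.append(X_true, 1.0); y2 = y2[:2] / y2[2]

print(triangulate_dlt(P1, P2, y1, y2))   # recovers X_true for exact data
```

With noise-free correspondences the system has an exact null vector, so the reconstruction is exact; with noisy points the SVD returns the algebraic least-squares estimate instead.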
Since they are based on different optimality criteria, the various methods produce different estimates of the 3D point x when noise is involved. In the following, it is assumed that triangulation is made on corresponding image points from two views generated by pinhole cameras.

The figure on the left illustrates the epipolar geometry of a pair of pinhole stereo cameras. A point x in 3D space is projected onto each image plane along a line (green) which goes through the camera's focal point, O_1 or O_2, resulting in the two corresponding image points y_1 and y_2. If y_1 and y_2 are given and the geometry of the two cameras is known, the two projection lines (green) can be determined, and they must intersect at x. Using basic linear algebra, that intersection point can be determined in a straightforward way.

The figure on the right shows the real case. The positions of the image points y_1 and y_2 cannot be measured exactly, owing to a combination of factors such as lens distortion and interest-point detection error. As a consequence, the measured image points are y'_1 and y'_2 instead of y_1 and y_2, and their projection lines (blue) need not intersect in 3D space or pass close to x. In fact, these lines intersect if and only if y'_1 and y'_2 satisfy the epipolar constraint defined by the fundamental matrix.
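For calibrated cameras the epipolar constraint takes the form y'_2ᵀ E y'_1 = 0, where the essential matrix E = [t]ₓ R is a special case of the fundamental matrix. The sketch below, with a hypothetical camera pair and scene point, checks the constraint for a noise-free correspondence and a perturbed one:

```python
import numpy as np

def skew(t):
    """Cross-product matrix: skew(t) @ v equals np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Hypothetical calibrated pair: camera 1 at the origin, camera 2 given by
# P2 = [R | t]. The epipolar constraint is y2^T E y1 = 0 with E = [t]_x R,
# for homogeneous (normalized) image points y1, y2.
R, t = np.eye(3), np.array([-1.0, 0.0, 0.0])
E = skew(t) @ R

X = np.array([0.5, 0.2, 4.0])            # scene point
y1 = np.append(X[:2] / X[2], 1.0)        # projection in camera 1
x2 = R @ X + t
y2 = np.append(x2[:2] / x2[2], 1.0)      # projection in camera 2

print(abs(y2 @ E @ y1))                  # ~0: rays intersect at X
y2_noisy = y2 + np.array([0.01, 0.01, 0.0])
print(abs(y2_noisy @ E @ y1))            # nonzero: rays no longer intersect
```

The nonzero residual for the perturbed point is exactly the situation described above: the two projection lines become skew, and triangulation must pick a best-fitting point instead of an intersection.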
Given the measurement noise in y'_1 and y'_2, it is rather likely that the epipolar constraint is not satisfied and the projection lines do not intersect. This observation leads to the problem solved by triangulation: which 3D point x_est is the best estimate of x, given y'_1 and y'_2 and the geometry of the cameras? The answer is often found by defining an error measure which depends on x_est and then minimizing this error. Some of the various methods for computing x_est presented in the literature are briefly described below. All triangulation methods produce x_est = x in the case that y_1 = y'_1 and y_2 = y'_2, that is, when the epipolar constraint is satisfied (except for singular points; see below). It is what happens when the constraint is not satisfied that differs between the methods.
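One simple choice of error measure leads to the classic midpoint method: take x_est to be the point minimizing the summed squared distance to the two projection lines, i.e. the midpoint of their common perpendicular. A minimal sketch, assuming known camera centres and ray directions (numpy; all names illustrative):

```python
import numpy as np

def midpoint_triangulation(o1, d1, o2, d2):
    """Midpoint method: the 3D point minimizing the summed squared
    distance to two non-parallel rays o_i + s_i * d_i."""
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    w = o2 - o1
    b = d1 @ d2                      # cosine of the angle between the rays
    denom = 1.0 - b * b              # zero only for parallel rays
    # Parameters of the closest point on each ray (normal equations of
    # the least-squares problem in s1, s2).
    s1 = ((d1 @ w) - b * (d2 @ w)) / denom
    s2 = (b * (d1 @ w) - (d2 @ w)) / denom
    return 0.5 * ((o1 + s1 * d1) + (o2 + s2 * d2))

# Hypothetical camera centres and a scene point; with noise-free rays the
# two lines intersect, and the midpoint is the intersection itself.
o1, o2 = np.zeros(3), np.array([1.0, 0.0, 0.0])
x_true = np.array([0.5, 0.2, 4.0])
print(midpoint_triangulation(o1, x_true - o1, o2, x_true - o2))
```

The midpoint method minimizes distance in 3D space rather than reprojection error in the images, which is why it generally differs from the other estimates when the epipolar constraint is violated.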

[ "Camera auto-calibration", "Computer vision", "Algebra", "Artificial intelligence" ]
Parent Topic
Child Topic
    No Parent Topic