What? Why?

Epipolar geometry describes the geometric relations in image pairs. It enables faster search for prediction of corresponding points between images by reducing the search space from 2D to 1D.
Not only this but this 1D is special because in your image you might have multiple objects which are similar in nature and you can get a wrong correspondence by traversing the entire image. But with epipolar geometry you are reduced to searching in this one line which is somewhat geometrically consistent and your chances of getting wrong correspondences are reduced.

Pasted image 20251122192049.png

Terms:

In the Epipollar Plane..

Using a distortion-free lens,

Epipolar Constraint

If the point lies on the image plane, then its corresponding point must lie on the corresponding epipolar line. This is called the epipolar constraint.

Also for any point x lying on a line l, the relation xTl=0 holds true.
Therefore,

xTl=0

and exploiting the coplanarity of the points X',X''.

xTFx=0

As l=Fx, where F is the fundamental matrix.
The epipoles satisfy the relations:

eTFx=0eTFTx=0

Epipoles are the null space of the fundamental matrix.

null(FT)=enull(F)=e

They correspond to an eigenvalue of 0.

Fundamental Matrix

xFx=0

Where x and x are corresponding points in the two images.
So if we have n corresponding points, we can set up a system of linear equations to solve for the entries of F.

[x1x1x1y1x1y1x1y1y1y1x1y11x2x2x2y2x2y2x2y2y2y2x2y21........1xnxnxnynxnynxnynynynxnyn1][f11f12f13f21f22f23f31f32f33]=0Af=0

Where A is the matrix of coefficients from the corresponding points, and f is the vector of entries of F. A is of size n x 9. and it's rank is 8.
Use normalization to improve numerical stability. Solve using SVD. Enforce rank-2 constraint by setting the smallest singular value to zero.

F=UDVT

if you want rank 3:

F=U1σ1V1T+U2σ2V2T+U3σ3V3T

Enforce rank 2, or min||FF||F

F=U1σ1V1T+U2σ2V2T

Where ||.||F is the Frobenius norm.

Computation of F is done over n correspondence, to improve we can run RANSAC to filter out outliers.

Note: F is computed on the transformed image if I=TI then F=TTFT

other methods for finding F: Normalized 8-point algorithm, 7-point algorithm, 5-point algorithm.

Essential Matrix

The essential matrix E is a 3x3 rank 2 matrix that relates corresponding points in calibrated stereo images.

so $$F = K''^{-T} E K'^{-1}$$

xTEx=0

Where x and x are normalized image coordinates (after removing the effect of intrinsic parameters).

Chirality and Pose Recovery

We get 4 motion pairs from E=[t]xR

E=UΣVT,Σ=diag(1,1,0)

So 4 possible pairs would be:

( The following is not true, pls read page 277/673 in Multiple View Geometry for correct understanding ) ( It's Section 9.6.2 in the book )

R=UWVT,t=+u3R=UWVT,t=u3R=UWTVT,t=+u3R=UWTVT,t=u3

Where W=[010100001]
To determine the correct pair, we can use the chirality condition which states that the reconstructed 3D points must lie in front of both cameras. We can triangulate a point using each of the four motion pairs and check the depth values. The correct motion pair will yield positive depth values for the majority of points.

Triangulation