Define the Problem:
Given K and set of N 3D world points to 2D pixel correspondences, find R and t.
Reprojection Error:
The optimal pose is found my minimizing reprojection error. It is the error between "observed" pixel coordinates and "predicted" pixel coordinates obtained by projecting the 3D points using the estimated pose.
where
Cost Function:
The cost function to minimize is the sum of squared reprojection errors over all N points:
where
The Optimization Problem (Known Correspondences)
This is NLS problem.
We linearize the residuals using a first-order Taylor expansion around the current estimate.
Let the update be a small perturbation in the Lie Algebra
where
The update rule is:
The Jacobian of the residual
The Interaction Matrix (Jacobian)
Using the chain rule:
- Projection Derivative (
): How pixel coordinates change with camera coordinates.
- Pose Derivative (
): How camera coordinates change with pose perturbation.
Multiplying these gives the full
(Note: The negative sign comes from
The Algorithm (Gauss-Newton)
- Initialize
(e.g., using P3P or identity). - Iterate:
a. Project 3D points using currentto get .
b. Compute residuals.
c. Compute Jacobianfor each point.
d. Stack(matrix ) and (vector ).
e. Solve normal equations:.
f. Update pose:.
g. Check convergence.
Code: Pose Estimation (PnP)
Use cv2.solvePnP. It implements exactly the algorithm above (uses Levenberg-Marquardt).
import cv2
import numpy as np
def solve_pnp_example():
# Camera Matrix (K)
# fx=800, fy=800, cx=320, cy=240
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], dtype=np.float32)
# Distortion Coefficients (Assuming zero for simplicity, but PnP handles them)
dist_coeffs = np.zeros((4, 1))
# 2. DATA (3D Points and 2D Observations)
# Define a 3D object (e.g., a rectangle in space)
# Points in Object Frame (X, Y, Z)
object_points = np.array(
[[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=np.float32
)
# Define the true pose (to generate synthetic 2D points)
# Rotate 45 deg around Y, Move to Z=10
rvec_true = np.array([0, np.pi / 4, 0], dtype=np.float32)
tvec_true = np.array([0, 0, 5], dtype=np.float32)
# Project points to create "Observations"
image_points, _ = cv2.projectPoints(
object_points, rvec_true, tvec_true, K, dist_coeffs
)
# Add a little noise to simulate real world
# Doing the bottom causes Error as image_points is of float32
# image_points = image_points + np.random.normal(0, 0.5, image_points.shape)
noise = np.random.normal(0, 0.5, image_points.shape).astype(np.float32)
image_points = image_points + noise
print("Observed Image Points:\n", image_points.reshape(-1, 2))
# 3. SOLVE PNP
# Method: cv2.SOLVEPNP_ITERATIVE (Levenberg-Marquardt optimization)
success, rvec_est, tvec_est = cv2.solvePnP(
object_points, image_points, K, dist_coeffs, flags=cv2.SOLVEPNP_ITERATIVE
)
if not success:
print("PnP solution failed.")
return
print("Estimated Pose ( R and t ) ")
print("Rotation Vector (Rodrigues):\n", rvec_est.flatten())
print("Translation Vector:\n", tvec_est.flatten())
# 4. VERIFICATION
# Calculate Reprojection Error
projected_points, _ = cv2.projectPoints(
object_points, rvec_est, tvec_est, K, dist_coeffs
)
error = cv2.norm(image_points, projected_points, cv2.NORM_L2) / len(
projected_points
)
print(f"\nReprojection Error (RMS per point): {error:.4f} pixels")
# Convert Rotation Vector to Matrix
R_est, _ = cv2.Rodrigues(rvec_est)
print("\nRotation Matrix:\n", np.round(R_est, 3))
# Print Ground Truth
print("\nGround Truth Pose ( R and t ) ")
print("Rotation Vector (Rodrigues):\n", rvec_true.flatten())
print("Translation Vector:\n", tvec_true.flatten())
# Rotation Matrix
R_true = cv2.Rodrigues(rvec_true)[0]
print("Rotation Matrix:\n", np.round(R_true, 3))
# Verify Rotation Matrix
print("\nVerification of Rotation Matrix:")
print("Estimated Rotation Matrix:\n", np.round(R_est, 3))
print("Ground Truth Rotation Matrix:\n", np.round(R_true, 3))
if __name__ == "__main__":
solve_pnp_example()