From P we can decompose to get K, R, t using RQ Decomposition.
After solving the DLT system
Step 1: Split P into
Write the
where:
is the left block is the last column ( sorry for the python notation 😭)
From the camera model:
So if we can factor
Step 2: Fix the Scale of P
P is only defined up to scale.
Multiply P by any nonzero scalar and it still represents the same camera.
You usually pick a convention like:
- or
- or normalize
Pick one normalization and stick with it. STICK WITH IT.
Step 3: RQ Decomposition of M
We want:
with:
upper triangular orthonormal (rotation matrix)
This is exactly what RQ decomposition gives you. ( note: not QR decomposition )
So you compute:
Now you have initial versions of the intrinsic matrix (K) and rotation matrix (R).
Step 4: Fix Camera Conventions
Raw RQ output may be awkward. Fix it.
1. Make diagonal entries of K positive
If some diagonal
- multiply column
of by - multiply row
of by
This preserves
2. Ensure
If
- flip the sign of one column of
and corresponding row of
Now:
has positive focal lengths is a proper rotation
All good.
Step 5: Recover Translation t
We know:
Thus:
Now you have the translation vector.
Step 6: (Optional) Compute Camera Center C
The camera center
Assuming
And note:
This gives a sanity check between
In short:
Given
- Split:
- Normalize P (fix scale).
- RQ-decompose
. - Fix signs so diagonal of
is positive and . - Compute translation:
- (Optional) Compute camera center:
Code:
import numpy as np
from scipy.linalg import rq
def decompose_projection_matrix(P):
"""
Decomposes P = K [R | t] using RQ decomposition.
Args:
P: 3x4 Projection Matrix
Returns:
K: 3x3 Intrinsic Matrix
R: 3x3 Rotation Matrix
t: 3x1 Translation Vector
C: 3x1 Camera Center in World Coordinates
"""
# 1. Split P into M (3x3) and p4 (3x1)
M = P[:, 0:3]
p4 = P[:, 3]
# 2. RQ Decomposition of M
# scipy.linalg.rq returns K, R such that M = K @ R
# K is upper triangular, R is orthogonal
K, R = rq(M)
# 3. Fix Signs (Enforce positive diagonal for K)
# The decomposition is not unique. We need K[i,i] > 0.
# If K[i,i] < 0, we negate the i-th column of K and i-th row of R.
T = np.diag(np.sign(np.diag(K)))
K = K @ T
R = T @ R
# 4. Enforce det(R) = 1 (Proper Rotation)
# If det(R) = -1, it's a reflection. Negate the whole matrix?
# Actually, standard RQ usually handles this, but strictly:
# If det(R) < 0, we might have a coordinate system flip.
# Usually scaling P by -1 fixes this if P was homogeneous.
# Normalize K so K[2,2] = 1
K = K / K[2, 2]
# 5. Recover Translation t
# p4 = K @ t => t = inv(K) @ p4
t = np.linalg.inv(K) @ p4
# 6. Recover Camera Center C
# C = -inv(M) @ p4
C = -np.linalg.inv(M) @ p4
return K, R, t, C
if __name__ == "__main__":
# Ground Truth
K_true = np.array([[1000, 0, 320], [0, 1000, 240], [0, 0, 1]])
# Rotation 90 deg around Z
R_true = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]])
# Translation
t_true = np.array([10, 20, 5])
# Construct P
P_true = K_true @ np.column_stack((R_true, t_true))
print("Ground Truth P:\n", P_true)
# Decompose
K_est, R_est, t_est, C_est = decompose_projection_matrix(P_true)
print("Estimated Parameters")
print("K:\n", np.round(K_est, 2))
print("R:\n", np.round(R_est, 2))
print("t:", np.round(t_est, 2))
print("Camera Center C:", np.round(C_est, 2))
# Sanity Check: Does K[R|t] reconstruct P?
P_reconst = K_est @ np.column_stack((R_est, t_est))
print("\nReconstructed P:\n", np.round(P_reconst, 2))