# Multiple View Geometry in Computer Vision: Epipolar Geometry

## Two-view geometry

Topics: epipolar geometry, 3D reconstruction, F-matrix computation, structure computation.

Three questions:

(i) Correspondence geometry: Given an image point $x$ in the first view, how does this constrain the position of the corresponding point $x'$ in the second image?

(ii) Camera geometry (motion): Given a set of corresponding image points $\{x_i \leftrightarrow x'_i\}$, $i = 1, \dots, n$, what are the cameras $P$ and $P'$ for the two views?

(iii) Scene geometry (structure): Given corresponding image points $x_i \leftrightarrow x'_i$ and cameras $P$, $P'$, what is the position of (their pre-image) $X$ in space?

## The epipolar geometry

$C$, $C'$, $x$, $x'$ and $X$ are coplanar.

If we know $x$, how is the corresponding point $x'$ constrained?

$l'$ is the epipolar line corresponding to the point $x$. Upshot: if we know $C$ and $C'$, a stereo correspondence algorithm does not need to search the whole second image for $x'$, only the epipolar line $l'$.

What if only $C$, $C'$, $x$ are known?

Baseline: the line connecting the two camera centers.

Epipole: the point of intersection of the baseline with the image plane; equivalently, the image in one view of the camera center of the other view.

All points on the plane $\pi$ project onto $l$ and $l'$.

Epipolar plane: a plane containing the baseline. There is a one-parameter family, or pencil, of epipolar planes.

Epipolar line: the intersection of an epipolar plane with the image plane. All epipolar lines intersect at the epipole. An epipolar plane intersects the left and right image planes in epipolar lines, and defines the correspondence between the lines.
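
The claim that the match for $x$ must lie on one line through the epipole can be checked numerically. The following is a minimal NumPy sketch, not part of the original slides: the calibration `K`, rotation `R`, translation `t`, and the test point are hypothetical values, and the cameras are assumed to be $P = K[I \mid 0]$ and $P' = K[R \mid t]$. It projects several points of the ray through $x$ into the second view and measures how far they are from the line joining $e'$ to the first of them.

```python
import numpy as np

# Hypothetical calibration and relative pose, chosen only for illustration.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([-1.0, 0.1, 0.2])

# Second camera P' = K [R | t]; the first camera is assumed to be P = K [I | 0].
P2 = K @ np.hstack([R, t.reshape(3, 1)])

C1 = np.array([0.0, 0.0, 0.0, 1.0])   # homogeneous center of the first camera
e2 = P2 @ C1                          # epipole e': image of C in the second view

x = np.array([400.0, 300.0, 1.0])     # a point in the first image
ray_dir = np.linalg.inv(K) @ x        # direction of the ray back-projected from x
depths = [2.0, 5.0, 10.0, 50.0]
images = [P2 @ np.append(Z * ray_dir, 1.0) for Z in depths]

# Epipolar line l' through e' and the first projected point.
l2 = np.cross(e2, images[0])
l2 = l2 / np.linalg.norm(l2)

# Incidence test x'^T l' for every projected point (vectors scaled to unit norm).
residuals = [abs(l2 @ (xp / np.linalg.norm(xp))) for xp in images]
print(max(residuals))
```

The residuals stay at the level of floating-point rounding, whatever depth is chosen along the ray, which is why the search for $x'$ can be restricted to $l'$.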

Family of planes $\pi$ and lines $l$ and $l'$; the lines intersect in $e$ and $e'$.

Summary of the epipolar geometry:

Epipoles $e$, $e'$: the intersection of the baseline with the image plane; the projection of the projection center in the other image; the vanishing point of the camera motion direction.

An epipolar plane: a plane containing the baseline (1-D family).

An epipolar line: the intersection of an epipolar plane with the image (epipolar lines always come in corresponding pairs).

Example: converging cameras.

Example: motion parallel with the image plane.

## The fundamental matrix F

$F$ is a projective mapping $x \mapsto l'$ from a point $x$ in one image to its corresponding epipolar line in the other image: $l' = Fx$.

$F$ is the algebraic representation of epipolar geometry. We will see that the mapping $x \mapsto l'$ is a (singular) correlation, i.e. a projective mapping from points to lines, represented by the fundamental matrix $F$.

## Skew-symmetric matrix for a vector a

$[a]_\times$ is the skew-symmetric matrix for the vector $a$. If $a = (a_1, a_2, a_3)^T$, then

$$[a]_\times = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix}$$
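
As a quick sanity check of the matrix above, here is a small NumPy sketch. The helper name `skew` and the test vectors are assumptions made for illustration. It verifies the skew-symmetry, that multiplying by $[a]_\times$ agrees with NumPy's cross product with $a$, and that $a$ itself is mapped to zero.

```python
import numpy as np

def skew(a):
    """Return the 3x3 skew-symmetric matrix [a]_x of a 3-vector a."""
    a1, a2, a3 = a
    return np.array([[0.0, -a3, a2],
                     [a3, 0.0, -a1],
                     [-a2, a1, 0.0]])

a = np.array([1.0, 2.0, 3.0])
b = np.array([-4.0, 0.5, 2.0])
A = skew(a)

is_skew = np.allclose(A, -A.T)                    # A^T = -A
matches_cross = np.allclose(A @ b, np.cross(a, b))  # [a]_x b equals a x b
kills_a = np.allclose(A @ a, 0.0)                 # a is a null vector of [a]_x
print(is_skew, matches_cross, kills_a)
```

Because $a$ is a null vector, $[a]_\times$ is singular; this is where the rank deficiency of $F$ comes from in the derivations that use it.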

The cross product of two vectors $a$ and $b$ can be written in terms of the skew-symmetric matrix for $a$: $a \times b = [a]_\times b$.

## The fundamental matrix F: geometric derivation

Consider a plane $\pi$ not passing through either of the camera centers. The ray through $C$ corresponding to the image point $x$ meets $\pi$ in a 3D point $X$. Project $X$ to a point $x'$ in the second image; this is transfer via the plane $\pi$.

$l'$ is the epipolar line for $x$, so $x'$ must lie on $l'$. Since $l'$ passes through both $e'$ and $x'$:

$$l' = e' \times x' = [e']_\times x'$$

The point sets $x_i$ and $x'_i$ are both projectively equivalent to the planar point set $X_i$, so there is a 2D homography $H_\pi$ mapping each $x_i$ to $x'_i$.
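
The transfer step can be made concrete for one pair of cameras. This is a minimal sketch under assumptions that are not in the slides: the cameras are taken as $P = [I \mid 0]$ and $P' = [A \mid a]$ with made-up entries, and the plane is written as $\pi = (v^T, 1)^T$. It builds $H_\pi$ by intersecting rays with the plane, forms $[e']_\times H_\pi$, and checks that the result does not depend on which plane was used.

```python
import numpy as np

def skew(a):
    """Skew-symmetric matrix [a]_x, so that skew(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

# Hypothetical second camera P' = [A | a]; the first camera is P = [I | 0].
A = np.array([[0.9, 0.1, 0.0],
              [-0.1, 0.9, 0.2],
              [0.0, -0.2, 1.0]])
a = np.array([1.0, 0.5, 0.2])
e2 = a  # epipole e' = P' C with C = (0, 0, 0, 1)^T

def plane_homography(v):
    """Homography induced by the plane pi = (v^T, 1)^T.

    A point on the ray of x is X = (x^T, rho)^T. It lies on pi when
    v.x + rho = 0, so x' = P' X = A x - a (v.x) = (A - a v^T) x.
    """
    return A - np.outer(a, v)

v1 = np.array([0.1, 0.0, -0.5])
v2 = np.array([-0.3, 0.2, 1.0])
F1 = skew(e2) @ plane_homography(v1)
F2 = skew(e2) @ plane_homography(v2)

# Transfer a point via the first plane and test that x' lies on l' = F1 x.
x = np.array([0.3, -0.2, 1.0])
x_prime = plane_homography(v1) @ x
same_matrix = np.allclose(F1, F2)
incidence = abs(x_prime @ F1 @ x)
print(same_matrix, incidence)
```

The two planes give different homographies but the same matrix, because the plane-dependent part of $H_\pi$ is a multiple of $e'$ and is annihilated by $[e']_\times$.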

With $x' = H_\pi x$:

$$l' = [e']_\times x' = [e']_\times H_\pi x = Fx$$

so $F = [e']_\times H_\pi$, a mapping from a 2-D onto a 1-D family (rank 2).

## The fundamental matrix F: algebraic derivation

$P^+$ is the pseudo-inverse of $P$, so $PP^+ = I$. The ray back-projected from $x$ is

$$X(\lambda) = P^+ x + \lambda C$$

The line $l'$ joins two points, so it can be written as the cross product of those two points. The first point is $P'C$, which is the epipole $e'$. The second point is the projection $P'P^+x$ of $P^+x$ onto the second image plane:

$$l' = (P'C) \times (P'P^+x) = [e']_\times P'P^+ x$$

$$F = [e']_\times P'P^+$$

(Note: this does not work for $C = C'$, since then $P'C = 0$ and $F = 0$.)

## The fundamental matrix F: correspondence condition

The fundamental matrix satisfies the condition that for any pair of corresponding points $x \leftrightarrow x'$ in the two images

$$x'^T F x = 0$$

This follows by combining $x'^T l' = 0$ (the point $x'$ lies on $l'$) with $l' = Fx$.

Upshot: a way of characterizing the fundamental matrix without reference to camera matrices, i.e. only in terms of corresponding image points.

How many correspondences are needed to find $F$? At least 7.

## The fundamental matrix F

$F$ is the unique $3 \times 3$ rank 2 matrix that satisfies $x'^T F x = 0$ for all $x \leftrightarrow x'$.

(i) Transpose: if $F$ is the fundamental matrix for $(P, P')$, then $F^T$ is the fundamental matrix for $(P', P)$.

(ii) Epipolar lines: for any point $x$ in the first image, the corresponding epipolar line is $l' = Fx$; conversely, $l = F^T x'$ is the epipolar line corresponding to $x'$ in the second image.

(iii) Epipoles: for any point $x$, the epipolar line $l' = Fx$ contains the epipole $e'$. Thus $e'^T F x = 0$ for all $x$, so $e'^T F = 0$; similarly $Fe = 0$. $e'$ is the left null vector of $F$; $e$ is the right null vector of $F$.

(iv) $F$ has 7 d.o.f.: $3 \times 3 - 1$ (homogeneous) $- 1$ (rank 2).

(v) $F$ is a correlation, a projective mapping from a point $x$ to a line $l' = Fx$ (not a proper correlation, i.e. not invertible). If $l$ and $l'$ are corresponding epipolar lines, then any point $x$ on $l$ is mapped to the same line $l'$, so there is no inverse mapping.
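
The algebraic formula and the listed properties can be verified together on synthetic data. The sketch below is illustrative only: the cameras $P = K[I \mid 0]$, $P' = K[R \mid t]$, the numeric values, and the helper `skew` are assumptions. It computes $F = [e']_\times P'P^+$ with NumPy's pseudo-inverse and then tests the correspondence condition, the rank, and the two null vectors.

```python
import numpy as np

def skew(a):
    """Skew-symmetric matrix [a]_x, so that skew(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

# Hypothetical cameras P = K [I | 0] and P' = K [R | t], for illustration only.
K = np.array([[1.5, 0.0, 0.2],
              [0.0, 1.5, 0.1],
              [0.0, 0.0, 1.0]])
theta = np.deg2rad(15.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([-1.0, 0.2, 0.3])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t.reshape(3, 1)])

# Camera centers are the right null vectors of P and P' (last row of V^T).
C1 = np.linalg.svd(P1)[2][-1]
C2 = np.linalg.svd(P2)[2][-1]

e1 = P1 @ C2                              # epipole e in the first image
e2 = P2 @ C1                              # epipole e' in the second image
F = skew(e2) @ P2 @ np.linalg.pinv(P1)    # F = [e']_x P' P^+
F = F / np.linalg.norm(F)                 # F is only defined up to scale

# Correspondences x <-> x' from random 3D points in front of both cameras.
rng = np.random.default_rng(0)
X = np.vstack([rng.uniform(-1.0, 1.0, (2, 20)),
               rng.uniform(4.0, 8.0, (1, 20)),
               np.ones((1, 20))])
x1 = P1 @ X
x2 = P2 @ X

epipolar_residual = np.abs(np.sum(x2 * (F @ x1), axis=0)).max()  # max |x'^T F x|
rank_F = np.linalg.matrix_rank(F, tol=1e-9)
right_null = np.linalg.norm(F @ (e1 / np.linalg.norm(e1)))       # |F e|
left_null = np.linalg.norm((e2 / np.linalg.norm(e2)) @ F)        # |e'^T F|
print(epipolar_residual, rank_F, right_null, left_null)
```

The explicit tolerance passed to `matrix_rank` is a design choice: the third singular value of a numerically computed $F$ is tiny but rarely exactly zero, so the default tolerance can misreport the rank.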

## Epipolar line homography

The set of epipolar lines in each of the images forms a pencil of lines passing through the epipole. Such a pencil of lines is a 1D projective space. Epipolar lines are perspectively related: there is a homography between the pencil of epipolar lines centered at $e$ in the first view and the pencil centered at $e'$ in the second. A homography between such 1D projective spaces has 3 degrees of freedom.

Count of the degrees of freedom of the fundamental matrix: 2 for $e$, 2 for $e'$, 3 for the 1D homography, for a total of 7.

Let $l$, $l'$ be corresponding epipolar lines and $k$ a line not through $e$. The point $k \times l = [k]_\times l$ lies on $l$, so $F$ maps it to $l'$:

$$l' = F[k]_\times l$$

and symmetrically, for a line $k'$ not through $e'$,

$$l = F^T [k']_\times l'$$

One may pick $k = e$ (valid since $e^T e \neq 0$) and likewise $k' = e'$:

$$l' = F[e]_\times l, \qquad l = F^T [e']_\times l'$$

## Pure translation camera motion

Pure translation = no rotation and no change in internal parameters.

A pure translation of the camera is equivalent to the camera being stationary while the world undergoes a translation $t$. Points in 3-space move on straight lines parallel to $t$.

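## Fundamental matrix for pure translation

Example: forward motion.

$$P = K[I \mid 0], \qquad P' = K[I \mid t]$$

$$F = [e']_\times H_\infty = [e']_\times, \qquad \text{since } H_\infty = K R K^{-1} = I \text{ for } R = I$$

Example: camera translation parallel to the x-axis.

$$e' = (1, 0, 0)^T, \qquad F = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}, \qquad x'^T F x = 0 \iff y = y'$$

Image motion under pure translation, with $x = (x, y, 1)^T$:

$$x = PX = K[I \mid 0]X, \qquad (X, Y, Z)^T = Z K^{-1} x$$

$$x' = P'X = K[I \mid t]X, \qquad x' = x + Kt/Z$$

$Z$ is the depth of the point $X$, i.e. the distance of $X$ from the camera center measured along the principal axis of the first camera.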
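
The x-axis example can be reproduced in a few lines. This is a hedged sketch with hypothetical values for `K`, `t`, and the 3D points; `skew` is an illustrative helper. It builds $F = [e']_\times$ from $e' = Kt$, then checks $x' = x + Kt/Z$, the epipolar constraint, and that corresponding points share the same $y$ coordinate.

```python
import numpy as np

def skew(a):
    """Skew-symmetric matrix [a]_x, so that skew(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

# Hypothetical calibration; translation parallel to the x-axis, no rotation.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
t = np.array([0.5, 0.0, 0.0])

e2 = K @ t                           # epipole e' = K t, proportional to (1, 0, 0)^T
F = skew(e2 / np.linalg.norm(e2))    # F = [e']_x for pure translation

points = np.array([[0.3, -0.2, 4.0],
                   [1.0, 0.5, 2.0],
                   [-0.7, 0.1, 9.0]])

transfer_errors, epipolar_residuals, row_shifts = [], [], []
for X in points:
    Z = X[2]
    x = K @ X / Z                    # x = K [I | 0] X, scaled so that x = (x, y, 1)^T
    xp = K @ (X + t)                 # x' = K [I | t] X
    xp = xp / xp[2]
    predicted = x + K @ t / Z        # x' = x + K t / Z
    predicted = predicted / predicted[2]
    transfer_errors.append(np.linalg.norm(xp - predicted))
    epipolar_residuals.append(abs(xp @ F @ x))
    row_shifts.append(abs(xp[1] - x[1]))   # |y' - y|

print(F)
print(max(transfer_errors), max(epipolar_residuals), max(row_shifts))
```

The horizontal shift of each point is $800 \cdot 0.5 / Z$ pixels in this setup, so the nearest point moves the most, which is the depth dependence discussed next.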

The image point starts at $x$ and moves along the line defined by $x$ and the epipole $e$. The extent of the motion depends on the magnitude of $t$ and is inversely proportional to $Z$: points closer to the camera appear to move faster than those further away, as when looking out of a train window.

Pure translation: $F$ has only 2 d.o.f. Since $F = [e]_\times$ is skew-symmetric, $x^T [e]_\times x = 0$ for every $x$, so each point lies on its own epipolar line (auto-epipolar).

## General motion

$$P = K[I \mid 0], \qquad P' = K'[R \mid t]$$

Rotate the first camera so that it is aligned with the second camera; this is a projective transformation of the first image. Apply a further correction to the first image to account for the difference in calibration matrices. The result of these two corrections is a projective transformation $H$ of the first image, $H = K' R K^{-1}$. The effective relation between the two images is then a pure translation:

$$x'^T [e']_\times H x = 0$$

$$x'^T [e']_\times \hat{x} = 0, \qquad \hat{x} = Hx$$

$$x' = K' R K^{-1} x + K' t / Z$$

The first term depends on the image position $x$ but not on the point's depth $Z$, and takes account of the camera rotation $R$ and the change of internal parameters.

The second term depends on the depth $Z$ but not on the image position $x$, and takes account of the translation.

## Projective transformation and invariance

$F$ is invariant to transformations of projective 3-space: replacing $(P, P')$ by $(PH, P'H)$ and $X$ by $H^{-1}X$ leaves the image points, and hence $F$, unchanged. Consequence: $F$ does not depend on the choice of world frame.

$(P, P') \mapsto F$ is unique; $F \mapsto (P, P')$ is not unique.

Canonical form for a pair of camera matrices for a given $F$: choose the first camera as $P = [I \mid 0]$. Then

$$P = [I \mid 0], \quad P' = [M \mid m] \quad \Rightarrow \quad F = [m]_\times M$$

## Projective ambiguity of cameras given F

Previous slide: at least a projective ambiguity. This slide: no more than a projective ambiguity.

Show that if $F$ is the same for $(P, P')$ and $(\tilde{P}, \tilde{P}')$, then there exists a projective transformation $H$ such that $\tilde{P} = PH$ and $\tilde{P}' = P'H$.

Upshot: a given fundamental matrix determines the pair of camera matrices up to right multiplication by a projective transformation.

$P$ and $P'$ each have 11 d.o.f. ($3 \times 4 - 1 = 11$), for a total of 22. The projective world frame $H$ has $4 \times 4 - 1 = 15$ d.o.f. Removing the d.o.f. of the world frame from the two cameras leaves $22 - 15 = 7$, the d.o.f. of the fundamental matrix.

## Canonical cameras given F

$F$ corresponds to the pair $(P, P')$ if and only if $P'^T F P$ is skew-symmetric, i.e.

$$X^T P'^T F P X = 0, \quad \forall X$$

Possible choice: $P = [I \mid 0]$, $P' = [SF \mid e']$ with $S = [e']_\times$, i.e. $P' = [[e']_\times F \mid e']$.
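
The canonical choice can be tested by a round trip from $F$ to cameras and back. The sketch below rests on assumptions for illustration: the matrix `M`, the vector `m`, and the helper `skew` are hypothetical, and $F$ is synthesized as $[m]_\times M$ so that it has rank 2. It extracts $e'$ as the left null vector of $F$, builds $P = [I \mid 0]$ and $P' = [[e']_\times F \mid e']$, recomputes the fundamental matrix of that pair, and checks the skew-symmetry criterion.

```python
import numpy as np

def skew(a):
    """Skew-symmetric matrix [a]_x, so that skew(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

# A hypothetical rank-2 fundamental matrix, synthesized as F = [m]_x M.
M = np.array([[1.0, 0.2, -0.1],
              [0.0, 0.9, 0.3],
              [0.1, -0.2, 1.1]])
m = np.array([0.4, -0.3, 1.0])
F = skew(m) @ M

# Left null vector e' (e'^T F = 0): last column of U in the SVD of F.
U, s, Vt = np.linalg.svd(F)
e2 = U[:, -1]

# Canonical pair P = [I | 0], P' = [[e']_x F | e'].
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([skew(e2) @ F, e2.reshape(3, 1)])

# Fundamental matrix of the canonical pair, using F = [m]_x M for P' = [M | m].
F_rec = skew(P2[:, 3]) @ P2[:, :3]

# F is homogeneous, so compare up to scale: |cosine| of the angle between
# the two matrices (as 9-vectors) is 1 exactly when they are proportional.
cosine = np.sum(F * F_rec) / (np.linalg.norm(F) * np.linalg.norm(F_rec))
proportional = np.isclose(abs(cosine), 1.0)

# Criterion from the text: P'^T F P should be skew-symmetric.
S = P2.T @ F @ P1
criterion_holds = np.allclose(S, -S.T)

# Degrees-of-freedom count: two cameras minus the projective world frame.
dof = 2 * (3 * 4 - 1) - (4 * 4 - 1)
print(proportional, criterion_holds, dof)
```

The recovered cameras generally differ from whatever cameras produced $F$ in the first place; only the fundamental matrix is reproduced, which is the projective ambiguity at work.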

## The essential matrix

The essential matrix is the fundamental matrix for calibrated cameras: remove $K$ to get the essential matrix.

$P = K[R \mid t]$, $x = PX$. With known $K$:

$$\hat{x} = K^{-1} x = [R \mid t] X$$

$\hat{x}$ is the image point expressed in normalized coordinates: the image of the point $X$ with respect to a camera $[R \mid t]$ having the identity $I$ as calibration matrix.

Definition:

$$E = [t]_\times R = R[R^T t]_\times$$

$$\hat{x}'^T E \hat{x} = 0, \qquad \hat{x} = K^{-1} x, \quad \hat{x}' = K'^{-1} x'$$

Relation between $E$ and $F$:

$$E = K'^T F K$$

$E$ has 5 d.o.f. (3 for $R$; 2 for $t$ up to scale).

Four reconstructions are possible from $E$, but only one of them places the points in front of both cameras.
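
The definition and the relation to $F$ can be checked on a synthetic pair of calibrated cameras. This is a minimal sketch in which `K1`, `K2`, `R`, `t`, the 3D point, and the helper `skew` are all hypothetical; the cameras are assumed to be $P = K[I \mid 0]$ and $P' = K'[R \mid t]$. It tests the two forms of $E$, the relation $E = K'^T F K$, and the constraint in normalized coordinates.

```python
import numpy as np

def skew(a):
    """Skew-symmetric matrix [a]_x, so that skew(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

# Hypothetical calibration matrices K, K' and relative pose (R, t).
K1 = np.array([[800.0, 0.0, 320.0],
               [0.0, 800.0, 240.0],
               [0.0, 0.0, 1.0]])
K2 = np.array([[700.0, 0.0, 300.0],
               [0.0, 700.0, 250.0],
               [0.0, 0.0, 1.0]])
theta = np.deg2rad(20.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([-1.0, 0.2, 0.1])

E = skew(t) @ R                                   # E = [t]_x R
two_forms_agree = np.allclose(E, R @ skew(R.T @ t))  # E = R [R^T t]_x

# F for the uncalibrated images, then back to E with E = K'^T F K.
F = np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)
relation_holds = np.allclose(K2.T @ F @ K1, E)

# One correspondence: project a 3D point, then undo the calibration.
X = np.array([0.4, -0.3, 5.0])
x = K1 @ X                        # image in the first camera K [I | 0]
xp = K2 @ (R @ X + t)             # image in the second camera K' [R | t]
x_hat = np.linalg.inv(K1) @ x     # normalized coordinates
xp_hat = np.linalg.inv(K2) @ xp

calibrated_residual = abs(xp_hat @ E @ x_hat)     # |x_hat'^T E x_hat|
pixel_residual = abs(xp @ F @ x) / (np.linalg.norm(xp) * np.linalg.norm(x))
print(two_forms_agree, relation_holds, calibrated_residual, pixel_residual)
```

Working in normalized coordinates keeps the entries of $E$ at comparable magnitudes, whereas the entries of $F$ in pixel units span several orders of magnitude; the pixel residual above is divided by the vector norms for that reason.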