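CS 4501: Introduction to Computer Vision
Brief Tutorial of Linear Algebra and Transformations
Connelly Barnes
Slides from Fei Fei Li, Juan Carlos Niebles, Jason Lawrence, Szymon Rusinkiewicz, David Dobkin, Adam Finkelstein, Tom Funkhouser

Outline
Vectors and matrices
Operations
Solving linear systems
Representation in Python
Transformation matrices
2D
Homogeneous coordinates
3D
Camera transformations

Vectors
A column vector $v \in \mathbb{R}^{n \times 1}$:
\[ v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \]
A row vector $v^T \in \mathbb{R}^{1 \times n}$:
\[ v^T = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix} \]
Default to column vectors.

Matrix
A matrix $A \in \mathbb{R}^{m \times n}$:
\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \]
Convention: indexed by (row, column). Square matrix: m = n.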
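
The shape conventions above can be checked with a small NumPy sketch. The variable names `col`, `row`, and `A` are illustrative only. Note that NumPy code often stores a vector as a 1-D array instead of an explicit n x 1 column.

```python
import numpy as np

# Column vector: shape (n, 1). Its transpose is a row vector: shape (1, n).
col = np.array([[1.0], [2.0], [3.0]])
row = col.T

# A matrix with m = 2 rows and n = 3 columns.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

print(col.shape)  # (3, 1)
print(row.shape)  # (1, 3)
print(A.shape)    # (2, 3)
print(A[1, 2])    # entry at (row 1, column 2), counting from 0: 6.0
```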

Matrix/Vector Operations
Addition
Scaling
Transpose
Dot product
Multiplication
Inverse
Algebraic properties

Matrix Addition

Matrices (and vectors) add elementwise:
\[ (A + B)_{ij} = a_{ij} + b_{ij} \]

Matrix Scaling
Matrix (and vector) product with a scalar multiplies each element:
\[ (cA)_{ij} = c \, a_{ij} \]

Transpose
Transpose swaps the rows and columns of a vector or matrix:
\[ (A^T)_{ij} = a_{ji} \]

Dot Product
Inner (dot) product of two vectors multiplies corresponding entries and sums the result:
\[ x \cdot y = x^T y = \sum_{i=1}^{n} x_i y_i \]
Also written as:
\[ x \cdot y = \|x\| \, \|y\| \cos\theta \]
where $\theta$ is the shortest angle between x and y.

Matrix Multiplication
Matrix-vector product Av: take the dot product between each row of the matrix and the column vector:
\[ (Av)_i = \sum_{j} a_{ij} v_j \]
Matrix-matrix product AB: entry (i, j) is the dot product between row i of A and column j of B:
\[ (AB)_{ij} = \sum_{k} a_{ik} b_{kj} \]
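
As a minimal sketch of the dot product and the row-times-column rule described above, the following computes both by hand and compares against NumPy. The example values and variable names are arbitrary choices for illustration.

```python
import numpy as np

x = np.array([1.0, 0.0])
y = np.array([1.0, 1.0])

# Dot product: multiply corresponding entries and sum.
dot_xy = float(np.sum(x * y))

# Recover the angle from x . y = |x| |y| cos(theta).
cos_theta = dot_xy / (np.linalg.norm(x) * np.linalg.norm(y))
theta = np.arccos(cos_theta)  # pi / 4 for these two vectors

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])

# Entry (i, j) of AB is row i of A dotted with column j of B.
AB = np.zeros((A.shape[0], B.shape[1]))
for i in range(A.shape[0]):
    for j in range(B.shape[1]):
        AB[i, j] = A[i, :].dot(B[:, j])

print(np.allclose(AB, A.dot(B)))  # True
```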

If A is shape $m \times n$ and B is shape $n \times p$, then AB is shape $m \times p$. Why? Because we take each row of A dot with each column of B. (And the inner dimension must match to take the product.)
\[ \underbrace{A}_{m \times n} \, \underbrace{B}_{n \times p} = \underbrace{AB}_{m \times p} \]

Matrix Inverse
The identity matrix is a square matrix with all ones along the main diagonal:
\[ I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
For square matrices, the matrix inverse $A^{-1}$ is defined (when it exists) such that:
\[ A^{-1} A = A A^{-1} = I \]
It is beyond the scope of this course to actually compute the inverse: instead, just call a matrix inverse library routine.

Algebraic Properties
For matrices A, B, C and scalar c:
Scalar multiplication: $c(AB) = (cA)B = A(cB)$
Associative: $(AB)C = A(BC)$
Distributive: $A(B + C) = AB + AC$
Identity: $AI = IA = A$
Transpose: $(AB)^T = B^T A^T$
Inverse: $(AB)^{-1} = B^{-1} A^{-1}$
Not commutative: in general, $AB \neq BA$
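
The properties named above (associativity, distributivity, identity, transpose, inverse, and the lack of commutativity) can be spot-checked numerically. This is only a sketch on three arbitrary invertible 2 x 2 matrices, not a proof. The `@` operator is NumPy's matrix product, equivalent to `A.dot(B)`.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[0.0, 1.0], [4.0, 2.0]])
C = np.array([[1.0, 5.0], [2.0, 1.0]])
I = np.eye(2)
A_inv = np.linalg.inv(A)

checks = {
    "associative": np.allclose((A @ B) @ C, A @ (B @ C)),
    "distributive": np.allclose(A @ (B + C), A @ B + A @ C),
    "identity": np.allclose(A @ I, A) and np.allclose(I @ A, A),
    "transpose": np.allclose((A @ B).T, B.T @ A.T),
    "inverse": np.allclose(A_inv @ A, I) and np.allclose(A @ A_inv, I),
    "inverse of product": np.allclose(np.linalg.inv(A @ B),
                                      np.linalg.inv(B) @ A_inv),
    # Expected to be False: matrix multiplication is not commutative.
    "commutative": np.allclose(A @ B, B @ A),
}

for name, holds in checks.items():
    print(name, holds)
```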

Solving Linear Systems
If we have a linear system Ax = b:
\[ \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \]
By using our multiplication rule, this is a system of linear equations:
\[ a_{11} x_1 + a_{12} x_2 = b_1 \]
\[ a_{21} x_1 + a_{22} x_2 = b_2 \]
If matrix A is square and invertible, then the solution is $x = A^{-1} b$.
What to do if A is non-square or non-invertible?

Solving Linear Systems (Pseudoinverse)
If we have a linear system Ax = b and the matrix A is non-square or non-invertible, then we often cannot solve the system exactly. Usual solution: solve using the pseudoinverse $A^+$, defined as:
\[ A^+ = (A^T A)^{-1} A^T \quad \text{if } (A^T A)^{-1} \text{ exists} \]
\[ A^+ = A^T (A A^T)^{-1} \quad \text{if } (A A^T)^{-1} \text{ exists} \]
If neither inverse exists, $A^+$ can still be computed.
Solution to the linear system: $x = A^+ b$
If the system is over-determined, the solution minimizes the least-squares error. If the system is under-determined, it finds the solution with the smallest norm.
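
A minimal sketch of both cases, using made-up data: an over-determined system (three equations, two unknowns) and an under-determined one (one equation, two unknowns). It compares `numpy.linalg.pinv` against the explicit formula with the inverse of A^T A, and against `numpy.linalg.lstsq`.

```python
import numpy as np

# Over-determined: 3 equations, 2 unknowns.
A_over = np.array([[1.0, 0.0],
                   [1.0, 1.0],
                   [1.0, 2.0]])
b_over = np.array([1.0, 2.0, 4.0])

x_pinv = np.linalg.pinv(A_over).dot(b_over)
# Explicit formula, valid here because A^T A is invertible.
x_formula = np.linalg.inv(A_over.T.dot(A_over)).dot(A_over.T).dot(b_over)
# Least-squares solver; should agree with the pseudoinverse solution.
x_lstsq = np.linalg.lstsq(A_over, b_over, rcond=None)[0]

# Under-determined: 1 equation, 2 unknowns (x1 + x2 = 2).
A_under = np.array([[1.0, 1.0]])
b_under = np.array([2.0])
# Among all solutions, the pseudoinverse picks the one with smallest norm.
x_min_norm = np.linalg.pinv(A_under).dot(b_under)

print(x_pinv)      # approximately [0.8333, 1.5]
print(x_min_norm)  # approximately [1.0, 1.0]
```

In practice, `numpy.linalg.lstsq` or `numpy.linalg.solve` is usually preferred over forming an inverse explicitly.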

Representation in Python
Vector: v = numpy.array([1.0, 2.0])
Matrix: A = numpy.array([[1.0, 2.0], [3.0, 4.0]])
Number of elements in vector: len(v)
Number of rows: A.shape[0] (height of image)
Number of columns: A.shape[1] (width of image)
Access element in vector: v[i] (i starts at 0)
Access element in matrix: A[row, column] (indices start at 0)
Elementwise sum: A + A
Scalar product: 2*A
Dot product: v.dot(v)
Matrix product: A.dot(v), A.dot(A)
Magnitude (Euclidean length): numpy.linalg.norm(v)
Matrix inverse: numpy.linalg.inv(A)
Pseudoinverse: numpy.linalg.pinv(A)
Solve linear systems: numpy.linalg.solve, numpy.linalg.lstsq
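
The calls listed above can be run together as one short script. The right-hand side `b` and the result names are illustrative additions.

```python
import numpy

v = numpy.array([1.0, 2.0])
A = numpy.array([[1.0, 2.0], [3.0, 4.0]])

n_elements = len(v)            # 2
n_rows, n_cols = A.shape       # 2 rows, 2 columns
entry = A[1, 0]                # row 1, column 0 (0-based): 3.0

elementwise_sum = A + A
scaled = 2 * A
dot_vv = v.dot(v)              # 1*1 + 2*2 = 5.0
Av = A.dot(v)                  # [5.0, 11.0]
length = numpy.linalg.norm(v)  # sqrt(5)

# Solve A x = b in two ways: explicit inverse and numpy.linalg.solve.
b = numpy.array([5.0, 11.0])
x_inv = numpy.linalg.inv(A).dot(b)
x_solve = numpy.linalg.solve(A, b)

print(x_inv, x_solve)          # both approximately [1.0, 2.0]
```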

2D Transformations: Translation
2D Transformations: Scaling
2D Transformations: Rotation
Inverse of any rotation matrix R is just $R^T$.
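
The three 2D transformations can be sketched in plain (non-homogeneous) coordinates as follows. The helper `rotation_2d` and the example offsets and scale factors are hypothetical, and the rotation is assumed to be counterclockwise by an angle in radians.

```python
import numpy as np

def rotation_2d(theta):
    """2 x 2 counterclockwise rotation by theta radians (illustrative helper)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

p = np.array([1.0, 0.0])

# Translation adds an offset; scaling multiplies each coordinate.
p_translated = p + np.array([2.0, 3.0])
p_scaled = np.array([2.0, 0.5]) * p

# Rotation is a matrix-vector product.
R = rotation_2d(np.pi / 2)
p_rotated = R.dot(p)           # approximately [0, 1]

# The transpose undoes the rotation, since R^T is the inverse of R.
p_back = R.T.dot(p_rotated)

print(np.allclose(R.T, np.linalg.inv(R)))  # True
```

Translation is the only one of the three that is not a matrix product in these coordinates, which is the usual motivation for homogeneous coordinates.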

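Transformation Matrices
(Slide from Fei-Fei Li and Juan Carlos Niebles)

Homogeneous Coordinates
(Slides from Fei-Fei Li and Juan Carlos Niebles)

2D Translation Using Homogeneous Coordinates
\[ \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \]

2D Transformations: Scaling
\[ \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \]

2D Rotation Using Homogeneous Coordinates
\[ \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \]

3D Homogeneous Coordinates
Represent a 3D point as:
\[ \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \]
If the last coordinate ends up non-one, the convention is to divide by it:
\[ \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} x/w \\ y/w \\ z/w \\ 1 \end{bmatrix} \]

Translation by $(t_x, t_y, t_z)$:
\[ \begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \]

Scaling by $(s_x, s_y, s_z)$:
\[ \begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \]

Rotation around z axis:
\[ R_z(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta & 0 & 0 \\ \sin\theta & \cos\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]
(Rotates counterclockwise in a right-handed coordinate system when the axis is pointed towards the observer.)

Rotation around x axis:
\[ R_x(\theta) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]

Rotation around y axis:
\[ R_y(\theta) = \begin{bmatrix} \cos\theta & 0 & \sin\theta & 0 \\ 0 & 1 & 0 & 0 \\ -\sin\theta & 0 & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]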
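
A sketch of the 3D homogeneous transformations above as 4 x 4 NumPy matrices. The helper names (`translation`, `scaling`, `rotation_z`, `dehomogenize`) are hypothetical, and the rotation assumes the standard right-handed, counterclockwise convention.

```python
import numpy as np

def translation(tx, ty, tz):
    """4 x 4 homogeneous translation matrix."""
    T = np.eye(4)
    T[:3, 3] = [tx, ty, tz]
    return T

def scaling(sx, sy, sz):
    """4 x 4 homogeneous scaling matrix."""
    return np.diag([sx, sy, sz, 1.0])

def rotation_z(theta):
    """4 x 4 homogeneous rotation about the z axis by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.eye(4)
    R[:2, :2] = [[c, -s], [s, c]]
    return R

def dehomogenize(p):
    """Divide by the last coordinate so that it becomes one."""
    return p / p[-1]

p = np.array([1.0, 0.0, 0.0, 1.0])            # the 3D point (1, 0, 0)

p_t = translation(1.0, 2.0, 3.0).dot(p)       # [2, 2, 3, 1]
p_s = scaling(2.0, 3.0, 4.0).dot(p)           # [2, 0, 0, 1]
p_r = rotation_z(np.pi / 2).dot(p)            # approximately [0, 1, 0, 1]
q = dehomogenize(np.array([2.0, 4.0, 6.0, 2.0]))  # [1, 2, 3, 1]
```

With the extra coordinate, translation becomes a matrix product like scaling and rotation, so all three can be multiplied together.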

Rotation Mathematics
Inverse of any rotation matrix R is just $R^T$.
Euler angles: any 3D rotation matrix R can be written as the composition of three rotation matrices about the three axes:
\[ R = R_x(\theta_x) \, R_y(\theta_y) \, R_z(\theta_z) \]
How many parameters for an arbitrary 3D rotation matrix?
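
As a rough illustration of the Euler-angle composition, the sketch below builds a rotation from three arbitrary angles and checks that its inverse equals its transpose. The 3 x 3 helpers `rot_x`, `rot_y`, and `rot_z` are hypothetical and assume the standard right-handed convention. The three angles are the only free inputs.

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c, -s],
                     [0.0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])

# Compose three axis rotations (angles in radians, chosen arbitrarily).
R = rot_x(0.3).dot(rot_y(-0.5)).dot(rot_z(1.2))

# A rotation matrix is orthonormal: R^T R = I, so its inverse is R^T.
print(np.allclose(R.T.dot(R), np.eye(3)))     # True
print(np.allclose(np.linalg.inv(R), R.T))     # True
print(np.isclose(np.linalg.det(R), 1.0))      # True
```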

Pinhole Camera
[Figure: pinhole camera and its center of projection. Illustration by Steve Seitz.]

Perspective Transformation
Perspective projection is the result of projecting from a point in a 3D scene to a 2D pixel coordinate on a camera sensor. In homogeneous coordinates, it can be written as a matrix multiplication.
Often we do not care about the depth (z) associated with a given (x, y) pixel. So we can remove that row and write the projection as a 3x4 matrix.
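
A minimal sketch of a 3x4 perspective projection, under assumptions that are not taken from the slides: the camera sits at the origin looking down the +z axis, and `f` is a made-up focal length. The matrix `P` below is one common textbook form, not necessarily the one shown in the lecture.

```python
import numpy as np

f = 2.0  # assumed focal length (illustrative value)

# Assumed 3 x 4 projection: keeps f*x and f*y, and copies z into w.
P = np.array([[f, 0.0, 0.0, 0.0],
              [0.0, f, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])

point = np.array([1.0, 2.0, 4.0, 1.0])  # 3D point (1, 2, 4), homogeneous

x, y, w = P.dot(point)
# Divide by the last coordinate to get the 2D image position.
pixel = np.array([x / w, y / w])

print(pixel)  # [0.5 1. ]
```

Dividing by w is what makes distant points (large z) land closer to the image center.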

Combining Transformations
If we first translate (T), then scale (S), then rotate (R), then perspective project (P), we can model this as:
\[ p' = P(R(S(Tp))) \]
The transformations are applied in order from right to left. We can use associativity to combine multiple transformations into a single transformation matrix M:
\[ p' = Mp = (PRST)p \]

Camera Matrices: Intrinsic and Extrinsic
Camera matrix: a 3x4 matrix that transforms from 3D scene points (x, y, z) to 2D screen points (x/w, y/w).
Intrinsic parameters model properties within the camera and lens: focal length, image sensor format, optical center.
Extrinsic parameters model the rotation and translation of a camera within a 3D scene.
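
The composition rule and the final division by w can be sketched together. All numeric values are arbitrary, and the 3x4 matrix `P` is an assumed simple projection (focal length 1, camera at the origin looking down +z), used only to make the example concrete.

```python
import numpy as np

# Translate by (1, 0, 5).
T = np.eye(4)
T[:3, 3] = [1.0, 0.0, 5.0]

# Scale uniformly by 2.
S = np.diag([2.0, 2.0, 2.0, 1.0])

# Rotate 90 degrees about the z axis.
theta = np.pi / 2
R = np.eye(4)
R[:2, :2] = [[np.cos(theta), -np.sin(theta)],
             [np.sin(theta), np.cos(theta)]]

# Assumed 3 x 4 perspective projection with focal length 1.
P = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])

p = np.array([1.0, 1.0, 0.0, 1.0])

# Apply one at a time, right to left: T first, then S, R, P.
step_by_step = P.dot(R.dot(S.dot(T.dot(p))))

# Or combine once into a single matrix M and reuse it for every point.
M = P.dot(R).dot(S).dot(T)
combined = M.dot(p)

# Screen point (x/w, y/w).
screen = combined[:2] / combined[2]

# Changing the order gives a different matrix in general.
M_other = P.dot(T).dot(S).dot(R)

print(np.allclose(step_by_step, combined))  # True
print(screen)                               # approximately [-0.2, 0.4]
print(np.allclose(M, M_other))              # False
```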

The camera matrix can be rewritten in terms of intrinsic and extrinsic parameters:
\[ \text{camera matrix} = K \, [R \mid t] \]
Intrinsic matrix:
\[ K = \begin{bmatrix} f_x & s & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \]
$f_x$, $f_y$: focal length in pixels.
$s$: skew between x and y axes (often zero).
$c_x$, $c_y$: principal point (typically center of image).
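
A small sketch of how an intrinsic matrix maps a point to pixel coordinates. The symbol names (`fx`, `fy`, `s`, `cx`, `cy`) and all numeric values are assumptions for illustration (a 640 x 480 image with an 800-pixel focal length). The point is assumed to be already expressed in camera coordinates, so no rotation or translation is applied.

```python
import numpy as np

fx, fy = 800.0, 800.0    # assumed focal length in pixels
s = 0.0                  # assumed zero skew
cx, cy = 320.0, 240.0    # assumed principal point: center of a 640 x 480 image

K = np.array([[fx, s, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

# A point in camera coordinates, 2 units in front of the camera.
point_cam = np.array([0.5, -0.25, 2.0])

x, y, w = K.dot(point_cam)
pixel = (x / w, y / w)

# A point on the optical axis lands on the principal point.
on_axis = K.dot(np.array([0.0, 0.0, 3.0]))
pixel_on_axis = (on_axis[0] / on_axis[2], on_axis[1] / on_axis[2])

print(pixel)          # (520.0, 140.0)
print(pixel_on_axis)  # (320.0, 240.0)
```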

Extrinsic parameters:
t: 3x1 vector representing the origin of the scene coordinate system, expressed in camera coordinates.
R: 3x3 rotation matrix that rotates from scene coordinates to camera coordinates.

Next classes
Camera calibration
Stereo
Optical flow