Polynomial Embeddings for Quadratic Algebraic Equations Radu Balan University of Maryland, College Park, MD 20742 Math-CS Joint Lecture, Drexel University Monday April 23, 2012 Overview 1. Introduction: Motivation and statement of problem 2. Invertibility Results in the Real Case 3. The Algebraic Approach: a. Quasi-Linear Embeddings b. Hierarchical Embeddings c. Numerical Analysis

4. Modified Least Square Estimator 5. Theoretical Bounds: CRLB 6. Performance Analysis 2/50 1. Introduction: Motivation Inversion of Nonlinear Transformations ? knowns y=Ax x By BA=Identity Question: What if |y| is known instead

(that is, one looses the phase information) ? z | Ax | x ? Where is important: X-Ray Crystallography, Speech Processing 3/50 1. Introduction: Statement of the Problem Reconstruction from magnitudes of frame coefficients a complete set of vectors (frame) for the n-dimensional Hilbert space H (H=Cn or H=Rn). Equivalence relation: x,yH, x~y iff there is a scalar z, |z|=1 so that x=zy (real case: x=y ; complex case: x=eiy). Let . Define Problems:

1. Is an injective map? 2. If it is, how to invert it efficiently? 4/50 H f1 f2 x (x) fm

Rm 5/50 2. Invertibility Results: Real Case (1) Real Case: K=R Theorem [R.B.,Casazza, Edidin, ACHA(2006)] is injective iff for any subset JF either J or F\J spans Rn. Corollaries [2006] if m 2n-1n-1, and a generic frame set F, then is injective; if m2n-1n-2n-1 then for any set F, cannot be injective; if any n-element subset of F is linearly independent, then is injective; for m=2n-1n-1 this is a necessary 6/50

and sufficient condition. Invertibility Results: Real Case (2) Real Case: K=R Theorem [R.B.(2012)] is injective iff any one of the following equivalent conditions holds true: For any x,yRn, x0, y0, There is a constant a>0 so that for all x, RI 7/50 Invertibility Results: Real Case (3) complete set of vectors (frame) for H=Rn. One would expect that if is injective and m>2n-1n-1 then there is a strict subset

J{1,2n-1,,m} so that is injective, where :RRmR|J| is the restriction to J index. However the next example shows this is not the case. Example. Consider n=3, m=6, and F the set of columns of the following matrix F= Note that for any subset J of 3 columns, either J or F\J is linearly dependent. Thus is injective but removing any column makes not injective. 8/50 3. The Algebraic Approach 3.1 Quasi-Linear Embeddings Example (a) Consider the real case: n=3 , m=6. Frame vectors: Need to solve a system of the form:

9/50 [ 1 1 1 1 1 1 2 4 2 4 2 0

2 2 4 6 14 2 1 4 1 4 1 0 2 2

4 12 14 0 1 1 4 9 49 1 1 4 2 9 3

4 = = 9 4 196 5 4 6 ][ ] [ ] 1 1 2 1

2 4 [] Then factor: Thus, we obtain: 10/50 Summary of this approach: where 11/50

Summary of this approach: where 12/50 Example (b) Consider the real case: n=3 , m=5. Frame vectors: Need to solve the system: X Is it possible? How? 13/50

Lets square again: 12+ 2 1 2+ 2 1 3 + 22+ 2 2 3 + 23 |1 + 2 + 3|=2 2 2 2 1 +4 1 2 2 1 3 + 4 2 2 2 3+ 3 | 1 +2 2 3|=3 | 1 2 2 3|=2 21 2 1 2 4 1 3 + 22 +4 2 3 + 4 23 | 1 2 2 3 3|=3 21 4 1 2 6 1 3 + 4 22+12 2 3 +9 23 | 1+ 2 7 3|=14 21 +2 1 2 14 1 3 + 22 14 2 3 + 49 23 4

9 4 9 196 We obtained 5 linear equations with 6 unknown monomials. Idea: Lets multiply again these equations (square and cross) New equations:

: 15 equations New variables (monomials): : 15 unknowns 14/50 Summary of this approach: where 15/50 3.2 Hierarchical Embeddings Primary data: Level d embedding:

, Identify: Then a homogeneous polynomial of total degree 2d. 16/50 How many monomials? Real case: number of monomials of degree 2d in n variables: Complex case: Number of degree (d,d) in n variables: Define redundancy at level d: (R.B. [SampTA2009]) 17/50

Real case: 18/50 Real case: 19/50 Real case: 20/50 Complex case: 21/50

Complex case: 22/50 Complex case: 23/50 Fundamental question: How many equations are linearly independent? Recall: , Note: and The matrix is not canonical, and so is * for d>1. However its range is basis independent. We are going to compute this range in terms of a canonical matrix.

24/50 Theorem The following hold true: 1. (as a quadratic form) 2. Rows of are linearly independent iff where is the mdxmd matrix given by Real case: Complex case: 25/50 Let denote the Gram matrix And for integer p.

Theorem In either real or complex case: Hence for d=1, the number of independent quadratics is given by: Theorem For d=2, Remark Note the k1=k2n-1,l1=l2n-1 submatrix of is 26/50 3.3. Numerical analysis Results for the complex case: random frames n=3,m=6 27/50

n=3,m=6 28/50 n=16 m=64 29/50 n=32 m=128 30/50 n=4 m=16

31/50 n=4 m=14 32/50 Note 6 zero eigenvalues of instead of 5. n=4 m=15 33/50 Number of zero eigenvalues = 20 =120-100, as expected. 4. Modified Least Square Error

Estimator Model: d i x, f i 2 i , 1 i m Vanilla Least-Square-Estimator (LSE): m x LSE arg min x d i x, f i 2 2 i 1

Rewrite the criterion equivalently as : m tr XFi di 2 , Fi f i f i T , X xx T i 1 We modify this criterion in two ways: 1. Replace X by a rank r positive matrix Y 34/50 2. Regularize the criterion by adding a norm of Y

m 2 Y arg minY 0,rank (Y ) r tr (YFi ) d i tr (Y ) i 1 x princ.eigval(Y ) princ.eigvect(Q ) Use a rank-r factorization of Y to account for constraint: L arg min LR nr m tr L F L d

2 T i i T tr L L i 1

SDV factorization : L UV x 1,1U (:,1) 35/50 Optimization procedure L arg min LR nr m tr L F L d 2 T i

i T tr L L i 1 Our approach: 1. Start with a large and decrease its value over time; 2. Replace one L in the inner quadratic term by a

previous estimate 3. Penalize large successive variations. 36/50 m T 2

T J t ( L, K ) tr L Fi K di t tr LT L t tr L K L K t tr K T K i 1 Algorithm (Part I) How to initialize? How to adapt? Choose an adaptation policy for t , t t 1

Step I: Initialization Initialize L( 0) , 0 , 0 , t 0 Step II: Iteration ( t 1) L arg min L J t L, L (t ) Convergence? t t 1

Step III: Factorization ( ) SVD L UV x 1,1U (:,1) 37/50 Initialization For large and small L : m

2 m J ( L) tr LT Fi L d i 2 tr LT L 2tr LT Q L d i2 i 1 i 1

m where : Q d i Fi i 1 Let ek , vk be eigenpairs of Q : Qvk ek vk , e1 e2 Set: L( 0) 1 v1 | 2 v2 | | 0 e1 ... e r / r , k

r vr r0 m r 2 i 1 k 1 v k , f i 2 38/50 Convergence m

T 2 T

J t ( L, K ) tr L Fi K di t tr LT L t tr L K L K t tr K T K i 1 Note : J t L, K J t K , L Consider the iterative process: ( t 1) L (t ) :arg min L J t L, L

( t 1) jt : J t L (t ) ,L Theorem Assume (t)t,(t)t are monotonically decreasing non-negative sequences. Then (j t)t0 is a monotonically decreasing convergent sequence.

39/50 5. Theoretical bounds: CRLB Model : d i x, f i 2 i , 1 i m ; i ~ N (0, 2 ), i.i.d . 1 Loglikelihood : log p(d | x) 2 2 m

i 1 x, f i 2 2 di m log(2 ) m log( ) 2 2 log p(d | x)

Fisher Information Matrix : I k , j : E x x k j m 2 4 I 2 R , where R x, f i f i f i T i 1

2 CRLB : COVAR UnbiasedEstimator I 1 R 1 4 2 2 1 [ ^ ] ( ) 4 [ ^ ^

2 ] 2 2 1 1 [ ( ) + ] 2 40/50 The LSE/MLE is a biased estimator. Modified CRLB for biased estimators:

T 1 T E x x x x I Bias Bias : ModifiedCRLB x x ( x) E x , Bias ( x) x

T Asymptotically (for high SNR): m 1 Bias 2 , x, f i f i T R 1 f i R 1 f i 4 i 1 Id 2 , x 2 1 4 MSE R

R 1 R 1 T h.o.t.( 6 ) 4 4 MSE0 MSE1 41/50 6. Performance Analysis Mintrace algorithm: Candes, Strohmer, Voroninski (2011) n=3 , m=9 , d = 1 (dlevel) , r = 2 and , decrease by 5% every step w/ saturation (subspace) 42/50

43/50 n=3 , m=9, d=1, r=2 44/50 n=3 , m=9, d=1, r=2 45/50 n=8 , m=24, d=3, r=2 46/50 n=8 , m=24, d=3, r=2 47/50

n=8 , m=24, d=3, r=2 48/50 n=9 , m=27, d=3, r=2 49/50 n=9 , m=27, d=3, r=2 50/50 n=9 , m=27, d=3, r=2