Hardness of Approximately Solving Linear Equations Over Reals Dana Moshkovitz MIT Joint work with Subhash Khot, NYU 1 Best Approximations Known For MAX-CUT: cut with (1-) fraction 2CSPIf exists vs. 3CSP of edges, efficiently cut at least (1-c) fraction of edges. OR XOR

AND 2 0.931.. [FG95] 0.878... [GW92] 0.859.. [FG95] 3 7/8 1/2 1/2 r e tt e b y

n : a n o ti a m i x o ] r 7 p 9 d a

ap st H [ d r a h NP2 The Complexity of Approximating 3XOR [=3LIN(GF(2))] [AroraSafra92, AroraLun)dMotwan)iSudan)Szegedy92, Raz94, BellareGoldreichSudan)95, Hstad97, MRaz08] Assumin)g it takes 2(n)) time to solve 3SAT exactly on) size n) in)puts. runningtime

2(n)) exponential Exponential hardness: time 2n1-o(1) Sharp threshold: Drop at +o(1) n)O(1) polynomial 1/2

1 Approx. factor 3 Best Approximations Known For 2CSP vs. 3CSP OR XOR AND Any b etter 2 NP-ha approxim ation: rd ass [FG95] Un)iqu 0.931..

e Gam umin)g the es Co[GW92] [K020.878... n)jectu ,KKM re O04,[FG95] 0.859.. R08] 3 7/8 1/2 r e tt e b y

1/2 n : a n o ti a m i x o r ] p 7 9 d

ap sta H [ d r h NP Hard time showing easy problems are hard! 4 Proving Hardness of Constraint Satisfaction Problems The Bellare-Goldreich-Sudan-Hstad paradigm 2CSP(GF(q))

composition with longcode/dictator code desired problem [Khot02]: Start with 2LIN(GF(q)). The Unique Games Conjecture: 2LIN(GF(q)) is extremely hard to approximate. 5 The Unique Games Conjecture (UGC) The Hardness of 2LIN Over Large Alphabets Un)ique Games Con)jecture [Khot02, formulation by KKMO04]: For all ,>0, for sufficiently large q, given a set of linear equations of the form xi xj = c (mod q)

where 1- fraction of the equations can be satisfied, it is NP-hard to find an assignment that satisfies fraction of the equations. [Raghavendra08]: Assuming the Unique Games Conjecture, the best approximation for all CSPs is 6 obtained by rounding a natural SDP. Alphabet/Hardn)ess Rules of Thumb: As alphabet gets larger, problem gets harder. Importan)t Exception): problem can (but not necessarily) become easy if alphabet is R: linear/semidefinite programming can be used. Typically, Easy, if allow fractional solutions Hard, if encode large discrete alphabets (e.g., exact 3LIN(R) [GR07]) 7

Work with Subhash Khot, 2010: A n)atural optimization) problem: find a balanced assignment for real homogeneous linear equations: x15 - x231 + x37 = 0 x1 - 2x3 + x89 = ... Margin) of equation 0 ax+by+cz = 0 is | ax+by+cz| SDP-based algorithm: Best approximation known

obtained by rounding natural SDP (when 1- equations exactly satisfiable, get margin O() for (1) of equations), Un)like GF(q): Same for 2 or 3 variables per equation)! We show NP-hardn)ess for 3 vars per equation)! Approach to provin)g the Un)ique Games Con)j. 8 Main Theorem Theorem [KM10]: For any >0, it is NP-hard, given a system of linear equations over reals where (1-) fraction can be exactly satisfied by a balanced assignment, to find an assignment where 0.99 fraction of the equations have margin at most 0.0001. Tight: Efficient algorithm gives margin O() for (1) of equations.

Blow up: Reduction from SAT has blow-up nnpoly(1/), matching the recent result of Arora, Barak, Steurer. 9 Approach to Proving The Unique Games Conjecture 1. Show that approximate 3LIN over the reals is NP-hard. 2. Show that approximate 2LIN over the reals is NP-hard, possibly by reduction from 3LIN. 3. Deduce the Unique Games Conjecture using parallel repetition. 10 SDP Approximation Algorithm

Min E[| ax + by + cz |2] s.t. assignment is balanced An)alysis & Roun)din)g. Assume (1-) equations exactly satisfiable. Then the SDP minimum is at most O(). Solve the SDP to get vector assignment. Pick random Gaussian , get real assignment xi= with similar value. The margin is O() for constant fraction of equations. 11 Hardness of Approximately Solving Real Linear Equations with Three Variables Per Equation 12

Proving Hardness of Approximate Real Equations 1.Dictatorship test over reals. 2.Adapting the paradigm The Bellare-Goldreich-Sudan-Hstad paradigm 2CSP(GF(q)) composition with longcode/dictator code desired problem dictatorship test for desired problem 13

Real Functions Variables correspond to points in Rn. Assignments correspond to functions RnR. We consider the Gaussian distribution over Rn, where each coordinate has mean 0 and variance 1. Fact: If x,y are independent Gaussians, then px+qy is Gaussian where p2+q2 = 1. 14 Dictator Testing Function F:RnR. Two possibilities for F. A tester queries three positions x,y,zRn, tests aF(x)+bF(y)+cF(z) = 0 ? If dictator, equation satisfied exactly with probability 1-. If not approximated by linear junta, there is a

margin 0.001 with probability 0.01. F Fisisa adictator, dictator, i.e., i.e.,F(x F(x 1,,x n)=x i i 1,,x n)=x F Tester No Nolinear

linearFFjunta junta (poly(1/) (poly(1/)coord) coord)with with |F-F| |F-F| 2|F| 2 2 2|F| 15 Noise Sensitivity Approach to Dictator Testing Pick Gaussian x2Rn. Perturb x by re-sampling each coordinate with probability . Obtain x.

Check F(x)=F(x). 16 Problem with Noise Sensitivity Approach For F half-space, With probability 1-c, equality holds. With probability c, constant margin. We want: with probability 0.01, margin 0.001. 17 Linear Non-Juntas Are Noise-Sensitive

Observation): If F = i2I aixi for |I| 1/2, then (F(x) F(x)) is Gaussian with variance . Hence |F(x)-F(x)| is 0.001 with prob0.01. Basic Idea: test linearity and noise sensitivity. 18 Hermite Analysis Fourier Anlysis for Real Functions Fact: For any F:RnR with bounded |f|2, can write F = ci1,,inHi1(x1) Hin(xn), where Hd is a polynomial (Hermite polynomial) of degree d. Notation: F1 is the linear part of F, and F>1(x) is its non-linear part. 19

Linearity Testing Equation depends Pick Gaussian x,y2Rn, Pick p2[0,1]; set q=(1-p2). on three variables! Check F(px+qy)=pF(x)+qF(y). Lemma: |F-F1|22 E[Margin2]. Proof: Via Hermite analysis. 20 Decompose into linear and non-linear part: 2 Ex|F(x)-F(x)| = Ex|F1(x)-F1(x)|2 + Ex|F>1(x)-F>1(x)|2 Non junta ) with prob 0.1 margin 0.1

Linearity test ) | F>1(x)|2 0.001 Cancellation problem: On average |F>1(x)-F>1(x)|0.001, but it can be much larger when |F1(x)-F1(x)| is large and cancel |F1(x)-F1(x)| ! 21 Cancelations Dont Arise! coordin)ate-wise perturbation) xcx: x is obtained from x by re-sampling with prob . ran)dom perturbation) xrx: x is obtained from x as px+qy for p=(1-), p2+q2=1 Cancelation: Px c x[|F>1(x)-F>1(x)|2 0.1 ] 0.1 By linearity testing:

Px r x[|F>1(x)-F>1(x)|2 0.01 ] 0.99 Well show that they cannot co-exist! 22 Cancelations Dont Arise! x x: x is obtained coordin)ate-wise perturbation) c Claim: For any G:RnR, from x by re-sampling with prob . 2 2 E |G(x)-G(x)|

E |G(x)-G(x)| x r xperturbation) x x: xx cis x obtained from.x as ran)dom r px+qy for p=(1-), p2+q2=1 Cancelation: Px c x[|F>1(x)-F>1(x)|2 0.1 ] 0.1 By linearity testing: Px r x[|F>1(x)-F>1(x)|2 0.01 ] 0.99 Well show that they cannot co-exist! 23 Main Lemma: From Small

Difference to Constant Difference Main) Lemma: If for H:RnR, Px c x [|H(x)-H(x)|2 AT] 10, Px r x [|H(x)-H(x)|2 T] 1-, Then, there is Boolean) B:Rn {0,1} where Px c x [|B(x)-B(x)|2 = 1] 0.01, Px r x [|B(x)-B(x)|2 = 0] 1-1/A, Proof: Combinatorial, by considering perturbation graphs, and constructing an appropriate cut. 24 Thank you! 25 Proof of Main Lemma

Perturbation) graphs: Gc = (Rn,Ec) (x,x)2Ec if: Gr = (Rn,Er) (x,x)2Er if: |H(x)|,|H(x)|1/2, |H(x)-H(x)| 0.1. |H(x)|,|H(x)|1/2, |H(x)-H(x)| 0.01. Edge weight is prob x c x. Edge weight is prob x r x. Our task: Find a cut in CRn that cuts:

Ec edges of weight 0.09. Er edges of weight 0.02. 26 Cutting The Gaussian Space Claim: There is a distribution D over cuts s.t.: Every edge e2Ec is cut with prob 0.1. Every edge e2Er is cut with prob 0.01. -1/2 H(x) H(x) 1/2 27

From Small Cuts To Large Cuts In previous lemma, cut only weight. Want to cut constant weight. 28 Larger Cut Let m=11/. Sample from D cuts C1,,Cm. Pick random I[m]. Let C = i2I Ci. Claim: Edge e2Ec is cut with prob 0.5(1-1/e2). Edge e2Er is cut with prob 0.01 10/ = 0.1. Corr: A cut with 0.20.1 weight of Ec and 0.80.99 weight of E . 29