CIS 262 Automata, Computability, and Complexity Spring 2019 http://www.seas.upenn.edu/~cse262/ Instructor: Aaron Roth [email protected] Lecture: April 15, 2019 Recap: The class NP A language L is in NP if there exists a polynomial-time nondeterministic Turing machine that accepts L Equivalently, a language L is in NP if there exists a polynomial-time deterministic TM V such that w is in L iff V accepts for some c Examples: Clique, Hamiltonian Path, Graph coloring Every problem in P is in NP Every problem in NP is in ExpTime (deterministic exponential time) 2

The class EXPTIME A language L belongs to EXPTIME if there exists a deterministic TM M for L with time complexity of M is O ( exp ( nk )) Claim: If a language L is in NP, then it is in EXPTIME Proof: Suppose L is in NP. Consider a verifier V for L: Time complexity of V is f(n) and V accepts for some c if and only if w is in L Deterministic TM M for L: Given input w of length n, For each string c of length f(n), Run V on , If V accepts, stop and accept, else continue Claim: M accepts L and time complexity of M is O( exp (f(n))) Exercise: if Complement of L is in NP then L is in EXPTIME 3 Problem Classification DECIDABLE NP

P REGULAR EXPTIME 4 Boolean Formulas Boolean formula (also called propositional formula) is an expression constructed from Boolean variables: x, y, z, Logical connectives: And (&), Or (|), Not (~), Implies () Example formulas : ( x & y ) | (~x & ~y) [x&y]x ( x & ~y & ~z) | ( ~x & y & ~z ) | ( ~x & ~y & z ) Boolean formulas appear in many contexts: formalizing puzzles, structure of logical arguments, Digital circuits constructed out of AND, OR, and NOT gates are really

another representation of Boolean formulas 5 Boolean Formulas: Semantics Truth assignment r : Assignment of 0/1 values to variables r(j) : the value, either 0 or 1, of formula j under truth assignment r Rules for evaluating logical connectives: 1. if r(j) = 0 then r(~j) = 1 else r(~j) = 0 2. if both r(j) = 1 and r(y) = 1 then r( j & y ) = 1 else r( j & y)=0 3. if either r(j) = 1 or r(y) = 1 then r( j | y ) = 1 else r( j | y ) =0 4. if either r(j) = 0 or r(y) = 1 then r( j y ) = 1 else r( j y ) =0 Note: j y is same as ~j | y Note: j | y is same as ~(~j & ~y ) 6 Example Evaluation

Example formula j : ( x & y ) | (~x & ~y) Truth assignment r : Assignment of 0/1 values to variables r(j) : the value, either 0 or 1, of formula j under truth assignment r x 0 0 1 1 y 0 1 0 1 j 1 0

0 1 7 Boolean Formulas: Evaluation Evaluation problem: Given a formula j and a truth assignment r, find the value of j under this assignment EVAL = { | r(j) = 1 } Claim: EVAL is in P Poly-time algorithm for EVAL : obvious 8 Boolean Satisfiability: SAT A formula is satisfiable if there exists a truth assignment r such that r(j) = 1 SAT = { < j > | j is a satisfiable Boolean formula } Examples:

(x & y) | (~x & ~y) Satisfiable ( x y) & ( y z ) & x & ~z Unsatisfiable 9 SAT is in NP EVAL = { < j, r > | r(j) = 1 } Check if the given truth assignment makes the given formula true SAT = { < j > | j is a satisfiable Boolean formula } Check if there exists a truth assignment that makes given formula true Claim: SAT is in NP Proof: Algorithm for EVAL is a poly-time verifier for SAT Given a formula j with k variables, there are 2k truth assignments, and satisfiability solver needs to search through this exponential space Finding a solution to any NP problem can be encoded as finding a

satisfying truth assignment for a Boolean formula 10 Example: Encoding Graph Coloring as SAT Is this graph 3-colorable ? 1 2 4 3 5 Goal: Construct a Boolean formula whose satisfying assignments encode assignment of 3 colors to nodes

Variable xij stands for node i is assigned color j i = 1, 2, , 5 and j = 1, 2, 3 for 3 colors Desired formula is conjunction (AND) of many formulas: j1 = (x11 & ~x12 & ~x13 ) | (~x11 & x12 & ~x13 ) | (~x11 & ~x12 & x13 ) To make j1 true, exactly one of x11, x12, x13 must be set to 1 j1 thus says that node 1 should be assigned exactly one color Similarly, for i = 1, 2, 3, 4, 5, let ji = (xi1 & ~xi2 & ~xi3 ) | (~xi1 & xi2 & ~xi3 ) | (~xi1 & ~xi2 & xi3 ) 11 Example: Encoding Graph Coloring as SAT For each vertex i, consider ji =(xi1 &~xi2 &~xi3)| (~xi1 & xi2 & ~xi3 )| (~xi1& ~xi2& xi3) 1

2 3 Let j = j1 & j2 & j3 & j4 & j5 4 5 A satisfying assignment for j encodes a coloring Need constraints to ensure that nodes with an edge dont share color y12 = ~(x11 & x21) & ~(x12 & x22 ) & ~(x13 & x23) To satisfy this, for no color j, both x1j and x2j can be set to 1 More generally, let yij = ~(xi1 & xj1) & ~(xi2 & xj2 ) & ~(xi3 & xj3) This says that node i and node j cannot have same color 12

Example: Encoding Graph Coloring as SAT For each vertex i, consider ji =(xi1 &~xi2 &~xi3)| (~xi1 & xi2 & ~xi3 )| (~xi1& ~xi2& xi3) 1 2 3 Let j = j1 & j2 & j3 & j4 & j5 4 5 Let yij = ~(xi1 & xj1) & ~(xi2 & xj2 ) & ~(xi3 & xj3) And y = y12 & y13 & y23 & y24 & y35 & y45

Formula j says that each vertex has exactly one color and formula y says that vertices connected by an edge dont have same color Claim: Graph is 3-colorable if and only if j & y is satisfiable More generally, to check if a graph G with n vertices is k-colorable, we can construct a boolean formula with n.k variables, and check its satisfiability 13 Complement of class NP: CoNP UNSAT = { < j > | j is an unsatisfiable Boolean formula } To check that a formula is unsatisfiable, doesnt suffice to find one truth assignment that makes it False, rather need to show that each of the exponentially many truth assignments makes it False CoNP: Complement of class NP: A language L is in CoNP if the complement language ~L is in NP UNSAT is in CoNP, and so are { < G, k > | graph G does not have a clique of size k } { < G, s, t > | there is no Hamiltonian path from s to t in graph G } { < G, k > | graph G cannot be colored using k colors }

14 Validity of Boolean Formulas A formula j is called valid (also called a tautology) if for every truth assignment r, r(j) = 1 (i.e. formula is always true) Examples (dating back to Socrates for valid forms of logical arguments): (x & y ) xValid [ ( x y ) & x ] y Valid ( x y ) ( y x ) Invalid VALID = { < j > | j is a valid Boolean formula } What class contains VALID ? Complement: INVALID = {< j > | r(j) = 0 for some truth assignment r} INVALID is in NP and VALID is in CoNP 15 Relationship of CoNP with other Complexity Classes

CoNP P NP EXPTIME Both P and EXPTIME are closed under complement: because defined using deterministic Turing machines for complement toggle accept/reject state, doesnt change complexity It follows that P is contained in CoNP and CoNP contained in EXPTIME Also: P = NP implies P = NP = CoNP 16 Theory of NP-Completeness Cooks Theorem (1972):

SAT is NP-complete If SAT is in P then P = NP If someone can find a (deterministic) algorithm of polynomial-time complexity to solve SAT, then every problem in NP can be solved in polynomial-time Open question: to find a poly-time algorithm for SAT or to prove that no such algorithm exists Turns out that many problems in NP are equivalent to SAT, and this is the class of NP-complete problems Key to developing this theory: Polynomial-time problem reduction 17 Problem/Language Reduction A reduces to B, A < B: possible to construct a machine/algorithm for A using that for B Machine for A w

Transformation f f(w) Accept (f(w) in B) Accept (w in A) Machine for B Reject Reject (otherwise) (otherwise) To establish A < B, find a (computable) transformation f so that

w is in A if and only if f(w) is in B If A reduces to B, then: 1. If B is decidable then so is A 2. If A is undecidable, so is B 18 Polynomial-time Problem/Language Reduction A reduces to B, A < B: possible to construct a machine/algorithm for A using that for B Machine for A w Transformation f f(w)

Accept (f(w) in B) Accept (w in A) Machine for B Reject Reject (otherwise) (otherwise) This reduction is a polynomial-time reduction, written A < P B, if the transformation is computable by a polynomial-time TM/program If A

Poly-time Reduction: Graph Coloring to SAT Show how to construct poly-time algorithm for graph coloring using that for SAT Machine for COLOR G k Transformation f f(G,k) f(G,k) is satisfiable G has k-coloring

Machine for SAT No No Given graph G and k, show how to construct in polynomial-time a Boolean formula j=f(G,k) which is satisfiable if and only if G is kcolorable 20 COLOR to SAT Transformation Given graph G with n nodes and k, consider variables: xij for i = 1, , n and j = 1, , k For each vertex i, let ji = (xi1 & & ~xik) | | (~xi1 & & xik) Let j = j1 & ... & jn For each edge (i,j) in G, let yij = ~(xi1 & xj1) & & ~(xik & xjk),

and y is AND of such formulas corresponding to all edges Formula j says that each vertex has exactly one color and formula y says that vertices connected by an edge dont have same color, so set f(G,k) = j & y Claim: Graph G is k-colorable if and only if f(G,k) is satisfiable Size of formula is O(n.k), and it can be constructed easily (in poly-time) 21 Independent Set A set U of nodes is an independent set if no two nodes in U are connected by an edge 1 2 4 3

5 Example: {1, 4} is an independent set IndepSet = { | G has an indep set of size k } Which complexity class does it belong to ? Goal: Show that Clique

1 1 2 2 4 3 3 4 5 G = Complement graph of G One-to-one correspondence between clique in G and indep

set in G 5 Algorithm for Clique assuming algorithm R for IndepSet: Given undirected graph G = (V, E) and k, 1. Construct graph G = (V, E) such that for two distinct vertices u and v, (u,v) is in E iff (u,v) is not in E 2. Check if G has independent set of size k using R Correctness: Argue that (G,k) is in Clique iff (G,k) is in IndepSet Time-complexity: Poly-time assuming R is poly-time 23 Invalidity INVALID = { | there exists a truth assignment r s.t. r(j) = 0 } We know that INVALID is in NP Show that SAT

Check if R accepts ~ j : if so, accept else reject 24 Solving SAT in Practice All NP problems (which includes lots of optimization problems arising in practice) can be reduced to SAT How do we actually solve SAT ? Obvious algorithm: try out all possible truth assignments (exponential) Good news: search can terminate once one satisfying assignment is found Search heuristics: Which path to explore first (choose a variable x, and set it to 0 or 1) ? If a variable x is assigned 0, simplify the formula as much as possible 25

Modern SAT Solvers Annual competition of SAT solvers: see satcompetition.org Highly optimized implementations Example: Z3 from Microsoft Research Modern SAT solver can check satisfiability for formulas with over 100K variables Now used routinely for real-world problems Example: Generating test inputs for programs 26