# Static Program Analysis via Three-Valued Logic Static Program Analysis via Three-Valued Logic Mooly Sagiv (Tel Aviv), Thomas Reps (Madison), Reinhard Wilhelm (Saarbrcken) Full Employment for Verification Experts Analysis must track numeric information A sailor on the U.S.S. Yorktown entered a 0 into a data field in a kitchen-inventory program. That caused the database to overflow and crash all LAN consoles and miniature remote terminal units. The Yorktown was dead in the water for about two hours and 45 minutes. need to track values x = 3; y = 1/(x-3); other than 0 x = 3; need to track px = &x; pointers y = 1/(*px-3); x = 3;

need to track p = (int*)malloc(sizeof int); heap-allocated *p = x; q = p; storage y = 1/(*q-3); a = &e 1 b = a 2 c = &f 3 *b = c 4 d = *a 5 a b c d a b c d a

b c d a b c d a b c d a b c d Flow-Sensitive Points-To Analysis e f e f p q p = &q; p q

e p f e f p e r1 r2 s1 s2 s3 r2 r1 s1 r2 s2 r1

q p = q; p q p = *q; p r1 r2 q s1 s2 s3 r2 r1 s1 r2 s2 r1 q f e f

p q *p = q; p q a = &e 1 b = a Flow-Insensitive Points-To Analysis 2 c = &f 3 *b = c 4 d = *a 5 2 a b c

d 1 b = a 2 1 3 a = &e 5 c = &f 3 4 *b = c e f 4 d = *a 5 a

b c d a b c d a b c d a b c d a b c d a b c d e f e f e f e

f e f e f Shape Analysis [Jones and Muchnick 1981] Characterize dynamically allocated data Identify may-alias relationships x points to an acyclic list, cyclic list, tree, dag, disjointedness properties x and y point to structures that do not share cells show that data-structure invariants hold Account for destructive updates through pointers Applications: Software Tools Static detection of memory errors dereferencing NULL pointers dereferencing dangling pointers memory leaks Static detection of logical errors Is a data-structure invariant restored? Applications: Code Optimization Parallelization

Operate in parallel on disjoint structures Software prefetching Compile-time garbage collection Insert storage-reclamation operations Eliminate or move checking code Why is Shape Analysis Difficult? Destructive updating through pointers p next = q Produces complicated aliasing relationships Dynamic storage allocation No bound on the size of run-time data structures Data-structure invariants typically only hold at the beginning and end of operations Want to verify that data-structure invariants are re-established Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y;

y = x; x = x next; y next = t; } return y; } t NULL y 1 2 x 3 NULL Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y;

y = x; x = x next; y next = t; } return y; } t NULL y 1 2 x 3 NULL Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y;

y = x; x = x next; y next = t; } return y; } t NULL y 1 2 x 3 NULL Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y;

y = x; x = x next; y next = t; } return y; } t NULL y 1 2 x 3 NULL Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y;

y = x; x = x next; y next = t; } return y; } t NULL y 1 2 x 3 NULL Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y;

y = x; x = x next; y next = t; } return y; } t NULL y 1 2 x 3 NULL Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y;

y = x; x = x next; y next = t; } return y; } t NULL y 1 2 x 3 NULL Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y;

y = x; x = x next; y next = t; } return y; } t NULL y 1 2 x 3 NULL Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y;

y = x; x = x next; y next = t; } return y; } t NULL y 1 2 x 3 NULL Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y;

y = x; x = x next; y next = t; } return y; } t NULL y 1 2 x 3 NULL Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y;

y = x; x = x next; y next = t; } return y; } t NULL y 1 2 x 3 NULL Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y;

y = x; x = x next; y next = t; } return y; } t NULL y 1 2 x 3 NULL Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y;

y = x; x = x next; y next = t; } return y; } t NULL y 1 2 x 3 NULL Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y;

y = x; x = x next; y next = t; } return y; } t NULL y 1 2 x 3 NULL Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y;

y = x; x = x next; y next = t; } return y; } t NULL y 1 2 x 3 NULL Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y;

y = x; x = x next; y next = t; } return y; } t y x Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } t

y NULL x Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } t y NULL x

Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } t y NULL x Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List;

List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } t y NULL x Materialization Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) {

t = y; y = x; x = x next; y next = t; } return y; } t y NULL x Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;

} t y NULL x Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } t y

NULL x Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } t y NULL x Example: In-Situ List Reversal typedef struct list_cell {

int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } t y NULL x Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL;

while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } t y NULL x Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; }

return y; } t y x Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } t y x

Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } t y x Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t;

y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } t y x Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;

} t y x Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } t y x

Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } t y x Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL;

while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } t y NULL x Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;

} t y NULL x Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } t y NULL

x Example: In-Situ List Reversal typedef struct list_cell { int val; struct list_cell *next; } *List; List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } t y NULL x Idea for a List Abstraction t y x

t y x t NULL y x represents t y x NULL NULL NULL t x != NULL y t

y NULL NULL NULL x t t=y x y t y NULL t y NULL t x = xnext

t y x t y x y t y x x t x y NULL t y

x t x y t y x NULL return y y NULL NULL ynext = t x t NULL

x y=x y t NULL x x x Properties of reverse(x) On entry, x points to an acyclic list On each iteration, x & y point to disjoint acyclic lists All the pointer dereferences are safe No memory leaks On exit, y points to an acyclic list On exit, x = = NULL All cells reachable from y on exit were reachable from x on entry, and vice versa On exit, the order between neighbors in the ylist A Yacc for Shape Analysis: TVLA

Parametric framework Some instantiations known analyses Other instantiations new analyses Applications beyond shape analysis Partial correctness of sorting algorithms Safety of mobile code Deadlock detection in multi-threaded programs Partial correctness of mark-and-sweep gc alg. Correct usage of Java iterators A Yacc for Static Analysis: TVLA Parametric framework Some instantiations known analyses Other instantiations new analyses Applications beyond shape analysis Partial correctness of sorting algorithms Safety of mobile code Deadlock detection in multi-threaded programs Partial correctness of mark-and-sweep gc alg.

Correct usage of Java iterators Formalizing . . . Informal: x Formal: x Summary node Using Relations to Represent Linked Lists Relation Intended Meaning x(v) Does pointer variable x point to cell v? Does pointer variable y point to cell v? Does pointer variable t point to cell v? Does the n field of v1 point to v2?

y(v) t(v) n(v1,v2) Using Relations to Represent Linked Lists x y u1 u2 u3 u4 x(u) 1 0 0 0 u1 u2 y(u) t(u) 1 0 0 0 0

0 0 0 u3 u4 n u1 u2 u3 u4 u1 0 0 0 0 u2 1 0 0 0 u3 0 1 0 0

u4 0 0 1 0 Formulas: Queries for Observing Properties Are x and y pointer aliases? v: x(v) y(v) Are x and y Pointer Aliases? v: x(v) y(v) 1 x y u1 u2 u3 u4 x(u) 1 0 0 0 u1

u2 y(u) t(u) 1 0 0 0 0 0 0 0 u3 Yes u4 n u1 u2 u3 u4 u1 0 0 0 0 u2

1 0 0 0 u3 0 1 0 0 u4 0 0 1 0 Predicate-Update Formulas for y = x x(v) = x(v) y(v) = x(v) t(v) = t(v) n(v1,v2) = n(v1,v2) Predicate-Update Formulas for y = x y(v) = x(v) x y u1

u2 u3 u4 x(u) 1 0 0 0 u1 u2 y(u) t(u) 0 0 0 0 0 0 0 0 u3 u4 n u1 u2

u3 u4 u1 0 0 0 0 u2 1 0 0 0 u3 0 1 0 0 u4 0 0 1 0 Predicate-Update Formulas for x = x n x(v) = v1: x(v1) n(v1,v)

y(v) = y(v) t(v) = t(v) n(v1, v2) = n(v1, v2) Predicate-Update Formulas for x = x n x(v) = v1: x(v1) n(v1,v) x y u1 u2 u3 u4 x(u) 1 0 0 0 u1 u2 y(u) t(u) 1 0 0 0 0

0 0 0 u3 u4 n u1 u2 u3 u4 u1 0 0 0 0 u2 1 0 0 0 u3 0 1 0 0

u4 0 0 1 0 Why is Shape Analysis Difficult? Destructive updating through pointers pnext = q Produces complicated aliasing relationships Dynamic storage allocation No bound on the size of run-time data structures Data-structure invariants typically only hold at the beginning and end of operations Need to verify that data-structure invariants are re-established Two- vs. Three-Valued Logic Two-valued logic 0 1 Three-valued logic

{0,1} {0} {0} {0,1} {1} {1} Two- vs. Three-Valued Logic Two-valued logic 1 0 1 1 0 0 0 0 1 0 1 1 1 0 1 0 Three-valued logic {1} {0,1} {0}

{1} {1} {0,1} {0} {0,1} {0,1} {0,1} {0} {0} {0} {0} {0} {1} {1} {1} {0,1} {1} {0} {1} {0,1} {1} {0,1} {0,1} {0} {1} {0,1} {0} Two- vs. Three-Valued Logic Two-valued logic 0 1 Three-valued logic

{0,1} {0} {1} Two- vs. Three-Valued Logic Two-valued logic 0 Three-valued logic 1 0 1 0 1 Boolean Connectives [Kleene] 0 1/2 1

0 1/2 1 0 0 0 0 1/2 0 1/2 1/2 1 0 1/2 1 0 0 1/2 1 1/2 1/2 1/2 1 1 1

1 1 Canonical Abstraction x u2 u1 u1 u2 u3 u4 n u1 u2 u3 u4 u1 0 0 0 0 u3 x(u) y(u) 1 0

0 0 0 0 0 0 u2 1 0 0 0 u3 0 1 0 0 u4 0 0 1 0 u4

x u1 u1 u234 u234 x(u) y(u) 1 0 0 0 n u1 u234 u1 0 u234 0 1/2 Canonical Abstraction x u2 u1 u1 u2 u3 u4

n u1 u2 u3 u4 u1 0 0 0 0 u3 x(u) y(u) 1 0 0 0 0 0 0 0 u2 1 0 0 0

u3 0 1 0 0 u4 0 0 1 0 u4 x u1 u1 u234 u234 x(u) y(u) 1 0 0 0 n u1 u234 u1 0 u234 0 1/2

Canonical Abstraction x u2 u1 u1 u2 u3 u4 = u1 u2 u3 u4 u1 1 0 0 0 u3 u4 x(u) y(u) 1 0 0

0 0 0 0 0 u2 0 1 0 0 u3 0 0 1 0 u4 0 0 0 1 x u1 u1 u234

u234 x(u) y(u) 1 0 0 0 = u1 u234 u1 1 u234 0 1/2 Canonical Abstraction Partition the individuals into equivalence classes based on the values of their unary predicates Collapse other predicates via Property-Extraction Principle Questions about a family of two-valued stores can be answered conservatively by evaluating a formula in a three-valued store Formula evaluates to 1 formula holds in every store in the family

Formula evaluates to 0 formula does not hold in any store in the family Formula evaluates to 1/2 formula may hold in some; not hold in others Are x and y Pointer Aliases? x y u1 u v: x(v) y(v) 1 Yes Is Cell u Heap-Shared? Yes u v1,v2: n(v1,u) n(v2,u) v1 v2 1

1 1 1 Is Cell u Heap-Shared? x y Maybe u u1 v1,v2: n(v1,u) n(v2,u) v1 v2 1/2 1/2 1/2 1

The Embedding Theorem x No y v: x(v) y(v) No No Maybe u1 y u1 y u4 u3 u2 x u2 x u1 x u34

u234 y u1234 Embedding x x u1 u2 x u456 u123 u3 u12 u4 u34 u5 u56

u6 Canonical Abstraction: An Embedding Whose Result is of Bounded Size x u2 u1 u1 u2 u3 u4 n u1 u2 u3 u4 u1 0 0 0 0 u3 x(u) y(u) 1

0 0 0 0 0 0 0 u2 1 0 0 0 u3 0 1 0 0 u4 0 0 1 0 u4 x u1

u234 u1 u234 x(u) y(u) 1 0 0 0 n u1 u234 u1 0 u234 0 1/2 t x != NULL n u1 u u1 0 1/2 u 0 1/2 x(u) y(u) t(u) u1 1 0 0 u 0 0 0 y

NULL x t t=y y x u1 NULL u x t y=x y y(v) = x(v) NULL x x = xnext

ynext = t return y n u1 u u1 0 1/2 u 0 1/2 x(u) y(u) t(u) u1 1 1 0 u 0 0 0 x y u1 u t x != NULL y t

y NULL NULL NULL x t t=y x y t y NULL t y NULL t x = xnext

t y x t y x y t y x x t x y NULL t

y x t x y t y x NULL return y y NULL NULL ynext = t x t

NULL x y=x y t NULL x x x How Are We Doing? Conservative Convenient But not very precise Advancing a pointer down a list loses precision Cannot distinguish an acyclic list from a cyclic list

The Instrumentation Principle Increase precision by storing the truth-value of some chosen formulas Is Cell u Heap-Shared? u v1,v2: n(v1,u) n(v2,u) v1 v2 Example: Heap Sharing is(v) = v1,v2: n(v1,v) n(v2,v) v1 v2 x 31 71 91 is = 0 is = 0 is = 0 is = 0

x u1 is = 0 u is = 0 Example: Heap Sharing is(v) = v1,v2: n(v1,v) n(v2,v) v1 v2 x 31 71 91 is = 0 is = 10 is = 0 is = 0 x u1

u is = 0 is = 1 is = 0 Example: Cyclicity c(v) = v1: n(v,v1) n*(v1,v) x 31 71 91 c=0 c == 10 is c=1 c=1 x

u1 c=0 u c=1 Is Cell u Heap-Shared? is = 0 x y No! is = 0 u u1 v1,v2: n(v1,u) n(v2,u) v1 v2 1/2 1/2 1/2

1 Maybe Formalizing . Informal: x y Formal: x y . . Formalizing . Informal: t1 x y Formal: t2 x t1

y t2 . . Formalizing . . . Informal: x y Formal: reachable from variable x reachable from variable y x r[x] r[x] r[y] r[y] y

Formalizing . Informal: . . t1 x y t2 Formal: x r[x] r[x] r[y] r[y] t1 r[x],r[t1] r[x],r[t1] r[y],r[t2] r[y],r[t2]

y t2 Useful Instrumentation Predicates doubly-linked(v) reachable-from-variable-x(v) acyclic-along-dimension-d(v) tree(v) dag(v) AVL trees: Need FO + TC balanced(v), left-heavy(v), right-heavy(v) . . . but not via height arithmetic Materialization Formal: x y x x = xn y Informal: x

y x x = xn y Formal: x y x x = xn y Nave Transformer (x = x n) x y Evaluate update formulas x y Best Transformer (x = x n) x y x

y x y ... Evaluate update formulas x y y x ... x y y x Focus-Based Transformer (x = x n) x

y x y Focus(x n) x y Partial Evaluate update formulas x x y y x y y x Why is Shape Analysis Difficult? Destructive updating through pointers

pnext = q Produces complicated aliasing relationships Track aliasing using 3-valued structures Dynamic storage allocation No bound on the size of run-time data structures Canonical abstraction finite-sized 3-valued structures Data-structure invariants typically only hold at the beginning and end of operations Need to verify that data-structure invariants are reestablished Query the 3-valued structures that arise at the exit What to Take Away A yacc for static analysis based on logic Broad scope of potential applicability Not just linkage properties: predicates are not restricted to be links! Discrete systems in which a relational (+ numeric) structure evolves Transition: evolution of relational + numeric state Canonical Abstraction A family of abstractions for use in logic Questions?