380C Where are we & where we are

380C  Where are we & where we are

380C Where are we & where we are going Managed languages Dynamic compilation Inlining Garbage collection What else can you do when you examine the heap a lot? Why you need to care about workloads

Alias analysis Dependence analysis Loop transformations EDGE architectures 1 380C lecture 18 Garbage Collection Why use garbage collection? What is garbage? Reachable vs live, stack maps, etc. Allocators and their collection mechanisms Semispace Marksweep Performance comparisons

Mark Region Incremental age based collection Write barriers: Friend or foe? Generational Beltway 2 Mark Region and Other Advances in Garbage Collection PLDI08: Immix: A Mark-Region Collector With Space Efficiency, Fast Collection, and Mutator Performance Kathryn S. McKinley University of Texas at Austin Stephen M. Blackburn Australian National University

Isnt GC Marka bit retro? Mark-Sweep Compact Semi-Space McCarthy, 1960 Styger, 1967 Cheney, 1970 Languages without automated garbage collection are getting out of fashion. The chance of running into all kinds of memory problems is gradually outweighing the performance penalty you have to pay for garbage collection. Paul Paul Jansen,

Jansen, managing managing director director of of TIOBE TIOBE Software, Software, in in Dr Dr Dobbs, Dobbs, April April 2008 2008 4 GC Fundamentals The TimeSpace Tradeoff 5 GC Fundamentals The TimeSpace Tradeoff

Our Goal 6 GC Fundamentals Algorithmic Components Allocation Identification Reclamation Sweep-to-Free Free List Tracing (implicit) Compact

Bump Allocation Reference Counting (explicit) 3 ` Evacuate 1 7 GC Fundamentals Canonical Garbage Collectors Mark-Sweep [McCarthy

[McCarthy 1960] 1960] Sweep-to-Free Free-list + trace + sweep-to-free Mark-Compact [Styger [Styger 1967] 1967] Bump allocation + trace + compact Compact ` Evacuate Semi-Space

[Cheney [Cheney 1970] 1970] Bump allocation + trace + evacuate 8 Mark-Sweep Free List Allocation + Trace + Sweep-to-Free Space Space efficient efficient Poor Poor locality locality

Simple, Simple, very very fast fast collection collection Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 9 Mark-Compact Bump Allocation + Trace + Compact Good

Good locality locality Space Space efficient efficient Expensive Expensive multi-pass multi-pass collection collection

Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 10 Semi-Space Bump Allocation + Trace + Evacuation Good Good locality locality Space Space inefficient inefficient Space

Space inefficient inefficient Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 11 Mark-Region with Sweep-To-Region Reclamation Mark-Sweep Sweep-to-Free Free-list + trace + sweep-to-free Mark-Compact Compact

Bump allocation + trace + compact Semi-Space Evacuate ` Bump allocation + trace + evacuate Mark-Region Bump + trace + sweep-to-region Sweep-to-Region 12 Mark-Region

Bump Allocation + Trace + Sweep-to-Region Good Good locality locality Simple, Simple, very very fast fast collection collection Space

Space efficient efficient Excellent Excellent performance performance Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 13 Nave Mark-Region 0 Contiguous allocation into regions Excellent locality

For simplicity, objects cannot span regions Simple mark phase (like mark-sweep) Mark objects and their containing region Unmarked regions can be freed 14 Immix Efficient Mark-Region Garbage Collection 15 Lines and Blocks Large Regions More contiguous allocation

Fragmentation (false marking) Small Regions Fragmentation (cant fill blocks) N pages Free Increased metadata o/h Constrained object sizes Lines & Blocks Recyclable lines TLB locality, cache locality Objects span lines Less fragmentation Free

approx 1 cache line 0 lines Recyclable Block > 4 X max object size Lines marked with objects Fast common case 16 Allocation Policy (Recycling) Recycle partially marked blocks first Minimizes fragmentation Maximizes sharing of freed blocks Recycle in address order We explored other options

Allocate into free blocks last 17 Opportunistic Defragmentation Opportunistically evacuate fragmented blocks Lightweight, uses same allocation mechanism No cost in common case (specialized GC) 0 Identify source and target blocks (see paper for heuristics) Evacuate objects in source blocks Allocate into target blocks Opportunistic Leave in place if no space, or object pinned

18 Other Optimizations Implicit Marking Small objects implicitly mark next line Large objects mark lines exactly Most Line objects mark small V. Fast common case Implicit line mark Overflow Allocation Multi-line objects may skip many small holes

Overflow allocation (used on failure) Large objects uncommonV. effective solution 19 Results Complete data available at: http://cs.anu.edu.au/~Steve.Blackburn/pubs 20 Evaluation 20 Benchmarks Collectors DaCapo SPECjvm98

SPEC jbb2000 Full Heap Immix MarkSweep MarkCompact SemiSpace Generational GenIX GenMS GenCopy Sticky StickyIX StickyMS Methodology MMTk Jikes RVM 2.9.3 (Perf HotSpot 1.5)

Replay compiler Discard outliers Report 95th %ile Hardware Core 2 Duo 2.4GHz, 32KB L1, 4MB L2, 2GB RAM AMD Athlon 3500+ 2.2GHz, 64KB L1, 512KB `L2, 2GB RAM PowerPC 970 1.6GHz, 32KB L1, 512KB L2, 2GB RAM Please see the paper for details.

21 Mutator Time Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 22 Minimum Heap 23 GC Time Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 24 Total Performance Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 25

Generational Performance Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 26 Sticky Performance Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 27 PseudoJBB 2000 On 2.4GHz Core 2 Duo 28 PseudoJBB 2000 On 2.4GHz Core 2 Duo 29

Prior Work http://www.ibm.com/developerworks/ibm/library/igarbage1/ IBM product collector Mark-Region not characterized Collector not evaluated Product and basis for other research [Domani et al 2000][Kermany & Petrank 2006] 30 Mark-Region Collection Mark-Sweep Sweep-to-Free Free-list + trace + sweep-to-free Compact

Mark-Compact Bump allocation + trace + compact Evacuate ` Semi-Space Bump allocation + trace + evacuate Sweep-to-Region Mark-Region Bump allocation + trace + sweep-to-region 31 Immix Efficient Mark-Region Collection

Good Good locality locality Simple, Simple, very very fast fast collection collection Space Space

efficient efficient Excellent Excellent performance performance Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 32 Open Source Code available in JikesRVM 2.9.3 onward. http://www.jikesrvm.org Complete data available at:

http://cs.anu.edu.au/~Steve.Blackburn/pubs 33 Research History PLDI 1998 Clinger & Hanson postulated the radioactive decay model for object lifetimes Genesis of Older-First [Stefanovic, McKinley, Moss OOPSLA99] 34 Garbage Collection Hypotheses Generational hypothesis: younger objects die quickly, so collect them first Older-first hypothesis: the collector can

collect less the longer it waits Age ordered heap Survival function s(v) for object lifetime distribution s(v) younger 0 older 1/2V V 35

Older-first Algorithm 36 Next Steps Beltway [BJMM PLDI02] 0 1 3 4 5

6 7 8 9 10 33 34 35 36 37

38 39 40 Increments Belts Combines generational and older-first Ulterior Reference Counting [BM OOPSLA03] Reference count on-per-object basis Responsiveness and throughput MMTk: [BCM SIGMETRICS04 ICSE04] Toolkit for building & understanding GC Motivated todays work 37

Garbage Collection is the Answer to All Your Problems Improves data and code locality [Huang et al. OOPSLA02 ISMM04, VEE04] Cooperative GC optimizations Colocation [Guyer OOPSLA05] Free-me [Guyer et al. PLDI06] Finds leaks [Bond ASPLOS06, Jump POPL07] Tolerates leaks [Bond OOSLA08] Helps with dynamic software updating! [Subramaniam, Hicks ??08]

DaCapo Benchmarks [Blackburn et al. OOPSLA06 CACM08] 38 380C Where are we & where we are going Why you need to care about workloads Managed languages Dynamic compilation Inlining Garbage collection Opportunity to improve data locality on-the-fly Read: X. Huang, S. M. Blackburn, K. S. McKinley, J. E. B. Moss, Z. Wang, and P. Cheng, The Garbage Collection Advantage: Improving Program Locality, ACM Conference on Object Oriented Programming, Systems, Languages, and Applications (OOPSLA), pp. 69-80, Vancouver, Canada, October 2004.

Alias analysis Dependence analysis Loop transformations EDGE architectures

Recently Viewed Presentations

  • May 16- May 31 WHAT'S INSIDE... Direct Tax

    May 16- May 31 WHAT'S INSIDE... Direct Tax

    Luxottica India Eyewear Pvt. Ltd. ("the taxpayer") is a part of Luxottica group which is engaged in design, manufacture and distribution of sun glasses and prescription frames in mid and premium price categories.
  • Economics and Law - University of Cambridge

    Economics and Law - University of Cambridge

    Economics and Law CST part 1b Ross Anderson, Richard Clayton Philosophies of ethics Authority theories mostly derive from religion. But God usually talks via scriptures or a priesthood; so how do you resolve disputes?
  • Chapter 26 Bankruptcy, Workouts, and Corporate Reorganization

    Chapter 26 Bankruptcy, Workouts, and Corporate Reorganization

    Doctrine. XYZ Corporation has just completed the sale of all of its assets for $1 million. Following the absolute priority of claims doctrine, show how these proceeds should be distributed to the creditors and owners of XYZ Corporation.
  • Analysis of Algorithms

    Analysis of Algorithms

    Priority Queue ADT (§ 7.1.3) A priority queue stores a collection of entries Each entry is a pair (key, value) Main methods of the Priority Queue ADT insert(k, x) inserts an entry with key k and value x removeMin() removes...
  • Profitable Re-utilization of Floating Plastics in The Indian ...

    Profitable Re-utilization of Floating Plastics in The Indian ...

    Brick with e-Waste contents Other shapes/sizes of Timber /Wood & Brick substitutes are Boards/ Panels/legs/ support stands & Slabs/ ledges/ blocks /side beam or border stone as well as Marble/ Granite slabs. ... WIPO) European Patent Office/EPO has issued a...
  • Unit 9 Review - DavidChapman.Org

    Unit 9 Review - DavidChapman.Org

    Civil Rights Movement. More direct action in the 1950s and 60s than previously. Non-violent but direct confrontation under MLK, CORE, SCLC, SNCC. More militant and prone to violence under Stokely Carmichael (SNCC, Black Power) and Malcolm X (Black Muslims or...
  • What Keeps You Awake at Night?Risk Managing 1

    What Keeps You Awake at Night?Risk Managing 1

    Definition: Broadly defined, risk management includes any activity, process or policy to reduce liability exposure. From both a client safety and a financial perspective, it is vital that providers conduct risk management activities aimed at preventing harm to clients and...
  • Chapter 1

    Chapter 1

    The 10 radix (base) For any radix r, the value of a number with n-digit integer part and m-digit fraction part, can be expressed: The digit value The weight given by the ri , where i represent the position of...