380C Where are we & where we are

380C  Where are we & where we are

380C Where are we & where we are going Managed languages Dynamic compilation Inlining Garbage collection What else can you do when you examine the heap a lot? Why you need to care about workloads

Alias analysis Dependence analysis Loop transformations EDGE architectures 1 380C lecture 18 Garbage Collection Why use garbage collection? What is garbage? Reachable vs live, stack maps, etc. Allocators and their collection mechanisms Semispace Marksweep Performance comparisons

Mark Region Incremental age based collection Write barriers: Friend or foe? Generational Beltway 2 Mark Region and Other Advances in Garbage Collection PLDI08: Immix: A Mark-Region Collector With Space Efficiency, Fast Collection, and Mutator Performance Kathryn S. McKinley University of Texas at Austin Stephen M. Blackburn Australian National University

Isnt GC Marka bit retro? Mark-Sweep Compact Semi-Space McCarthy, 1960 Styger, 1967 Cheney, 1970 Languages without automated garbage collection are getting out of fashion. The chance of running into all kinds of memory problems is gradually outweighing the performance penalty you have to pay for garbage collection. Paul Paul Jansen,

Jansen, managing managing director director of of TIOBE TIOBE Software, Software, in in Dr Dr Dobbs, Dobbs, April April 2008 2008 4 GC Fundamentals The TimeSpace Tradeoff 5 GC Fundamentals The TimeSpace Tradeoff

Our Goal 6 GC Fundamentals Algorithmic Components Allocation Identification Reclamation Sweep-to-Free Free List Tracing (implicit) Compact

Bump Allocation Reference Counting (explicit) 3 ` Evacuate 1 7 GC Fundamentals Canonical Garbage Collectors Mark-Sweep [McCarthy

[McCarthy 1960] 1960] Sweep-to-Free Free-list + trace + sweep-to-free Mark-Compact [Styger [Styger 1967] 1967] Bump allocation + trace + compact Compact ` Evacuate Semi-Space

[Cheney [Cheney 1970] 1970] Bump allocation + trace + evacuate 8 Mark-Sweep Free List Allocation + Trace + Sweep-to-Free Space Space efficient efficient Poor Poor locality locality

Simple, Simple, very very fast fast collection collection Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 9 Mark-Compact Bump Allocation + Trace + Compact Good

Good locality locality Space Space efficient efficient Expensive Expensive multi-pass multi-pass collection collection

Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 10 Semi-Space Bump Allocation + Trace + Evacuation Good Good locality locality Space Space inefficient inefficient Space

Space inefficient inefficient Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 11 Mark-Region with Sweep-To-Region Reclamation Mark-Sweep Sweep-to-Free Free-list + trace + sweep-to-free Mark-Compact Compact

Bump allocation + trace + compact Semi-Space Evacuate ` Bump allocation + trace + evacuate Mark-Region Bump + trace + sweep-to-region Sweep-to-Region 12 Mark-Region

Bump Allocation + Trace + Sweep-to-Region Good Good locality locality Simple, Simple, very very fast fast collection collection Space

Space efficient efficient Excellent Excellent performance performance Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 13 Nave Mark-Region 0 Contiguous allocation into regions Excellent locality

For simplicity, objects cannot span regions Simple mark phase (like mark-sweep) Mark objects and their containing region Unmarked regions can be freed 14 Immix Efficient Mark-Region Garbage Collection 15 Lines and Blocks Large Regions More contiguous allocation

Fragmentation (false marking) Small Regions Fragmentation (cant fill blocks) N pages Free Increased metadata o/h Constrained object sizes Lines & Blocks Recyclable lines TLB locality, cache locality Objects span lines Less fragmentation Free

approx 1 cache line 0 lines Recyclable Block > 4 X max object size Lines marked with objects Fast common case 16 Allocation Policy (Recycling) Recycle partially marked blocks first Minimizes fragmentation Maximizes sharing of freed blocks Recycle in address order We explored other options

Allocate into free blocks last 17 Opportunistic Defragmentation Opportunistically evacuate fragmented blocks Lightweight, uses same allocation mechanism No cost in common case (specialized GC) 0 Identify source and target blocks (see paper for heuristics) Evacuate objects in source blocks Allocate into target blocks Opportunistic Leave in place if no space, or object pinned

18 Other Optimizations Implicit Marking Small objects implicitly mark next line Large objects mark lines exactly Most Line objects mark small V. Fast common case Implicit line mark Overflow Allocation Multi-line objects may skip many small holes

Overflow allocation (used on failure) Large objects uncommonV. effective solution 19 Results Complete data available at: http://cs.anu.edu.au/~Steve.Blackburn/pubs 20 Evaluation 20 Benchmarks Collectors DaCapo SPECjvm98

SPEC jbb2000 Full Heap Immix MarkSweep MarkCompact SemiSpace Generational GenIX GenMS GenCopy Sticky StickyIX StickyMS Methodology MMTk Jikes RVM 2.9.3 (Perf HotSpot 1.5)

Replay compiler Discard outliers Report 95th %ile Hardware Core 2 Duo 2.4GHz, 32KB L1, 4MB L2, 2GB RAM AMD Athlon 3500+ 2.2GHz, 64KB L1, 512KB `L2, 2GB RAM PowerPC 970 1.6GHz, 32KB L1, 512KB L2, 2GB RAM Please see the paper for details.

21 Mutator Time Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 22 Minimum Heap 23 GC Time Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 24 Total Performance Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 25

Generational Performance Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 26 Sticky Performance Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 27 PseudoJBB 2000 On 2.4GHz Core 2 Duo 28 PseudoJBB 2000 On 2.4GHz Core 2 Duo 29

Prior Work http://www.ibm.com/developerworks/ibm/library/igarbage1/ IBM product collector Mark-Region not characterized Collector not evaluated Product and basis for other research [Domani et al 2000][Kermany & Petrank 2006] 30 Mark-Region Collection Mark-Sweep Sweep-to-Free Free-list + trace + sweep-to-free Compact

Mark-Compact Bump allocation + trace + compact Evacuate ` Semi-Space Bump allocation + trace + evacuate Sweep-to-Region Mark-Region Bump allocation + trace + sweep-to-region 31 Immix Efficient Mark-Region Collection

Good Good locality locality Simple, Simple, very very fast fast collection collection Space Space

efficient efficient Excellent Excellent performance performance Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 32 Open Source Code available in JikesRVM 2.9.3 onward. http://www.jikesrvm.org Complete data available at:

http://cs.anu.edu.au/~Steve.Blackburn/pubs 33 Research History PLDI 1998 Clinger & Hanson postulated the radioactive decay model for object lifetimes Genesis of Older-First [Stefanovic, McKinley, Moss OOPSLA99] 34 Garbage Collection Hypotheses Generational hypothesis: younger objects die quickly, so collect them first Older-first hypothesis: the collector can

collect less the longer it waits Age ordered heap Survival function s(v) for object lifetime distribution s(v) younger 0 older 1/2V V 35

Older-first Algorithm 36 Next Steps Beltway [BJMM PLDI02] 0 1 3 4 5

6 7 8 9 10 33 34 35 36 37

38 39 40 Increments Belts Combines generational and older-first Ulterior Reference Counting [BM OOPSLA03] Reference count on-per-object basis Responsiveness and throughput MMTk: [BCM SIGMETRICS04 ICSE04] Toolkit for building & understanding GC Motivated todays work 37

Garbage Collection is the Answer to All Your Problems Improves data and code locality [Huang et al. OOPSLA02 ISMM04, VEE04] Cooperative GC optimizations Colocation [Guyer OOPSLA05] Free-me [Guyer et al. PLDI06] Finds leaks [Bond ASPLOS06, Jump POPL07] Tolerates leaks [Bond OOSLA08] Helps with dynamic software updating! [Subramaniam, Hicks ??08]

DaCapo Benchmarks [Blackburn et al. OOPSLA06 CACM08] 38 380C Where are we & where we are going Why you need to care about workloads Managed languages Dynamic compilation Inlining Garbage collection Opportunity to improve data locality on-the-fly Read: X. Huang, S. M. Blackburn, K. S. McKinley, J. E. B. Moss, Z. Wang, and P. Cheng, The Garbage Collection Advantage: Improving Program Locality, ACM Conference on Object Oriented Programming, Systems, Languages, and Applications (OOPSLA), pp. 69-80, Vancouver, Canada, October 2004.

Alias analysis Dependence analysis Loop transformations EDGE architectures

Recently Viewed Presentations

  • PowerPoint-Präsentation

    PowerPoint-Präsentation

    We are proud to present you the winners of our contest: SAFETY AT WORK And the winner is … The city council maintenance team ! And finally, a special prize to : The shooting gallery assistant * * * *...
  • Valuation 5e - Chapter 2 - Finance Department

    Valuation 5e - Chapter 2 - Finance Department

    value driver (KVD) formula. The key value driver formula is superior to alternative methodologies because it is cash flow based and links cash flow to growth and ROIC. Economic-profit model. The economic-profit model leads to results consistent with the KVD...
  • Teleskop - Gimnazija Šentvid

    Teleskop - Gimnazija Šentvid

    1990 z Vesoljskim čolničkom Discovery reflektor tipa Cassegrain, zrcalo premera 240 cm 569 km nad tlemi, obhod v 97 minutah (8 km/s) več inštrumentov, ki merijo v različnih valovnih dolžinah sprva neostre slike zaradi sferične aberacije glavnega zrcala od popravila...
  • Safety and Health Programs Slide Presentation

    Safety and Health Programs Slide Presentation

    Provide training to all managers, supervisors and workers as well as contractors and temporary workers on: safety policies and procedures, program functions, emergencies, injury illness reporting, and their rights under the OSH Act. Ensure the training is provided in a...
  • Calculating Ksp

    Calculating Ksp

    Calculate solubility of a salt in the presence of a common ion: Estimate the solubility of Ag 2 SO 4 in a solution containing 0.10M Na 2 SO 4. Ksp= 1.2x10-5(similar process to last problem except that the concentration of...
  • Global Change Information System(GCIS)

    Global Change Information System(GCIS)

    Global Change Information System(GCIS) Long Term Vision: The Global Change Information System (GCIS) is intended to eventually become a unified web based source of authoritative, accessible, usable and timely information about climate and global change for use by scientists, decision...
  • Leaves - Ppt

    Leaves - Ppt

    Specialized or Modified Leaves. In pine trees, the leaves are adapted to living in a dry environment too. Water is locked up as ice during significant portions of the year and therefore not available to the plant; pine leaves possess...
  • AMERICAS ARMY: THE STRENGTH OF THE NATION CBA

    AMERICAS ARMY: THE STRENGTH OF THE NATION CBA

    The garage cost $15 per month to rent. After the car was assembled, it was clear that it would not be able to fit through the door of the rented garage. The car was to be used as a model...