380C Where are we & where we are going Managed languages Dynamic compilation Inlining Garbage collection What else can you do when you examine the heap a lot? Why you need to care about workloads
Alias analysis Dependence analysis Loop transformations EDGE architectures 1 380C lecture 18 Garbage Collection Why use garbage collection? What is garbage? Reachable vs live, stack maps, etc. Allocators and their collection mechanisms Semispace Marksweep Performance comparisons
Mark Region Incremental age based collection Write barriers: Friend or foe? Generational Beltway 2 Mark Region and Other Advances in Garbage Collection PLDI08: Immix: A Mark-Region Collector With Space Efficiency, Fast Collection, and Mutator Performance Kathryn S. McKinley University of Texas at Austin Stephen M. Blackburn Australian National University
Isnt GC Marka bit retro? Mark-Sweep Compact Semi-Space McCarthy, 1960 Styger, 1967 Cheney, 1970 Languages without automated garbage collection are getting out of fashion. The chance of running into all kinds of memory problems is gradually outweighing the performance penalty you have to pay for garbage collection. Paul Paul Jansen,
Jansen, managing managing director director of of TIOBE TIOBE Software, Software, in in Dr Dr Dobbs, Dobbs, April April 2008 2008 4 GC Fundamentals The TimeSpace Tradeoff 5 GC Fundamentals The TimeSpace Tradeoff
[Cheney [Cheney 1970] 1970] Bump allocation + trace + evacuate 8 Mark-Sweep Free List Allocation + Trace + Sweep-to-Free Space Space efficient efficient Poor Poor locality locality
Simple, Simple, very very fast fast collection collection Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 9 Mark-Compact Bump Allocation + Trace + Compact Good
Good locality locality Space Space efficient efficient Expensive Expensive multi-pass multi-pass collection collection
Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 10 Semi-Space Bump Allocation + Trace + Evacuation Good Good locality locality Space Space inefficient inefficient Space
Space inefficient inefficient Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 11 Mark-Region with Sweep-To-Region Reclamation Mark-Sweep Sweep-to-Free Free-list + trace + sweep-to-free Mark-Compact Compact
Bump Allocation + Trace + Sweep-to-Region Good Good locality locality Simple, Simple, very very fast fast collection collection Space
Space efficient efficient Excellent Excellent performance performance Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 13 Nave Mark-Region 0 Contiguous allocation into regions Excellent locality
For simplicity, objects cannot span regions Simple mark phase (like mark-sweep) Mark objects and their containing region Unmarked regions can be freed 14 Immix Efficient Mark-Region Garbage Collection 15 Lines and Blocks Large Regions More contiguous allocation
Fragmentation (false marking) Small Regions Fragmentation (cant fill blocks) N pages Free Increased metadata o/h Constrained object sizes Lines & Blocks Recyclable lines TLB locality, cache locality Objects span lines Less fragmentation Free
approx 1 cache line 0 lines Recyclable Block > 4 X max object size Lines marked with objects Fast common case 16 Allocation Policy (Recycling) Recycle partially marked blocks first Minimizes fragmentation Maximizes sharing of freed blocks Recycle in address order We explored other options
Allocate into free blocks last 17 Opportunistic Defragmentation Opportunistically evacuate fragmented blocks Lightweight, uses same allocation mechanism No cost in common case (specialized GC) 0 Identify source and target blocks (see paper for heuristics) Evacuate objects in source blocks Allocate into target blocks Opportunistic Leave in place if no space, or object pinned
18 Other Optimizations Implicit Marking Small objects implicitly mark next line Large objects mark lines exactly Most Line objects mark small V. Fast common case Implicit line mark Overflow Allocation Multi-line objects may skip many small holes
Overflow allocation (used on failure) Large objects uncommonV. effective solution 19 Results Complete data available at: http://cs.anu.edu.au/~Steve.Blackburn/pubs 20 Evaluation 20 Benchmarks Collectors DaCapo SPECjvm98
Replay compiler Discard outliers Report 95th %ile Hardware Core 2 Duo 2.4GHz, 32KB L1, 4MB L2, 2GB RAM AMD Athlon 3500+ 2.2GHz, 64KB L1, 512KB `L2, 2GB RAM PowerPC 970 1.6GHz, 32KB L1, 512KB L2, 2GB RAM Please see the paper for details.
21 Mutator Time Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 22 Minimum Heap 23 GC Time Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 24 Total Performance Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 25
Generational Performance Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 26 Sticky Performance Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 27 PseudoJBB 2000 On 2.4GHz Core 2 Duo 28 PseudoJBB 2000 On 2.4GHz Core 2 Duo 29
Prior Work http://www.ibm.com/developerworks/ibm/library/igarbage1/ IBM product collector Mark-Region not characterized Collector not evaluated Product and basis for other research [Domani et al 2000][Kermany & Petrank 2006] 30 Mark-Region Collection Mark-Sweep Sweep-to-Free Free-list + trace + sweep-to-free Compact
Good Good locality locality Simple, Simple, very very fast fast collection collection Space Space
efficient efficient Excellent Excellent performance performance Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 32 Open Source Code available in JikesRVM 2.9.3 onward. http://www.jikesrvm.org Complete data available at:
http://cs.anu.edu.au/~Steve.Blackburn/pubs 33 Research History PLDI 1998 Clinger & Hanson postulated the radioactive decay model for object lifetimes Genesis of Older-First [Stefanovic, McKinley, Moss OOPSLA99] 34 Garbage Collection Hypotheses Generational hypothesis: younger objects die quickly, so collect them first Older-first hypothesis: the collector can
collect less the longer it waits Age ordered heap Survival function s(v) for object lifetime distribution s(v) younger 0 older 1/2V V 35
38 39 40 Increments Belts Combines generational and older-first Ulterior Reference Counting [BM OOPSLA03] Reference count on-per-object basis Responsiveness and throughput MMTk: [BCM SIGMETRICS04 ICSE04] Toolkit for building & understanding GC Motivated todays work 37
Garbage Collection is the Answer to All Your Problems Improves data and code locality [Huang et al. OOPSLA02 ISMM04, VEE04] Cooperative GC optimizations Colocation [Guyer OOPSLA05] Free-me [Guyer et al. PLDI06] Finds leaks [Bond ASPLOS06, Jump POPL07] Tolerates leaks [Bond OOSLA08] Helps with dynamic software updating! [Subramaniam, Hicks ??08]
DaCapo Benchmarks [Blackburn et al. OOPSLA06 CACM08] 38 380C Where are we & where we are going Why you need to care about workloads Managed languages Dynamic compilation Inlining Garbage collection Opportunity to improve data locality on-the-fly Read: X. Huang, S. M. Blackburn, K. S. McKinley, J. E. B. Moss, Z. Wang, and P. Cheng, The Garbage Collection Advantage: Improving Program Locality, ACM Conference on Object Oriented Programming, Systems, Languages, and Applications (OOPSLA), pp. 69-80, Vancouver, Canada, October 2004.
Alias analysis Dependence analysis Loop transformations EDGE architectures
We are proud to present you the winners of our contest: SAFETY AT WORK And the winner is … The city council maintenance team ! And finally, a special prize to : The shooting gallery assistant * * * *...
value driver (KVD) formula. The key value driver formula is superior to alternative methodologies because it is cash flow based and links cash flow to growth and ROIC. Economic-profit model. The economic-profit model leads to results consistent with the KVD...
1990 z Vesoljskim čolničkom Discovery reflektor tipa Cassegrain, zrcalo premera 240 cm 569 km nad tlemi, obhod v 97 minutah (8 km/s) več inštrumentov, ki merijo v različnih valovnih dolžinah sprva neostre slike zaradi sferične aberacije glavnega zrcala od popravila...
Provide training to all managers, supervisors and workers as well as contractors and temporary workers on: safety policies and procedures, program functions, emergencies, injury illness reporting, and their rights under the OSH Act. Ensure the training is provided in a...
Calculate solubility of a salt in the presence of a common ion: Estimate the solubility of Ag 2 SO 4 in a solution containing 0.10M Na 2 SO 4. Ksp= 1.2x10-5(similar process to last problem except that the concentration of...
Global Change Information System(GCIS) Long Term Vision: The Global Change Information System (GCIS) is intended to eventually become a unified web based source of authoritative, accessible, usable and timely information about climate and global change for use by scientists, decision...
Specialized or Modified Leaves. In pine trees, the leaves are adapted to living in a dry environment too. Water is locked up as ice during significant portions of the year and therefore not available to the plant; pine leaves possess...