380C Where are we & where we are going Managed languages Dynamic compilation Inlining Garbage collection What else can you do when you examine the heap a lot? Why you need to care about workloads
Alias analysis Dependence analysis Loop transformations EDGE architectures 1 380C lecture 18 Garbage Collection Why use garbage collection? What is garbage? Reachable vs live, stack maps, etc. Allocators and their collection mechanisms Semispace Marksweep Performance comparisons
Mark Region Incremental age based collection Write barriers: Friend or foe? Generational Beltway 2 Mark Region and Other Advances in Garbage Collection PLDI08: Immix: A Mark-Region Collector With Space Efficiency, Fast Collection, and Mutator Performance Kathryn S. McKinley University of Texas at Austin Stephen M. Blackburn Australian National University
Isnt GC Marka bit retro? Mark-Sweep Compact Semi-Space McCarthy, 1960 Styger, 1967 Cheney, 1970 Languages without automated garbage collection are getting out of fashion. The chance of running into all kinds of memory problems is gradually outweighing the performance penalty you have to pay for garbage collection. Paul Paul Jansen,
Jansen, managing managing director director of of TIOBE TIOBE Software, Software, in in Dr Dr Dobbs, Dobbs, April April 2008 2008 4 GC Fundamentals The TimeSpace Tradeoff 5 GC Fundamentals The TimeSpace Tradeoff
[Cheney [Cheney 1970] 1970] Bump allocation + trace + evacuate 8 Mark-Sweep Free List Allocation + Trace + Sweep-to-Free Space Space efficient efficient Poor Poor locality locality
Simple, Simple, very very fast fast collection collection Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 9 Mark-Compact Bump Allocation + Trace + Compact Good
Good locality locality Space Space efficient efficient Expensive Expensive multi-pass multi-pass collection collection
Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 10 Semi-Space Bump Allocation + Trace + Evacuation Good Good locality locality Space Space inefficient inefficient Space
Space inefficient inefficient Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 11 Mark-Region with Sweep-To-Region Reclamation Mark-Sweep Sweep-to-Free Free-list + trace + sweep-to-free Mark-Compact Compact
Bump Allocation + Trace + Sweep-to-Region Good Good locality locality Simple, Simple, very very fast fast collection collection Space
Space efficient efficient Excellent Excellent performance performance Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 13 Nave Mark-Region 0 Contiguous allocation into regions Excellent locality
For simplicity, objects cannot span regions Simple mark phase (like mark-sweep) Mark objects and their containing region Unmarked regions can be freed 14 Immix Efficient Mark-Region Garbage Collection 15 Lines and Blocks Large Regions More contiguous allocation
Fragmentation (false marking) Small Regions Fragmentation (cant fill blocks) N pages Free Increased metadata o/h Constrained object sizes Lines & Blocks Recyclable lines TLB locality, cache locality Objects span lines Less fragmentation Free
approx 1 cache line 0 lines Recyclable Block > 4 X max object size Lines marked with objects Fast common case 16 Allocation Policy (Recycling) Recycle partially marked blocks first Minimizes fragmentation Maximizes sharing of freed blocks Recycle in address order We explored other options
Allocate into free blocks last 17 Opportunistic Defragmentation Opportunistically evacuate fragmented blocks Lightweight, uses same allocation mechanism No cost in common case (specialized GC) 0 Identify source and target blocks (see paper for heuristics) Evacuate objects in source blocks Allocate into target blocks Opportunistic Leave in place if no space, or object pinned
18 Other Optimizations Implicit Marking Small objects implicitly mark next line Large objects mark lines exactly Most Line objects mark small V. Fast common case Implicit line mark Overflow Allocation Multi-line objects may skip many small holes
Overflow allocation (used on failure) Large objects uncommonV. effective solution 19 Results Complete data available at: http://cs.anu.edu.au/~Steve.Blackburn/pubs 20 Evaluation 20 Benchmarks Collectors DaCapo SPECjvm98
Replay compiler Discard outliers Report 95th %ile Hardware Core 2 Duo 2.4GHz, 32KB L1, 4MB L2, 2GB RAM AMD Athlon 3500+ 2.2GHz, 64KB L1, 512KB `L2, 2GB RAM PowerPC 970 1.6GHz, 32KB L1, 512KB L2, 2GB RAM Please see the paper for details.
21 Mutator Time Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 22 Minimum Heap 23 GC Time Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 24 Total Performance Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 25
Generational Performance Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 26 Sticky Performance Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo 27 PseudoJBB 2000 On 2.4GHz Core 2 Duo 28 PseudoJBB 2000 On 2.4GHz Core 2 Duo 29
Prior Work http://www.ibm.com/developerworks/ibm/library/igarbage1/ IBM product collector Mark-Region not characterized Collector not evaluated Product and basis for other research [Domani et al 2000][Kermany & Petrank 2006] 30 Mark-Region Collection Mark-Sweep Sweep-to-Free Free-list + trace + sweep-to-free Compact
Good Good locality locality Simple, Simple, very very fast fast collection collection Space Space
efficient efficient Excellent Excellent performance performance Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo 32 Open Source Code available in JikesRVM 2.9.3 onward. http://www.jikesrvm.org Complete data available at:
http://cs.anu.edu.au/~Steve.Blackburn/pubs 33 Research History PLDI 1998 Clinger & Hanson postulated the radioactive decay model for object lifetimes Genesis of Older-First [Stefanovic, McKinley, Moss OOPSLA99] 34 Garbage Collection Hypotheses Generational hypothesis: younger objects die quickly, so collect them first Older-first hypothesis: the collector can
collect less the longer it waits Age ordered heap Survival function s(v) for object lifetime distribution s(v) younger 0 older 1/2V V 35
38 39 40 Increments Belts Combines generational and older-first Ulterior Reference Counting [BM OOPSLA03] Reference count on-per-object basis Responsiveness and throughput MMTk: [BCM SIGMETRICS04 ICSE04] Toolkit for building & understanding GC Motivated todays work 37
Garbage Collection is the Answer to All Your Problems Improves data and code locality [Huang et al. OOPSLA02 ISMM04, VEE04] Cooperative GC optimizations Colocation [Guyer OOPSLA05] Free-me [Guyer et al. PLDI06] Finds leaks [Bond ASPLOS06, Jump POPL07] Tolerates leaks [Bond OOSLA08] Helps with dynamic software updating! [Subramaniam, Hicks ??08]
DaCapo Benchmarks [Blackburn et al. OOPSLA06 CACM08] 38 380C Where are we & where we are going Why you need to care about workloads Managed languages Dynamic compilation Inlining Garbage collection Opportunity to improve data locality on-the-fly Read: X. Huang, S. M. Blackburn, K. S. McKinley, J. E. B. Moss, Z. Wang, and P. Cheng, The Garbage Collection Advantage: Improving Program Locality, ACM Conference on Object Oriented Programming, Systems, Languages, and Applications (OOPSLA), pp. 69-80, Vancouver, Canada, October 2004.
Alias analysis Dependence analysis Loop transformations EDGE architectures
Luxottica India Eyewear Pvt. Ltd. ("the taxpayer") is a part of Luxottica group which is engaged in design, manufacture and distribution of sun glasses and prescription frames in mid and premium price categories.
Economics and Law CST part 1b Ross Anderson, Richard Clayton Philosophies of ethics Authority theories mostly derive from religion. But God usually talks via scriptures or a priesthood; so how do you resolve disputes?
Doctrine. XYZ Corporation has just completed the sale of all of its assets for $1 million. Following the absolute priority of claims doctrine, show how these proceeds should be distributed to the creditors and owners of XYZ Corporation.
Priority Queue ADT (§ 7.1.3) A priority queue stores a collection of entries Each entry is a pair (key, value) Main methods of the Priority Queue ADT insert(k, x) inserts an entry with key k and value x removeMin() removes...
Brick with e-Waste contents Other shapes/sizes of Timber /Wood & Brick substitutes are Boards/ Panels/legs/ support stands & Slabs/ ledges/ blocks /side beam or border stone as well as Marble/ Granite slabs. ... WIPO) European Patent Office/EPO has issued a...
Civil Rights Movement. More direct action in the 1950s and 60s than previously. Non-violent but direct confrontation under MLK, CORE, SCLC, SNCC. More militant and prone to violence under Stokely Carmichael (SNCC, Black Power) and Malcolm X (Black Muslims or...
Definition: Broadly defined, risk management includes any activity, process or policy to reduce liability exposure. From both a client safety and a financial perspective, it is vital that providers conduct risk management activities aimed at preventing harm to clients and...
The 10 radix (base) For any radix r, the value of a number with n-digit integer part and m-digit fraction part, can be expressed: The digit value The weight given by the ri , where i represent the position of...
Ready to download the document? Go ahead and hit continue!