Rethinking Energy-Performance Trade-Off in Mobile Web Page Loading

Duc Hoang Bui, Yunxin Liu, Hyosu Kim, Insik Shin, Feng Zhao

Motivation

Web browsers: high energy consumption
  Core applications on smartphones
Battery-powered smartphones: limited energy
  Necessary to reduce energy consumption
User experience: a factor that cannot be compromised
  A 1-second delay in the Bing search engine results in a 2.8% drop in revenue per user [1]
  "As users migrate to mobile, page load time is perhaps the most important metric we have" [2]

[1] O'Reilly Velocity Web Performance and Operations Conference, 2009
[2] Howard Mittman, VP and publisher of Condé Nast, 2015

Goal

Reduce the energy consumption of web page loading without degrading user experience
  No increase in page load time

Approach

Analyze the architectures and behaviors of popular mobile web browsers
  Chromium, Firefox, and UC Browser on Android
  Note: Chrome = Chromium + proprietary technologies
Identify energy inefficiency issues
Develop energy saving techniques
Evaluate on the top 100 U.S. websites
  Save significant system energy (e.g., 24% on average) while not increasing page load time

Energy inefficiency issues

Mobile browsers are optimized for performance, not energy
  Direct port from desktop versions
  Maximum processing speed regardless of input data
  Overhead and redundant computation
  Underutilization of heterogeneous architectures

Energy inefficiency issues (1/3)

High energy cost of progressive web resource processing
  For each small piece of data, the whole data rendering pipeline is executed
  E.g., read system calls return only 1.3 KB of data on average from the network
  [Figure: timeline in which each arriving piece of data triggers its own processing step]
High inter-process communication (IPC) overhead
  Multi-process browser architecture
  [Figure: Chromium web browser architecture. Components shown: the Internet, Browser Process, IO thread, Renderer Process, Rendering Engine, Main thread, GPU Thread, Compositor. Legend: data and control flow, process boundary]

Energy inefficiency issues (2/3)

Unnecessarily high painting rate
  Visible screen changes can be very small during web page loading
  Painting: from models in memory to pixels on screen
  [Figure: rendering pipeline from resource (HTML document) to rendering model (Document Object Model (DOM) tree), then painting to an image (pixels)]
  E.g., loading instagram.com, which contains no animation
    Average of 23-32 frames/s (Chromium, Firefox); fixed 60 frames/s on UC Browser
    90% of paints generate zero visible changes on screen (in Chromium): off-screen paints
  [Figure: histogram of the number of paints by screen changes per paint (%)]
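The "screen changes per paint" measure used above can be illustrated with a minimal sketch. It assumes each painted frame is available as rows of packed pixel values; the frame representation and both function names are hypothetical and are not taken from the slides, which do not say how the authors measured screen changes.

```python
from typing import Sequence

# Hypothetical frame representation: rows of packed pixel values.
Frame = Sequence[Sequence[int]]


def screen_change_percent(prev: Frame, curr: Frame) -> float:
    """Return the percentage of pixels that differ between two same-size frames."""
    if len(prev) != len(curr) or any(len(a) != len(b) for a, b in zip(prev, curr)):
        raise ValueError("frames must have identical dimensions")
    total = sum(len(row) for row in curr)
    if total == 0:
        return 0.0
    changed = sum(
        1
        for row_prev, row_curr in zip(prev, curr)
        for p, c in zip(row_prev, row_curr)
        if p != c
    )
    return 100.0 * changed / total


def zero_change_share(frames: Sequence[Frame]) -> float:
    """Fraction of paints (consecutive frame pairs) that change no visible pixel."""
    pairs = list(zip(frames, frames[1:]))
    if not pairs:
        return 0.0
    unchanged = sum(1 for a, b in pairs if screen_change_percent(a, b) == 0.0)
    return unchanged / len(pairs)
```

A share close to 0.9 from zero_change_share would correspond to the 90% figure reported for Chromium.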

Energy inefficiency issues (3/3)

Underutilization of energy-efficient little cores on the big.LITTLE architecture
  The current OS scheduler schedules threads based on load instead of quality of service (QoS)
  E.g., loading instagram.com
    Chromium: 89% of thread execution time on big cores
    Firefox: 84% of Gecko rendering engine time on big cores
  [Figure (a): energy consumption (uAh) versus frequency (MHz, 500 to 1900) for little and big cores on the Samsung S5 Exynos]
  [Figure (b): execution time (%) of Chromium web browser threads on big cores and little cores]

Energy saving techniques

Rethink the energy-performance trade-off
  Energy consumption: a first-class citizen on smartphones

Reduce redundant computation
Adjust processing to the user-perceived content changes
Exploit the energy efficiency of heterogeneous architectures

Network-aware Resource Processing

Perform batch processing of web resources
  Reduce the overhead of processing small pieces of data
Trade-off: energy saving vs. delay
  Large batch size: lower energy but higher delay
  Progressive processing: lower delay but higher overhead
  [Figure: timelines comparing progressive processing (each piece of data processed on arrival) with batch processing (several pieces of data processed together)]
Batch size: adaptive to user-perceived content changes
  Download speed: a light-weight approximation of user-perceived content changes
  Increase the batch size on fast networks to save more energy
  Decrease the batch size on slow networks to reduce delay
Batch size determined by a buffer threshold
  A parameter determines the maximum delay for a chunk of data (e.g., 0.5 sec)
  [Figure: buffer size over time against the buffer threshold; the batch size corresponds to the maximum delay time in the buffer]
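The buffer-threshold idea of Network-aware Resource Processing can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the slides give only the 0.5 sec example delay, so the rule "threshold = bytes that arrive within the maximum delay at the current download speed", the minimum threshold, and the class and method names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BufferedResourceHandler:
    """Hypothetical batcher: hold network chunks until a speed-dependent threshold."""

    max_delay_s: float = 0.5         # maximum wait for a chunk (example value from the slides)
    min_threshold_bytes: int = 1024  # assumed floor so slow links still batch a little
    _chunks: List[bytes] = field(default_factory=list)
    _buffered: int = 0
    _oldest_ts: Optional[float] = None

    def threshold_bytes(self, download_bps: float) -> int:
        # Assumed rule: the bytes that arrive within max_delay_s at the current speed.
        return max(self.min_threshold_bytes, int(download_bps / 8 * self.max_delay_s))

    def on_data(self, chunk: bytes, now_s: float, download_bps: float) -> Optional[bytes]:
        """Buffer a chunk; return a batch once the threshold or the delay bound is hit."""
        if self._oldest_ts is None:
            self._oldest_ts = now_s
        self._chunks.append(chunk)
        self._buffered += len(chunk)
        waited = now_s - self._oldest_ts
        if self._buffered >= self.threshold_bytes(download_bps) or waited >= self.max_delay_s:
            return self.flush()
        return None

    def flush(self) -> Optional[bytes]:
        """Hand the buffered data to the rendering pipeline as one batch."""
        if not self._chunks:
            return None
        batch = b"".join(self._chunks)
        self._chunks.clear()
        self._buffered = 0
        self._oldest_ts = None
        return batch
```

With this rule a faster network yields a larger threshold and therefore larger batches, while the delay bound keeps any chunk from waiting longer than max_delay_s on a slow network.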

Adaptive Content Painting

Aggregate multiple content paints
  Reduce unnecessary computation of paints with small visible changes
  [Figure: timelines comparing a high painting rate (Content1 and Content2 painted and displayed separately as Paint1/Display1 and Paint2/Display2) with an adaptive painting rate (Content1 and Content2 aggregated into a single PaintA, shown at Display2)]
Trade-off between user experience (UX) and energy
  Low frame rate: less energy but worse UX
  High frame rate: smoother UX but higher energy consumption
paint_rate parameter: maximum content painting rate
  Dynamically adapts to the speed of content changes
  Light-weight approach
    Increase linearly when content changes fast
    Decrease to a minimum value when content changes slowly
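The paint_rate adaptation described above (linear increase when content changes fast, drop to a minimum when it changes slowly) can be sketched as below. Only the paint_rate name comes from the slides; the numeric bounds, the step, the "fast change" threshold, and the class and method names are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class PaintRateController:
    """Hypothetical controller for the maximum content painting rate."""

    min_rate: float = 5.0         # assumed floor, paints per second
    max_rate: float = 60.0        # assumed ceiling (a common display refresh rate)
    step: float = 5.0             # assumed linear increment per update
    fast_change_pct: float = 1.0  # assumed threshold: % of screen changed per paint
    paint_rate: float = 5.0       # current maximum painting rate

    def update(self, screen_change_pct: float) -> float:
        """Adapt paint_rate to how much the visible content changed."""
        if screen_change_pct >= self.fast_change_pct:
            # Content changes fast: increase linearly, capped at max_rate.
            self.paint_rate = min(self.max_rate, self.paint_rate + self.step)
        else:
            # Content changes slowly: fall back to the minimum value.
            self.paint_rate = self.min_rate
        return self.paint_rate

    def should_paint(self, now_s: float, last_paint_s: float) -> bool:
        """Allow a paint only if the current rate limit permits one."""
        return now_s - last_paint_s >= 1.0 / self.paint_rate
```

Paints requested while should_paint returns False would be aggregated into the next allowed paint.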

Application-Assisted Scheduling

Better utilize little cores on the big.LITTLE architecture
  Leverage the internals of applications for scheduling
Schedule threads according to QoS
  QoS requirement: frame painting time of the browser
Dynamic thread-to-core assignment
  Move threads to big cores when QoS is about to be violated
  Bring threads back to little cores when QoS is satisfied
[Figure: load-based scheduling (high load moves threads from little cores to big cores, low load moves them back) compared with QoS-based scheduling (QoS violated moves threads to big cores, QoS satisfied moves them back to little cores)]
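The QoS-based thread-to-core assignment can be sketched as a small decision routine. The CPU numbering of the two clusters, the frame budget, the headroom factor, and all names here are assumptions for illustration; the slides state only that the QoS requirement is the browser's frame painting time.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional

LITTLE_CORES = frozenset({0, 1, 2, 3})  # assumed CPU IDs of the little cluster
BIG_CORES = frozenset({4, 5, 6, 7})     # assumed CPU IDs of the big cluster


@dataclass
class QosScheduler:
    """Hypothetical scheduler that pins a thread by frame painting time."""

    frame_budget_ms: float = 16.7  # assumed QoS target: one frame at 60 Hz
    headroom: float = 0.8          # assumed: 80% of the budget means "about to be violated"
    set_affinity: Optional[Callable[[int, FrozenSet[int]], None]] = None
    current: FrozenSet[int] = LITTLE_CORES

    def on_frame_painted(self, tid: int, paint_time_ms: float) -> FrozenSet[int]:
        """Choose the cluster for thread tid after each painted frame."""
        about_to_violate = paint_time_ms >= self.headroom * self.frame_budget_ms
        target = BIG_CORES if about_to_violate else LITTLE_CORES
        if target != self.current:
            # Migrate only on a change to avoid redundant affinity calls.
            if self.set_affinity is not None:
                self.set_affinity(tid, target)
            self.current = target
        return self.current
```

On Linux, os.sched_setaffinity could be passed as set_affinity; the sketch keeps it injectable so the decision logic can be exercised without touching real CPU affinity.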

Implementation

Prototype based on Chromium version 38 (16 million lines of code)
  Buffered resource handler: Network-aware Resource Processing
  VSync monitor: Adaptive Content Painting
  Thread management module: Application-Assisted Scheduling
[Figure: instrumented Chromium architecture. Components shown: the Internet, Disk Cache, Browser Process, Browser Main, IO thread, Network Stack, Resource Handlers, Resource Dispatcher Host, Renderer Process, Child IO Thread, Renderer Main, Resource Dispatcher, GPU Thread, VSync Monitor, Shared Resource Buffer, Async Transfer Thread, Command and Texture Buffer, Compositor, Raster Worker, Rendering Engine, JavaScript Engine. Legend: data and control flow, process boundary, instrumented module]

Evaluation

Experiment setup

Emulated testbed: repeatable experimentation
  Common 3G network condition: 2 Mbps download, 1 Mbps upload bandwidth, 120 ms RTT
  Web Page Replay tool: record and replay pages
Data set
  Top 100 websites in the U.S. by Alexa.com in May 2014
Devices
  S5-E: Galaxy S5 Exynos (big.LITTLE processor)
  S5-S: Galaxy S5 Snapdragon (symmetric processor)
Metric: page load time (W3C Navigation Timing specification)
Automation tool
  Two modules: one on the smartphone and one on a PC controlling a Monsoon power monitor, time synchronized
  Each configuration and website tested at least 5 times

Video demo: facebook.com
  cps.kaist.ac.kr/eBrowser
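The two measured quantities, page load time and system energy, can be derived from raw data roughly as sketched below. The slides do not say which Navigation Timing events delimit page load time or how the power trace was integrated, so the choice of navigationStart and loadEventEnd, the trapezoidal integration, and all function names are assumptions.

```python
from typing import Mapping, Sequence, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, power in watts)


def page_load_time_ms(timing: Mapping[str, float]) -> float:
    """Assumed definition: loadEventEnd minus navigationStart (Navigation Timing attributes)."""
    return timing["loadEventEnd"] - timing["navigationStart"]


def energy_joules(samples: Sequence[Sample], start_s: float, end_s: float) -> float:
    """Trapezoidal integral of power over the samples that fall within [start_s, end_s]."""
    window = [(t, p) for t, p in samples if start_s <= t <= end_s]
    return sum(
        (t1 - t0) * (p0 + p1) / 2.0
        for (t0, p0), (t1, p1) in zip(window, window[1:])
    )


def saving_percent(baseline: float, ours: float) -> float:
    """Relative saving of ours against the baseline, in percent."""
    return 100.0 * (baseline - ours) / baseline
```

Because the power monitor runs on the PC and the timing events are recorded on the phone, start_s and end_s are only meaningful once the two clocks are synchronized, as the setup above notes.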

Effectiveness of all techniques

Galaxy S5 Exynos (big.LITTLE architecture)
  24.4% system energy saving, including the LCD screen
  Page load time decreased by 0.38% (29 ms)
Galaxy S5 Snapdragon (symmetric processor)
  11.7% system energy saving (without the Application-Assisted Scheduling technique)
  Page load time increased by only 0.01% (6.7 ms)
[Figure: CDFs of average energy saving (%) and average page load time (PLT) increase (%) for S5-E and S5-S]

Effectiveness of each technique

Energy saving
  Application-Assisted Scheduling (AAS): most effective
  Network-aware Resource Processing (NRP) and Adaptive Content Painting (ACP): similar effectiveness
Page load time increase of each individual technique is small
  Maximum 0.76% average increase (NRP) on Galaxy S5 Snapdragon
[Figure: CDFs of average energy saving (%) and average PLT increase (%) on S5-E for each technique]

User perceived experience

User study: 18 users compare our browser with the default browser
  Test 1: observe the loading speed and smoothness of 10 random websites
  Test 2: do real web browsing for 5 minutes
Results: minimal difference between our browser and the default browser
  All users want to use our revised browser

  72% of users would always use our browser; 28% would use it when the battery is low
[Figure: user experience ratings on a Worse/Same/Better scale; values shown: 0.04, -0.02, -0.12, -0.18]

Case study: All techniques

Significant reduction of power consumption
  E.g., system power: 5.3 W (default) vs. 1.4 W (ours)
[Figure: power consumption (W) over time (sec) while loading infusionsoft.com, default vs. ours]

Case study: AAS technique

Significant increase in the utilization of little cores
  E.g., 25% (default) vs. 60% (Application-Assisted Scheduling)
[Figure: core type utilization (%) over time while loading infusionsoft.com]

Case study: NRP and ACP techniques

Significant reduction of thread execution time
  E.g., Chrome_ChildIO thread execution time reduced by 65% (NRP only)
[Figure: execution time (sec) of web browser threads while loading infusionsoft.com, default vs. NRP]

Evaluations on other environments and browser

Evaluation                                                    Avg. system energy saving (%)   Avg. page load time increase (%)
Fast network (20 Mbps download, 10 Mbps upload, 50 ms RTT)    21.8                            0.1
3G network (3G network interface used)                        22.5                            0.4
Web page loading with cached content                          19.6                            -1.7
Firefox web browser                                           10.5                            1.7

Significant system energy saving without page load time increase
Applicable to other web browsers
Speed Index metric increased only slightly (1.8% on average)
(The numbers above are for the Galaxy S5 Exynos, big.LITTLE)

Related work

Energy saving for mobile web browsers
  Chameleon [MobiSys11]: changes colors to save energy on OLED screens
  Thiagarajan et al. [WWW12]: measure energy and provide guidelines (e.g., avoid complex JavaScript)
  Zhu et al. [HPCA13]: use statistical inference models
  Limitations: ignored JavaScript and dynamic content
Our work
  Deals with trade-offs inside web browsers
    Others focus on the characteristics of web pages (primitives, colors, network accesses)
  Orthogonal to other approaches (e.g., changing colors)
    Ours can be integrated with others to further improve energy efficiency
  Tested on real-world websites and smartphones

Conclusion

Identify energy inefficiency issues in mobile web browsers
Propose energy saving techniques
  1. Network-aware Resource Processing
  2. Adaptive Content Painting
  3. Application-Assisted Scheduling
Implement on popular mobile web browsers (Chromium and Firefox for Android) on commercial smartphones (Samsung Galaxy S5 phones)
Evaluate on the top 100 U.S. websites: save significant system energy while not increasing page load time
  24.4% system energy saving while decreasing page load time by 0.38% on a big.LITTLE phone

Case study: AAS technique

Significant increase of little core utilization on the big.LITTLE architecture
  E.g., 25% (default) vs. 60% (Application-Assisted Scheduling)
Decrease of load on big cores
[Figure: core type utilization (%) of little cores and big cores over time while loading infusionsoft.com, default vs. ours]
