Scalable Parallel Computing on Clouds

SCALABLE PARALLEL COMPUTING ON CLOUDS: EFFICIENT AND SCALABLE ARCHITECTURES TO PERFORM PLEASINGLY PARALLEL, MAPREDUCE AND ITERATIVE DATA INTENSIVE COMPUTATIONS ON CLOUD ENVIRONMENTS

Thilina Gunarathne

Figure 1: A sample MapReduce execution flow

Figure 2: Steps of a typical MapReduce computation [diagram labels: Map Task (task scheduling, data read, map execution, collect, spill, merge); Reduce Task (shuffle, merge, reduce execution, write output)]

Figure 3: Structure of a typical data-intensive iterative application

Figure 4: Multi-Dimensional Scaling SMACOF application architecture using iterative MapReduce [diagram labels: BC: Calculate BX (Map, Reduce, Merge); X: Calculate invV (BX) (Map, Reduce, Merge); Calculate Stress (Map, Reduce, Merge); Optional Step; New Iteration]

Figure 5: Bio sequence analysis pipeline [14]

Figure 6: Classic cloud processing architecture for pleasingly parallel computations

Figure 7: Hadoop MapReduce based processing model for pleasingly parallel computations

Figure 8: Cap3 application execution cost with different EC2 instance types

Figure 9: Cap3 application compute time with different EC2 instance types [y-axis: Compute Time (s)]

Figure 10: Parallel efficiency of Cap3 application using the pleasingly parallel frameworks

Figure 11: Cap3 execution time for single file per core using the pleasingly parallel frameworks

Figure 12: Cost to process 64 BLAST query files on different EC2 instance types

Figure 13: Time to process 64 BLAST query files on different EC2 instance types [y-axis: Compute Time (s)]

Figure 14: Time to process 8 query files using BLAST application on different Azure instance types

Figure 15: BLAST parallel efficiency using the pleasingly parallel frameworks

Figure 16: BLAST average time to process a single query file using the pleasingly parallel frameworks

Figure 17: Cost of using GTM interpolation application with different EC2 instance types

Figure 18: GTM Interpolation compute time with different EC2 instance types [y-axis: Compute Time (s)]

Figure 19: GTM Interpolation parallel efficiency using the pleasingly parallel frameworks

Figure 20: GTM Interpolation performance per core using the pleasingly parallel frameworks

Figure 21: MapReduceRoles4Azure: Architecture for implementing MapReduce frameworks on Cloud environments using cloud infrastructure services

Figure 22: Task decomposition mechanism of SWG pairwise distance calculation MapReduce application

Figure 23: SWG MapReduce pure performance

Figure 24: SWG MapReduce relative parallel efficiency

Figure 25: SWG MapReduce normalized performance

Figure 26: SWG MapReduce amortized cost for clouds

Figure 27: Cap3 MapReduce scaling performance

Figure 28: Cap3 MapReduce parallel efficiency

Figure 29: Cap3 MapReduce computational cost in cloud infrastructures

Figure 30: Twister4Azure iterative MapReduce programming model [diagram labels: Job Start; Map, Combine, Reduce, Merge; Add Iteration? (Yes/No); Broadcast; Data Cache; Hybrid scheduling of the new iteration; Job Finish]

Figure 31: Cache Aware Hybrid Scheduling

Figure 32: Twister4Azure tree based broadcast over TCP with Azure Blob storage as the persistent backup. [diagram labels: Blob Storage; Workers]

Figure 33: MDS weak scaling. Workload per core is constant. Ideal is a straight horizontal line.

Figure 34: MDS data size scaling using 128 Azure small instances/cores, 20 iterations

Figure 35: Twister4Azure Map Task histogram for MDS of 204800 data points on 32 Azure Large Instances (only 10 of the 20 iterations are graphed). Two adjoining bars represent an iteration (2048 tasks per iteration), where each bar represents a different application inside the iteration.

Figure 36: Number of executing Map Tasks in the cluster at a given moment. Two adjoining bars represent an iteration.

Figure 37: KMeans Clustering scalability. Relative parallel efficiency of strong scaling using 128 million data points.

Figure 38: KMeans Clustering scalability. Weak scaling. Workload per core is kept constant (ideal is a straight horizontal line).

Figure 39: Twister4Azure Map Task execution time histogram for KMeans Clustering 128 million data points on 128 Azure small instances.

Figure 40: Twister4Azure number of executing Map Tasks in the cluster at a given moment

Figure 41: Performance of SW-G for randomly distributed inhomogeneous data with 400 mean sequence length.

Figure 42: Performance of SW-G for skewed distributed inhomogeneous data with 400 mean sequence length

Figure 43: Performance of Cap3 for randomly distributed inhomogeneous data.

Figure 44: Performance of Cap3 for skewed distributed inhomogeneous data

Figure 45: Virtualization overhead of Hadoop SW-G on Xen virtual machines

Figure 46: Virtualization overhead of Hadoop Cap3 on Xen virtual machines

Figure 47: Sustained performance of cloud environments for MapReduce type of applications

Figure 48: Execution traces of Twister4Azure MDS using in-memory caching on small instances. (The taller bars represent the MDSBCCalc computation, while the shorter bars represent the MDSStressCalc computation; together they represent an iteration.)

Figure 49: Execution traces of Twister4Azure MDS using Memory-Mapped file based caching on Large instances.

Figure 50: MapReduce-MergeBroadcast computation flow [diagram labels: Map, Combine, Shuffle, Sort, Reduce, Merge, Broadcast]

Figure 51: Map-Collective primitives

Figure 52: Map-AllGather collective

Figure 53: Map-AllReduce collective

Figure 54: Example Map-AllReduce with Sum operation

Figure 55: MDS Hadoop using only the BC Calculation MapReduce job per iteration to highlight the overhead. 20 iterations, 51200 data points

Figure 56: MDS application implemented using Twister4Azure. 20 iterations. 51200 data points (~5GB).

Figure 57: Hadoop MapReduce MDS-BCCalc histogram

Figure 58: H-Collectives AllGather MDS-BCCalc histogram

Figure 59: H-Collectives AllGather MDS-BCCalc histogram without speculative scheduling

Figure 60: Hadoop K-means Clustering comparison with H-Collectives Map-AllReduce. Weak scaling. 500 Centroids (clusters). 20 Dimensions. 10 iterations.

Figure 61: Hadoop K-means Clustering comparison with H-Collectives Map-AllReduce. Strong scaling. 500 Centroids (clusters). 20 Dimensions. 10 iterations.

Figure 62: Twister4Azure K-means weak scaling with Map-AllReduce. 500 Centroids, 20 Dimensions, 10 iterations.

Figure 63: Twister4Azure K-means Clustering strong scaling. 500 Centroids, 20 Dimensions, 10 iterations.

Figure 64: HDInsight KMeans Clustering compared with Twister4Azure and Hadoop [series: Hadoop AllReduce, Hadoop MapReduce, Twister4Azure AllReduce, Twister4Azure Broadcast, Twister4Azure, HDInsight (AzureHadoop); x-axis: Num. Cores x Num. Data Points (32 x 32 M, 64 x 64 M, 128 x 128 M, 256 x 256 M); y-axis: Time (s)]
