Key Technologies of Cloud Computing
Zheng Weiping (郑伟平), 2011-07-26

Virtualization: Outline
1. Definition of virtualization
2. Classification of virtualization
3. Full virtualization vs. paravirtualization
4. Implementing virtualization
5. Comparing and selecting virtualization technologies
6. Benefits of virtualization
7. Problems introduced by virtualization
8. Where virtualization applies
9. The server virtualization process

MapReduce
MapReduce is a simple, easy-to-use parallel programming model that greatly simplifies the implementation of large-scale data processing.

Divide and Conquer
[Figure: a body of "Work" is partitioned into units w1, w2, w3; a "worker" handles each unit and produces partial results r1, r2, r3, which are combined into the "Result". The two key steps are Partition and Combine.]
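The partition/combine pattern in the figure can be made concrete in a few lines. The sketch below is ours, not the slides': it splits a list of numbers into chunks, sums each chunk in a separate worker thread, and combines the partial results. The chunk count and thread pool size are illustrative choices.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(work, n_chunks):
    """Split 'work' into roughly equal chunks, one per worker."""
    size = (len(work) + n_chunks - 1) // n_chunks
    return [work[i:i + size] for i in range(0, len(work), size)]

def worker(chunk):
    """Each worker computes a partial result r_i from its work unit w_i."""
    return sum(chunk)

work = list(range(1_000_000))
chunks = partition(work, n_chunks=3)           # "Partition" step: w1, w2, w3
with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(worker, chunks))  # workers produce r1, r2, r3
result = sum(partials)                         # "Combine" step
assert result == sum(work)
```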
Parallelization Challenges
- How do we assign work units to workers?
- What if we have more work units than workers?
- What if workers need to share partial results?
- How do we aggregate partial results?
- How do we know all the workers have finished?
- What if workers die?
What is the common theme of all of these problems?
Common Theme?
- Parallelization problems arise from:
  - Communication between workers (e.g., to exchange state)
  - Access to shared resources (e.g., data)
- Thus, we need a synchronization mechanism

Managing Multiple Workers
- Difficult because:
  - We don't know the order in which workers run
  - We don't know when workers interrupt each other
  - We don't know the order in which workers access shared data
- Thus, we need:
  - Semaphores (lock, unlock)
  - Condition variables (wait, notify, broadcast)
  - Barriers
- Still, lots of problems: deadlock, livelock, race conditions...
  - Dining philosophers, sleeping barbers, cigarette smokers...
- Moral of the story: be careful!

Current Tools
- Programming models: shared memory (pthreads), message passing (MPI)
- Design patterns: master-slaves, producer-consumer flows, shared work queues
[Figure: message passing among processes P1-P5 vs. processes P1-P5 sharing one memory; diagrams of the master-slaves, producer-consumer, and work-queue patterns.]
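The primitives named above (locks, condition variables, barriers) exist in most threading libraries. Here is a toy shared-memory sketch, ours rather than the slides', using Python's threading module in place of pthreads:

```python
import threading

counter = 0
lock = threading.Lock()         # mutual exclusion, like lock/unlock above
barrier = threading.Barrier(4)  # nobody proceeds until all 4 workers arrive

def worker():
    global counter
    for _ in range(100_000):
        with lock:              # without the lock, this read-modify-write races
            counter += 1
    barrier.wait()              # synchronization point at the end of the phase

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # always 400000 with the lock; unpredictable without it
```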
But Now: MapReduce!
MapReduce: a parallel/distributed computing programming model
[Figure: input -> split -> map -> shuffle -> reduce -> output]

Typical Problem Solved by MapReduce
- Read in the data: records in key/value format
- Map: extract something from each record
  - map(in_key, in_value) -> list(out_key, intermediate_value)
  - Processes one input key/value pair
  - Emits intermediate key/value pairs
- Shuffle: exchange and regroup the data
  - Intermediate results with the same key are collected on the same node
- Reduce: aggregate, summarize, filter, etc.
  - reduce(out_key, list(intermediate_value)) -> list(out_value)
  - Merges all the values for one key and computes over them
  - Emits the combined result (usually just one value)
- Write out the results

MapReduce Framework
[Figure: the MapReduce framework, showing map tasks feeding reduce tasks through the shuffle.]

Shuffle Implementation
- Partition, then sort/group
- Partition function: hash(key) % number_of_reducers
- Group function: sort by key
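These stages can be shown end to end in a few lines. The single-process word-count sketch below is illustrative, not from the slides; it uses hash(key) % R as the partition function and sorts keys within each partition, exactly as described above:

```python
from collections import defaultdict

R = 3  # number of reducers (illustrative)

def map_fn(in_key, in_value):
    """map(in_key, in_value) -> list(out_key, intermediate_value)"""
    return [(word, 1) for word in in_value.split()]

def reduce_fn(out_key, values):
    """reduce(out_key, list(intermediate_value)) -> list(out_value)"""
    return [sum(values)]

documents = {"doc1": "the quick brown fox", "doc2": "the lazy dog the end"}

# Map phase: every input record produces intermediate key/value pairs.
intermediate = []
for key, value in documents.items():
    intermediate.extend(map_fn(key, value))

# Shuffle phase: partition by hash(key) % R, then group values by key.
partitions = [defaultdict(list) for _ in range(R)]
for key, value in intermediate:
    partitions[hash(key) % R][key].append(value)

# Reduce phase: each reducer merges all the values for each of its keys.
for r, partition in enumerate(partitions):
    for key in sorted(partition):  # group function: sort by key
        print(r, key, reduce_fn(key, partition[key]))
```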
Example Uses
- Distributed grep, distributed sort, web link-graph reversal
- Term vector per host, web access log stats
- Inverted index construction, document clustering
- Machine learning, statistical machine translation, ...

The Model Is Widely Applicable
[Figure: the number of MapReduce programs in the Google source tree over time.]
Google MapReduce Architecture
- A single master node
- Many worker "bees"

MapReduce Operation
[Figure: the initial data is split into 64 MB blocks; map results are computed and stored locally; the master sends the data locations to the reduce workers; the final output is written and the master is informed of the result locations.]

Execution Overview
1. Input files are split into M pieces (16 to 64 MB), and many worker copies of the program are forked.
2. One special copy, the master, assigns map and reduce tasks to idle slave workers.
3. Map workers read input splits, parse (key, value) pairs, apply the map function, and create buffered output pairs.
4. Buffered output pairs are periodically written to local disk, partitioned into R regions; the locations of the regions are passed back to the master.
5. The master notifies reduce workers about the locations. A reduce worker uses remote procedure calls to read the data from the local disks of the map workers, then sorts it by intermediate key to group records with the same key together.
6. The reduce worker passes each key, plus the corresponding set of all intermediate values, to the reduce function; the output of the reduce function is appended to the final output file.
7. When all map and reduce tasks are completed, the master wakes up the user program, which resumes the user code.
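Step 4 is easy to picture in code. The sketch below is illustrative only (the file-naming scheme is invented): it partitions one map worker's buffered output into R region files on local disk and returns their locations for reporting back to the master.

```python
import os

R = 4  # number of reduce tasks (illustrative)

def flush_buffered_output(map_task_id, buffered_pairs, outdir="."):
    """Write buffered (key, value) pairs into R region files on local disk,
    one region per reduce task, and return their locations for the master."""
    regions = [[] for _ in range(R)]
    for key, value in buffered_pairs:
        regions[hash(key) % R].append((key, value))  # same partition function as the shuffle
    locations = []
    for r, pairs in enumerate(regions):
        path = os.path.join(outdir, f"map-{map_task_id}-region-{r}.txt")  # hypothetical naming
        with open(path, "w") as f:
            for key, value in pairs:
                f.write(f"{key}\t{value}\n")
        locations.append(path)
    return locations

print(flush_buffered_output("m1", [("the", 1), ("fox", 1), ("the", 1)]))
```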
Fault Tolerance: Workers
- The master keeps several data structures: for each map and reduce task it stores the state (idle, in progress, or completed) and, for non-idle tasks, the identity of the worker machine.
- The master pings workers periodically; if a worker does not respond, it is marked as failed.
- Completed map tasks on a failed worker are reset to the idle state so they can be restarted, because their results (local to the failed worker) are lost.
- Completed reduce tasks do not need to be restarted (their output is stored in the global file system). Reduce tasks are notified of the new map tasks so they can read the not-yet-read data from the new locations.
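This bookkeeping fits in a small table. The sketch below is ours (the structures are invented for illustration): a worker whose ping has timed out is treated as failed, its map tasks go back to idle even if completed, and completed reduce tasks are left alone.

```python
import time

tasks = {
    # task id -> {"kind": "map"|"reduce", "state": idle/in-progress/completed, "worker": id}
    "m1": {"kind": "map", "state": "completed", "worker": "w1"},
    "m2": {"kind": "map", "state": "in-progress", "worker": "w1"},
    "r1": {"kind": "reduce", "state": "completed", "worker": "w2"},
}
last_pong = {"w1": time.time() - 120, "w2": time.time()}  # last ping reply per worker
PING_TIMEOUT = 60.0  # seconds without a reply before a worker is marked failed

def check_workers():
    now = time.time()
    for worker, seen in last_pong.items():
        if now - seen <= PING_TIMEOUT:
            continue  # worker is alive
        for task in tasks.values():
            if task["worker"] != worker:
                continue
            if task["kind"] == "map":
                # Map output lived on the failed worker's local disk, so redo it.
                task["state"], task["worker"] = "idle", None
            elif task["state"] != "completed":
                task["state"], task["worker"] = "idle", None
            # Completed reduce tasks stay completed: output is in the global FS.

check_workers()
print(tasks)  # m1 and m2 are idle again; r1 stays completed
```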
Fault Tolerance: Master
- The master writes checkpoints.
- There is only one master, so there is less chance of failure.
- If the master fails, the whole MapReduce task aborts.
Refinement: Redundant Execution
- Slow workers significantly delay completion time
  - Other jobs consuming resources on the machine
  - Bad disks with soft errors transfer data slowly
- Solution: near the end of a phase, spawn backup tasks
  - Whichever copy finishes first wins
- Dramatically shortens job completion time
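One way to see the "first copy wins" idea is to race two copies of the same task and take whichever finishes first. This is a toy sketch, not the actual scheduler:

```python
import concurrent.futures
import random
import time

def task(copy_id):
    """The same work; it may be slow on a loaded or faulty machine."""
    time.sleep(random.uniform(0.1, 1.0))  # simulated straggler behavior
    return f"result from copy {copy_id}"

with concurrent.futures.ThreadPoolExecutor() as pool:
    # Near the end of the phase, spawn a backup copy of the in-flight task.
    futures = [pool.submit(task, "primary"), pool.submit(task, "backup")]
    done, not_done = concurrent.futures.wait(
        futures, return_when=concurrent.futures.FIRST_COMPLETED)
    print(done.pop().result())  # whichever copy finishes first wins
    for f in not_done:
        f.cancel()  # best effort; an already-running thread still runs out
```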
Refinement: Locality Optimization
- Master scheduling policy:
  - Asks GFS for the locations of the replicas of the input file blocks
  - Map tasks are typically split into 64 MB pieces (the GFS block size)
  - Map tasks are scheduled so that a GFS replica of the input block is on the same machine or the same rack
- Effect:
  - Thousands of machines read input at local disk speed
  - Without this, rack switches limit the read rate
Refinement: Skipping Bad Records
- Map/reduce functions sometimes fail for particular inputs
- The best solution is to debug & fix, but that is not always possible (e.g., third-party source libraries)
- On a segmentation fault:
  - Send a UDP packet to the master from the signal handler
  - Include the sequence number of the record being processed
- If the master sees two failures for the same record, the next worker is told to skip the record

Other Refinements
- Compression of intermediate data
- Combiner: "combiner" functions can run on the same machine as a mapper, causing a mini-reduce phase before the real reduce phase to save bandwidth (see the sketch after this list)
- Local execution for debugging/testing
- User-defined counters
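Building on the earlier word-count sketch (the function names are ours): a combiner is a map-side mini-reduce applied to one mapper's local output before anything crosses the network. For word count, it performs the same computation as the reducer.

```python
from collections import defaultdict

def combine(mapper_output):
    """Mini-reduce on one mapper's local output: merge values per key
    before the shuffle, so fewer pairs cross the network."""
    grouped = defaultdict(list)
    for key, value in mapper_output:
        grouped[key].append(value)
    # For word count, the combiner is the same function as the reducer: sum.
    return [(key, sum(values)) for key, values in grouped.items()]

local_pairs = [("the", 1), ("quick", 1), ("the", 1), ("the", 1)]
print(combine(local_pairs))  # [('the', 3), ('quick', 1)] -- 2 pairs instead of 4
```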
Hadoop MapReduce Architecture
- Master/worker model
- Load balancing by a polling mechanism

History of Hadoop
- 2004 - Initial versions of what is now the Hadoop Distributed File System and MapReduce implemented by Doug Cutting & Mike Cafarella
- December 2005 - Nutch ported to the new framework; Hadoop runs reliably on 20 nodes
- January 2006 - Doug Cutting joins Yahoo!
- February 2006 - Apache Hadoop project officially started to support the standalone development of MapReduce and HDFS
- March 2006 - Formation of the Yahoo! Hadoop team
- May 2006 - Yahoo! sets up a Hadoop research cluster - 300 nodes
- April 2006 - Sort benchmark run on 188 nodes in 47.9 hours
- May 2006 - Sort benchmark run on 500 nodes in 42 hours (better hardware than the April benchmark)
- October 2006 - Research cluster reaches 600 nodes
- December 2006 - Sort times: 20 nodes in 1.8 hrs, 100 nodes in 3.3 hrs, 500 nodes in 5.2 hrs, 900 nodes in 7.8 hrs
- January 2007 - Research cluster reaches 900 nodes
- April 2007 - Research clusters - 2 clusters of 1000 nodes
- September 2008 - Scaling Hadoop to 4,000 nodes at Yahoo!
- April 2009 - Release 0.20.0: many improvements, new features, bug fixes, and optimizations

Distributed File Systems
- Characteristics and basic requirements
- Caching
- Fault tolerance and scalability
Characteristics and Basic Requirements of Distributed File Systems
- Characteristics:
  - A distributed file system presents the file system resources of an entire network as a single logical tree: users can ignore a file's actual physical location and locate and access shared network resources purely through logical relationships.
  - Users can access files spread across multiple servers in the network as if they were local files.
  - The clients, servers, and storage devices of a distributed file system are dispersed across machines, so service activity must be carried out over the network.
  - The storage is not a single centralized data store.
  - Concrete configurations and implementations can differ greatly: some servers run on dedicated server machines, while some machines act as both server and client.
- Basic requirements:
  - Transparency
    - Location transparency: the multiplicity and dispersion of servers and storage devices are invisible to clients.
    - Migration transparency: users are unaware of resources being moved.
    - Performance transparency: client programs maintain satisfactory performance while the service load varies within a certain range.
    - Scaling transparency: the file service can be expanded to meet growth in load and network size.
  - Performance: a distributed file system should offer performance and reliability comparable to (and sometimes better than) a conventional file system.
  - Fault tolerance: to handle transient communication errors, the fault-tolerant design can be based on at-most-once semantics; a stateless server needs no recovery when it crashes and restarts.
  - Security: authentication, access control, and secure channels.
  - Efficiency: it should provide performance and reliability equal to or better than a traditional file system.

Caching in Distributed File Systems
- Designing a caching scheme requires deciding:
  - The unit (granularity) of caching
  - Where to store files or parts of files
  - How to determine whether the data in each client's cache is consistent
- Granularity and location of the cache:
  - Granularity: the larger the data unit, the higher the chance that the data for the next access is found locally at the client, but transfer time and consistency problems also grow; if the granularity is too small, communication overhead increases instead.
  - Location: in a client-server system where each side has main memory and a disk, there are four places to store files or parts of files: the server's disk, the server's main memory, the client's disk (if available), or the client's main memory.
- Update policy, cache validation, and consistency:
  - To decide whether a locally cached copy is consistent with the master copy, there are two basic validation approaches:
    - Client-initiated: the client contacts the server and checks whether its local data is consistent with the master copy.
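A client-initiated check is easy to sketch. The example below is ours, with invented class and method names; version numbers stand in for whatever freshness token the server actually uses (for example, modification timestamps):

```python
class Server:
    """Holds the master copies; the version bumps on every update."""
    def __init__(self):
        self.files = {"/a.txt": ("hello", 1)}  # path -> (data, version)

    def version_of(self, path):
        return self.files[path][1]

    def read(self, path):
        return self.files[path]

class CachingClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}  # path -> (data, version)

    def read(self, path):
        # Client-initiated validation: ask the server whether our copy is stale.
        if path in self.cache:
            data, version = self.cache[path]
            if version == self.server.version_of(path):
                return data  # cache hit, still consistent with the master copy
        data, version = self.server.read(path)  # miss or stale: refetch
        self.cache[path] = (data, version)
        return data

server = Server()
client = CachingClient(server)
print(client.read("/a.txt"))  # fetched from the server, then cached
server.files["/a.txt"] = ("hello, world", 2)
print(client.read("/a.txt"))  # validation sees version 2 and refetches
```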