The Google File System

Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung
Google

ABSTRACT

We have designed and implemented the Google File System, a scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients.

While sharing many of the same goals as previous distributed file systems, our design has been driven by observations of our application workloads and technological environment, both current and anticipated, that reflect a marked departure from some earlier file system assumptions. This has led us to reexamine traditional choices and explore radically different design points.

The file system has successfully met our storage needs. It is widely deployed within Google as the storage platform for the generation and processing of data used by our service as well as research and development efforts that require large data sets. The largest cluster to date provides hundreds of terabytes of storage across thousands of disks on over a thousand machines, and it is concurrently accessed by hundreds of clients.

In this paper, we present file system interface extensions designed to support distributed applications, discuss many aspects of our design, and report measurements from both micro-benchmarks and real world use.

1. INTRODUCTION

We have designed and implemented the Google File System (GFS) to meet the rapidly growing demands of Google's data processing needs. GFS shares many of the same goals as previous distributed file systems such as performance, scalability, reliability, and availability. However, its design has been driven by key observations of our application workloads and technological environment, both current and anticipated, that reflect a marked departure from some earlier file system design assumptions. We have reexamined traditional choices and explored radically different points in the design space.

First, component failures are the norm rather than the exception. The file system consists of hundreds or even thousands of storage machines built from inexpensive commodity parts and is accessed by a comparable number of client machines. The quantity and quality of the components virtually guarantee that some are not functional at any given time and some will not recover from their current failures. We have seen problems caused by application bugs, operating system bugs, human errors, and the failures of disks, memory, connectors, networking, and power supplies. Therefore, constant monitoring, error detection, fault tolerance, and automatic recovery must be integral to the system.
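The claim that some components are always down follows from simple arithmetic, which the short sketch below illustrates. The per-machine availability of 99.9% and the assumption that machines fail independently are illustrative choices, not figures from the paper.

```python
def prob_all_up(n_machines: int, availability: float) -> float:
    """Probability that all n machines are up at the same moment,
    assuming (hypothetically) that they fail independently."""
    return availability ** n_machines


def expected_down(n_machines: int, availability: float) -> float:
    """Expected number of machines that are down at any given moment."""
    return n_machines * (1.0 - availability)


if __name__ == "__main__":
    # 0.999 is an assumed per-machine availability, chosen only for illustration.
    for n in (10, 100, 1000):
        print(n, round(prob_all_up(n, 0.999), 3), round(expected_down(n, 0.999), 3))
```

Under these assumed numbers, a cluster of a thousand machines is fully healthy well under half the time, which is why recovery has to be routine rather than exceptional.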

Second, files are huge by traditional standards. Multi-GB files are common. Each file typically contains many application objects such as web documents. When we are regularly working with fast growing data sets of many TBs comprising billions of objects, it is unwieldy to manage billions of approximately KB-sized files even when the file system could support it. As a result, design assumptions and parameters such as I/O operation and block sizes have to be revisited.

Third, most files are mutated by appending new data rather than overwriting existing data. Random writes within a file are practically non-existent. Once written, the files are only read, and often only sequentially. A variety of data share these characteristics. Some may constitute large repositories that data analysis programs scan through. Some may be data streams continuously generated by running applications. Some may be archival data. Some may be intermediate results produced on one machine and processed on another, whether simultaneously or later in time. Given this access pattern on huge files, appending becomes the focus of performance optimization and atomicity guarantees, while caching data blocks in the client loses its appeal.
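To make the second observation concrete, the sketch below counts how many files a data set needs when each object is stored as its own small file versus packed into large files. The 10 TB data set, 10 KB object size, and 1 GB file size are assumed values picked for illustration; they are not measurements from the paper.

```python
KB = 1024
GB = 1024 ** 3
TB = 1024 ** 4


def file_count(total_bytes: int, file_bytes: int) -> int:
    """Number of files needed to hold total_bytes (ceiling division)."""
    return -(-total_bytes // file_bytes)


if __name__ == "__main__":
    dataset = 10 * TB                      # assumed data set size
    print(file_count(dataset, 10 * KB))    # one small file per object
    print(file_count(dataset, 1 * GB))     # objects packed into large files
```

With these assumed sizes the per-object layout needs roughly a billion files, while the packed layout needs about ten thousand, so the amount of per-file bookkeeping differs by five orders of magnitude.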

Fourth, co-designing the applications and the file system API benefits the overall system by increasing our flexibility. For example, we have relaxed GFS's consistency model to vastly simplify the file system without imposing an onerous burden on the applications. We have also introduced an atomic append operation so that multiple clients can append concurrently to a file without extra synchronization between them. These will be discussed in more detail later in the paper.

Multiple GFS clusters are currently deployed for different purposes. The largest ones have over 1000 storage nodes, over 300 TB of disk storage, and are heavily accessed by hundreds of clients on distinct machines on a continuous basis.

2. DESIGN OVERVIEW

2.1 Assumptions

In designing a file system for our needs, we have been guided by assumptions that offer both challenges and opportunities. We alluded to some key observations earlier and now lay out our assumptions in more detail.

• The system is built from many inexpensive commodity components that often fail. It must constantly monitor itself and detect, tolerate, and recover promptly from component failures on a routine basis.

• The system stores a modest number of large files. We expect a few million files, each typically 100 MB or larger in size. Multi-GB files are the common case and should be managed efficiently. Small files must be supported, but we need not optimize for them.

• The workloads primarily consist of two kinds of reads: large streaming reads and small random reads. In large streaming reads, individual operations typically read hundreds of KBs, more commonly 1 MB or more. Successive operations from the same client often read through a contiguous region of a file. A small random read typically reads a few KBs at some arbitrary offset. Performance-conscious applications often batch and sort their small reads to advance steadily through the file rather than go back and forth.

• The workloads also have many large, sequential writes that append data to files. Typical operation sizes are similar to those for reads. Once written, files are seldom modified again. Small writes at arbitrary positions in a file are supported but do not have to be efficient.

• The system must efficiently implement well-defined semantics for multiple clients that concurrently append to the same file. Our files are often used as producer-consumer queues or for many-way merging. Hundreds of producers, running one per machine, will concurrently append to a file. Atomicity with minimal synchronization overhead is essential. The file may be read later, or a consumer may be reading through the file simultaneously.

• High sustained bandwidth is more important than low latency. Most of our target applications place a premium on processing data in bulk at a high rate, while few have stringent response time requirements for an individual read or write.

2.2 Interface

GFS provides a familiar file system interface, though it does not implement a standard API such as POSIX. Files are organized hierarchically in directories and identified by pathnames. We support the usual operations to create, delete, open, close, read, and write files.

Moreover, GFS has snapshot and record append operations. Snapshot creates a copy of a file or a directory tree at low cost. Record append allows multiple clients to append data to the same file concurrently while guaranteeing the atomicity of each individual client's append. It is useful for implementing multi-way merge results and producer-consumer queues that many clients can simultaneously append to without additional locking. We have found these types of files to be invaluable in building large distributed applications. Snapshot and record append are discussed further in Sections 3.4 and 3.3 respectively.
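The producer-consumer use of record append can be pictured with a small in-memory toy. This is a sketch of the semantics seen by callers, not of how GFS implements them: `ToyAppendFile`, `record_append`, and the returned record position are hypothetical names and choices made for this example, and the internal lock merely stands in for whatever serialization the file system performs on its side.

```python
import threading


class ToyAppendFile:
    """In-memory stand-in for a file that supports atomic record append."""

    def __init__(self):
        self._records = []
        # Serialization lives inside the "file system"; callers hold no lock.
        self._lock = threading.Lock()

    def record_append(self, data: bytes) -> int:
        """Append one record as a unit and return its position (assumed API)."""
        with self._lock:
            self._records.append(data)
            return len(self._records) - 1

    def read_all(self):
        with self._lock:
            return list(self._records)


def producer(f: ToyAppendFile, producer_id: int, n_records: int) -> None:
    for i in range(n_records):
        f.record_append(f"p{producer_id}-r{i}".encode())


def run_producers(n_producers: int = 8, n_records: int = 100):
    """Run several producers concurrently against one file and return its records."""
    f = ToyAppendFile()
    threads = [
        threading.Thread(target=producer, args=(f, p, n_records))
        for p in range(n_producers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return f.read_all()
```

Every record arrives whole and each producer's records keep their relative order, while the interleaving between producers is left to the file. That is the property a many-way merge or a queue consumer relies on.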

2.3 Architecture

A GFS cluster consists of a single master and multiple chunkservers and is accessed by multiple clients, as shown in Figure 1. Each of these is typically a commodity Linux machine running a user-level server process. It is easy to run both a chunkserver and a client on the same machine, as long as machine resources permit and the lower reliability caused by running possibly flaky application code is acceptable.

Files are divided into fixed-size chunks. Each chunk is identified by an immutable and globally unique 64-bit chunk handle assigned by the master at the time of chunk creation. Chunkservers store chunks on local disks as Linux files and read or write chunk data specified by a chunk handle and byte range. For reliability, each chunk is replicated on multiple chunkservers. By default, we store three replicas, though users can designate different replication levels for different regions of the file namespace.

The master maintains all file system metadata. This includes the namespace, access control information, the mapping from files to chunks, and the current locations of chunks. It also controls system-wide activities such as chunk lease management, garbage collection of orphaned chunks, and chunk migration between chunkservers. The master periodically communicates with each chunkserver in HeartBeat messages to give it instructions and collect its state.

GFS client code linked into each application implements the file system API and communicates with the master and chunkservers to read or write data on behalf of the application. Clients interact with the master for metadata operations, but all data-bearing communication goes directly to the chunkservers. We do not provide the POSIX API and therefore need not hook into the Linux vnode layer.
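The metadata the master keeps can be pictured as two maps, one from pathnames to chunk handles and one from chunk handles to replica locations. The sketch below is a toy under stated assumptions: `ToyMaster` and its methods are hypothetical names, handles come from a simple counter, replica placement just takes the first servers offered, and the `heartbeat` method assumes each chunkserver reports the handles it holds.

```python
import itertools
from dataclasses import dataclass, field

DEFAULT_REPLICAS = 3  # default replication level stated in the text


@dataclass
class ToyMaster:
    # pathname -> ordered list of chunk handles (the file-to-chunk mapping)
    files: dict = field(default_factory=dict)
    # chunk handle -> set of chunkserver names currently holding a replica
    locations: dict = field(default_factory=dict)
    _handles: itertools.count = field(default_factory=lambda: itertools.count(1))

    def create_chunk(self, path, chunkservers, replicas=DEFAULT_REPLICAS):
        """Assign a new 64-bit handle and record where its replicas live."""
        handle = next(self._handles) & 0xFFFFFFFFFFFFFFFF
        self.files.setdefault(path, []).append(handle)
        # Placement policy is an assumption: take the first `replicas` servers.
        self.locations[handle] = set(chunkservers[:replicas])
        return handle

    def lookup(self, path, chunk_index):
        """Return (chunk handle, replica locations) for one chunk of a file."""
        handle = self.files[path][chunk_index]
        return handle, sorted(self.locations[handle])

    def heartbeat(self, server, held_handles):
        """Refresh location info from a chunkserver's reported state (assumed)."""
        held = set(held_handles)
        for handle, servers in self.locations.items():
            if handle in held:
                servers.add(server)
            else:
                servers.discard(server)
```

A client would call something like `lookup("/logs/a", 0)` for metadata only and then fetch the bytes from one of the returned chunkservers, which keeps file data off the master entirely.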

Neither the client nor the chunkserver caches file data. Client caches offer little benefit because most applications stream through huge files or have working sets too large to be cached. Not having them simplifies the client and the overall system by eliminating cache coherence issues. (Clients do cache metadata, however.) Chunkservers need not cache file data because chunks are stored as local files and so Linux's buffer cache already keeps frequently accessed data in memory.

2.4 Single Master

Having a single master vastly simplifies our design and enables the master to make sophisticated chunk placement and replication decisions using global knowledge. However, we must minimize its involvement in reads and writes so that it does not become a bottleneck. Clients never read and write file data through the master. Instead, a client asks the master which chunkservers it should contact. It caches this information for a limited time and interacts with the chunkservers directly for many subsequent operations.

Let us explain the interactions for a simple read with reference to Figure 1. First, using the fixed chunk size, the client translates the file name and byte offset specified by the application into a chunk index within the file. Then, it sends the master a request containing the file name and chunk index. The master replies with the corresponding chunk handle and locations of the replicas. The client caches this information using the file name and chunk index as the key.
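The first step of the read, turning a byte offset into a chunk index, is plain integer arithmetic. The sketch below shows it along with a helper that splits a byte range at chunk boundaries. This excerpt does not state the chunk size, so 64 MB is used as an assumed value, and `chunk_index` and `split_read` are hypothetical helper names.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # assumed chunk size; not stated in this excerpt


def chunk_index(offset: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Index of the chunk that contains the given byte offset."""
    return offset // chunk_size


def split_read(offset: int, length: int, chunk_size: int = CHUNK_SIZE):
    """Split a byte range into (chunk index, offset within chunk, length) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        index = offset // chunk_size
        start_in_chunk = offset % chunk_size
        # Read up to the end of this chunk or the end of the range, whichever is first.
        n = min(chunk_size - start_in_chunk, end - offset)
        pieces.append((index, start_in_chunk, n))
        offset += n
    return pieces
```

Because the chunk size is fixed, the client can compute the index locally and only needs the master to map the (file name, chunk index) pair to a chunk handle and replica locations.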

The client then sends a request to one of the replicas, most likely the closest one. The request specifies the chunk handle and a byte range within that chunk. Further reads of the same chunk require no more client-master interaction until the cached information expires or the file is reopened. In fact, the client typically asks for multiple chunks in the same request and the master can also include the information for chunks immediately following those requested. This extra information sidesteps several future client-master interactions at practically no extra cost.
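The client-side caching described above can be sketched as a small expiring map keyed by file name and chunk index. Everything named here is hypothetical: `ChunkLocationCache`, `locate`, the `ask_master` callable, the 60-second lifetime, and the choice to request two following chunks are assumptions for illustration, and invalidation on file reopen is not modeled.

```python
import time


class ChunkLocationCache:
    """Expiring cache: (file name, chunk index) -> (chunk handle, replicas)."""

    def __init__(self, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._ttl = ttl_seconds   # assumed lifetime; the text says only "limited time"
        self._clock = clock
        self._entries = {}

    def put(self, filename, index, handle, replicas):
        expires = self._clock() + self._ttl
        self._entries[(filename, index)] = (handle, list(replicas), expires)

    def get(self, filename, index):
        entry = self._entries.get((filename, index))
        if entry is None:
            return None
        handle, replicas, expires = entry
        if self._clock() >= expires:
            del self._entries[(filename, index)]  # expired: force a master lookup
            return None
        return handle, replicas


def locate(cache, ask_master, filename, index, prefetch=2):
    """Return (handle, replicas), contacting the master only on a cache miss."""
    hit = cache.get(filename, index)
    if hit is not None:
        return hit
    # One request covers the wanted chunk plus the chunks that follow it.
    reply = ask_master(filename, range(index, index + 1 + prefetch))
    for i, (handle, replicas) in reply.items():
        cache.put(filename, i, handle, replicas)
    return cache.get(filename, index)
```

A sequential reader that walks chunk 0, 1, 2 in order would contact the master once under these assumptions, since the reply to the first lookup already covers the next two chunks.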
