高级操作系统高级操作系统 (34).pdf

资源ID：57971245 资源大小：3.37MB 全文页数：55页
资源格式： PDF 下载积分：8金币

快捷下载

会员登录下载

微信登录下载

三方登录下载：

微信扫一扫登录

下载资源需要8金币

邮箱/手机：
温馨提示：	快捷下载时，用户名和密码都是您填写的邮箱或者手机号，方便查询和重复下载（系统自动生成）。如填写123，账号就是123，密码也是123。
支付方式：
验证码：	换一换

账号：
密码：
验证码：	换一换
当日自动登录忘记密码？

友情提示

1、下载资料失败解决办法

2、PDF文件下载后，可能会被浏览器默认打开，此种情况可以点击浏览器菜单，保存网页到桌面，就可以正常下载了。

3、本站不支持迅雷下载，请使用电脑自带的IE浏览器，或者360浏览器、谷歌浏览器下载即可。

4、本站资源下载后的文档和图纸-无水印,预览文档经过压缩，下载后原文更清晰。

5、试题试卷类文档，如果标题没有明确说明有答案则都视为没有答案，请知晓。

网站客服

侵权投诉

高级操作系统高级操作系统 (34).pdf

第 12 讲：Scalable Synchronization on Shared-MemoryMultiprocessors-IIMultiprocessor Memory Model&Multiprocessor ProgrammingRecapCache Coherent in Multi-processorMSI 一致性协议MESI 一致性协议ref:Some info are fromPaul McKenney(IBM)Tom Hart(University of Toronto),Frans Kaashoek(MIT),Daniel J.Sorin”A Primer on Memory Consistency and Cache Coherence”,Fabian Giesen”Cache coherency primer”,Mingyu Gao(Tsinghua),Yubin Xia(SJTU)”The C/C+Memory Model:Overview and Formalization”,Mark Batty,etc.”C+11 Memory ConsistencyModel”,Sebastian Gerstenberg;”HOW UBISOFT MONTREAL DEVELOPS GAMES FOR MULTICORE Before&After C+11”,Jeff Preshing;etc.RecapMemory Consistency in Multi-processorSequential ConsistencyTotal Store Order ConsistencyAcquire/Release ConsistencyRelaxed/Weak Consistency.Multiprocessor Programming陈渝(清华大学)第 12 讲2020 年 5 月 9 日4/55.Ways to Achieve Synchronizes-WithC+11/17/20,Go,RUST,Java,陈渝(清华大学)第 12 讲2020 年 5 月 9 日5/55.Real Hardware/CompilerReal hardware doesnt run the code that you wrote.Real compiler doesnt produce the code that you wrote.陈渝(清华大学)第 12 讲2020 年 5 月 9 日6/55.Real Hardware/CompilerReal hardware doesnt run the code that you wrote.Real compiler doesnt produce the code that you wrote.http:/ 12 讲2020 年 5 月 9 日7/55.Real Hardware/CompilerReal hardware doesnt run the code that you wrote.Real compiler doesnt produce the code that you wrote.根据英特尔的规范:在本示例的末尾，r1 和 r2 都等于 0 是合法的陈渝(清华大学)第 12 讲2020 年 5 月 9 日8/55.Real Hardware/CompilerReal hardware doesnt run the code that you wrote.Real compiler doesnt produce the code that you wrote.陈渝(清华大学)第 12 讲2020 年 5 月 9 日9/55.Real Hardware/CompilerReal hardware doesnt run the code that you wrote.Real compiler doesnt produce the code that you wrote.https:/ 12 讲2020 年 5 月 9 日10/55.Real Hardware/CompilerReal hardware doesnt run the code that you wrote.Real compiler doesnt produce the code that you wrote.陈渝(清华大学)第 12 讲2020 年 5 月 9 日11/55.Real Hardware/CompilerReal hardware doesnt run the code that you wrote.Real compiler doesnt produce the code that you wrote.陈渝(清华大学)第 12 讲2020 年 5 月 9 日12/55.Real Hardware/CompilerReal hardware doesnt run the code that you wrote.Real compiler doesnt produce the code that you wrote.Dekkers Algorithm,g+-O2陈渝(清华大学)第 12 讲2020 年 5 月 9 日13/55.C+11 Memory ModelHistoryIn 2011,new versions of the ISO standards for C andC+,informally known as C11 and C+11,wereratified.These standards define a memory model for C/C+Support for this model has recently become availablein popular compilers(GCC 4.4,Intel C+13.0,MSVC11.0,Clang 3.1).陈渝(清华大学)第 12 讲2020 年 5 月 9 日14/55.C+11 Memory ModelWhy C+11 Memory Model在 C+11 之前，其实是没有定义内存模型的，我们所使用的都是一些处理器/编译器暴露的同步原语，比如 GCC 的内联汇编，内建函数之类的,有着不同的实现。C+11 在标准库中引入了 memory model 的意义在于在 High Level Language 层面实现对在多处理器中多线程共享内存的访问，实现跨编译器,OS 和硬件的差异性。陈渝(清华大学)第 12 讲2020 年 5 月 9 日15/55.C+11 Memory ModelHappens-beforeAn important fundamental concept inunderstanding the memory modelA guarantee that memory writes by one specificstatement are visible to another specific statement陈渝(清华大学)第 12 讲2020 年 5 月 9 日16/55.C+11 Memory ModelHappens-beforeint A=0;int B=0;void foo()A=B+1;/(1)B=1;/(2)陈渝(清华大学)第 12 讲2020 年 5 月 9 日17/55.C+11 Memory ModelHappens-before陈渝(清华大学)第 12 讲2020 年 5 月 9 日18/55.C+11 Memory ModelHappens-beforeint A,B;void foo()A=B+1;B=0;陈渝(清华大学)第 12 讲2020 年 5 月 9 日19/55.C+11 Memory ModelHappens-beforeGCC 4.6.1$gcc-S-masm=intel foo.c$cat foo.s.moveax,DWORD PTR _B(redo this at home.)addeax,1movDWORD PTR _A,eaxmovDWORD PTR _B,0.陈渝(清华大学)第 12 讲2020 年 5 月 9 日20/55.C+11 Memory ModelHappens-beforeGCC 4.6.1$gcc-O2-S-masm=intel foo.c$cat foo.s.moveax,DWORD PTR BmovDWORD PTR B,0addeax,1movDWORD PTR A,eax.陈渝(清华大学)第 12 讲2020 年 5 月 9 日21/55.C+11 Memory ModelHappens-beforeGCC 4.6.1int A,B;void foo()A=B+1;asm volatile(:memory);B=0;陈渝(清华大学)第 12 讲2020 年 5 月 9 日22/55.C+11 Memory ModelHappens-beforeGCC 4.6.1$gcc-O2-S-masm=intel foo.c$cat foo.s.moveax,DWORD PTR _Baddeax,1movDWORD PTR _A,eaxmovDWORD PTR _B,0.陈渝(清华大学)第 12 讲2020 年 5 月 9 日23/55.C+11 Memory ModelHappens-before/x86#define COMPILER_BARRIER()asm volatile(:memory)/PowerPC#define RELEASE_FENCE()asm volatile(lwsync:memory)=int Value;std:atomic IsPublished(0);void sendValue(int x)Value=x;/-reordering is prevented here!IsPublished.store(1,std:memory_order_release);陈渝(清华大学)第 12 讲2020 年 5 月 9 日24/55.C+11 Memory Modelstd:atomicx.load(memory order)x.store(T,memory order)Concurrent accesses on atomic locations do not race.The memory order argument specifies orderingconstraints between atomic and non-atomic memoryaccesses in different threads.陈渝(清华大学)第 12 讲2020 年 5 月 9 日25/55.C+11 Memory ModelIf multiple threads access the same variable concurrently,and at leastone thread modifies it,all threads must use C+11 atomic operations.陈渝(清华大学)第 12 讲2020 年 5 月 9 日26/55.C+11 Memory Modelstd:atomic陈渝(清华大学)第 12 讲2020 年 5 月 9 日27/55.C+11 Memory Modelstd:atomic陈渝(清华大学)第 12 讲2020 年 5 月 9 日28/55.C+11 Memory Modelstd:memory_order陈渝(清华大学)第 12 讲2020 年 5 月 9 日29/55.C+11 Memory Modelstd:memory_order_seq_cstThere is a total order over all seq cst operations.This order contributes to inter-thread orderingconstraints.陈渝(清华大学)第 12 讲2020 年 5 月 9 日30/55.C+11 Memory Modelstd:memory_order_seq_cst陈渝(清华大学)第 12 讲2020 年 5 月 9 日31/55.C+11 Memory Modelstd:memory_order_seq_cstsequential consistency直观上，读操作应该返回“最后”一次写入的值。在单处理器系统中，“最后”由程序次序定义。在多处理器系统中，我们称之为顺序连贯(sequential consistency,SC).约束条件在每个处理器内，维护每个处理器的程序次序；在所有处理器间，维护单一的表征所有操作的次序。对于写操作 W1,W2,不能出现从处理器 P1 看来，执行次序为 W1-W2;从处理器 P2 看来，执行次序却为 W2-W1 这种情况。陈渝(清华大学)第 12 讲2020 年 5 月 9 日32/55.C+11 Memory Modelstd:memory_order_release/acquireAn acquire load makes prior writes to other memory locations made by the thread that did therelease visible in the loading thread.陈渝(清华大学)第 12 讲2020 年 5 月 9 日33/55.C+11 Memory Modelstd:memory_order_release/acquireAn acquire load makes prior writes to other memory locations made by the thread that did therelease visible in the loading thread.陈渝(清华大学)第 12 讲2020 年 5 月 9 日34/55.C+11 Memory Modelstd:memory_order_release/acquireAcquire semanticsAcquire semantics is a property that can only apply to operations that read from sharedmemory,whether they are read-modify-write operations or plain loads.The operationis then considered a read-acquire.Acquire semantics prevent memory reordering ofthe read-acquire with any read or write operation that follows it in program order.陈渝(清华大学)第 12 讲2020 年 5 月 9 日35/55.C+11 Memory Modelstd:memory_order_release/acquireRelease semanticsRelease semantics is a property that can only apply to operations that write to sharedmemory,whether they are read-modify-write operations or plain stores.The operationis then considered a write-release.Release semantics prevent memory reordering of thewrite-release with any read or write operation that precedes it in program order.陈渝(清华大学)第 12 讲2020 年 5 月 9 日36/55.C+11 Memory Modelstd:memory_order_release/acquire陈渝(清华大学)第 12 讲2020 年 5 月 9 日37/55.C+11 Memory Modelstd:memory_order_release/acquirevoid produce()payload=42;guard.store(1,std:memory_order_release)void consume(int iterations)for(int i=0;i iterations;i+)if(guard.load(std:memory_order_acquire)resulti=payload;陈渝(清华大学)第 12 讲2020 年 5 月 9 日38/55.C+11 Memory Modelstd:memory_order_release/acquirehttp:/ 12 讲2020 年 5 月 9 日39/55.C+11 Memory Modelstd:memory_order_release/acquire1000 iterations:http:/ 12 讲2020 年 5 月 9 日40/55.C+11 Memory Modelstd:memory_order_consume(Data Dependency Ordering)A consume load makes prior writes to data-dependent memorylocations made by the thread that did the release visible in the loading thread.Data Dependency陈渝(清华大学)第 12 讲2020 年 5 月 9 日41/55.C+11 Memory Modelstd:memory_order_consume(Data Dependency Ordering)A consume load makes prior writes to data-dependent memorylocations made by the thread that did the release visible in the loading thread.陈渝(清华大学)第 12 讲2020 年 5 月 9 日42/55.C+11 Memory Modelstd:memory_order_consume(Data Dependency Ordering)A consume load makes prior writes to data-dependent memorylocations made by the thread that did the release visible in the loading thread.陈渝(清华大学)第 12 讲2020 年 5 月 9 日43/55.C+11 Memory Modelstd:memory_order_consumex86-64 machine code loads Guard into register rcx,then,if rcx is not null,uses rcxto load the payload,thus creating a data dependency between the two loadinstructions.86-64s strong memory model already guarantees that loads are performedin-order,even if there isnt a data dependency.陈渝(清华大学)第 12 讲2020 年 5 月 9 日44/55.C+11 Memory Modelstd:memory_order_consumePowerPC machine code loads Guard into register r9,then uses r9 to load thepayload,thus creating a data dependency between the two load pletely avoid the”cmp;bne;isync”sequence of instructions that formed amemory barrier in the original example,while still ensuring that the two loads areperformed in-order.陈渝(清华大学)第 12 讲2020 年 5 月 9 日45/55.C+11 Memory Modelstd:memory_order_consumeARMV7 machine code loads Guard into register r4,then uses r4 to load thepayload,thus creating a data dependency between the two load pletely avoid the”dmb ish”instruction that was present in the originalexample,while still ensuring that the two loads are performed in-order.陈渝(清华大学)第 12 讲2020 年 5 月 9 日46/55.C+11 Memory Modelstd:memory_order_consume(Data Dependency Ordering)A consume load makes prior writes to data-dependent memorylocations made by the thread that did the release visible in the loading thread.陈渝(清华大学)第 12 讲2020 年 5 月 9 日47/55.C+11 Memory Modelstd:memory_order_consumereal world RCU在真实世界中使用这一技术利用数据依赖顺序以避免内存栅栏的例子就是 Linux 内核。Linux 提供了读-复制-更新（RCU）的同步机制，适合构建在多个线程中需要多读少写的共享变量(包括指针等)数据结构。Linux 实际上没有使用 C+11 消费语义来去除那些内存栅栏,而是依靠它自己的 API 和规范。其实起初 RCU 就被看作是给 C+11 添加消费语义的动机。陈渝(清华大学)第 12 讲2020 年 5 月 9 日48/55.C+11 Memory Modelstd:memory_order_relaxedThe memory_order_relaxed ensure these operations are atomic,but dont impose any orderingconstraints/memory barriers that arent already there.relaxed order 允许单线程上不同的 memory location 进行 reorder，但是对于同一个 memorylocation 不能进行 reorder。陈渝(清华大学)第 12 讲2020 年 5 月 9 日49/55.Lock-Free Programming陈渝(清华大学)第 12 讲2020 年 5 月 9 日50/55.Lock-Free Programming陈渝(清华大学)第 12 讲2020 年 5 月 9 日51/55.Lock-Free ProgrammingLock-free Stacktemplateclass lock_free_stackprivate:struct nodeT data;node*next;node(T const&data_):/1data(data_);陈渝(清华大学)第 12 讲2020 年 5 月 9 日52/55.Lock-Free ProgrammingLock-free Stackstd:atomic head;public:void push(T const&data)node*const new_node=new node(data);/2new_node-next=head.load();/3while(!pare_exchange_weak(new_node-next,new_node);/4使用“比较/交换”操作：返回 false 时，因为比较失败(例如，head 被其他线程修改)，会使用 head 中的内容更新 new_node-next(第一个参数)的内容。陈渝(清华大学)第 12 讲2020 年 5 月 9 日53/55.Lock-Free ProgrammingLock-free Stackwhile(!pare_exchange_weak(new_node-next,new_node);/4if(head=new_node-next)head=new_node;return true;elsenew_node-next=head;return false;使用“比较/交换”操作：返回 false 时，因为比较失败(例如，head 被其他线程修改)，会使用 head 中的内容更新 new_node-next(第一个参数)的内容。陈渝(清华大学)第 12 讲2020 年 5 月 9 日54/55.Lock-Free ProgrammingLock-free Stackvoid pop(T&result)node*old_head=head.load();while(!pare_exchange_weak(old_head,old_head-next);result=old_head-data;这段代码很优雅，但有关于节点泄露的两个问题。首先，这段代码在空链表时不工作：当head 指针是空指针时，要访问 next 指针时，将引起未定义行为。陈渝(清华大学)第 12 讲2020 年 5 月 9 日55/55

注意事项

本文（高级操作系统高级操作系统 (34).pdf）为本站会员（刘静）主动上传，淘文阁 - 分享文档赚钱的网站仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若此文所含内容侵犯了您的版权或隐私，请立即通知淘文阁 - 分享文档赚钱的网站（点击联系客服），我们立即给予删除！

温馨提示：如果因为网速或其他原因下载失败请重新下载，重复下载不扣分。