AI人工智能导论 (28).pdf (title in English: Introduction to Artificial Intelligence)

Uploaded by: 奉***
Document ID: 4060283
Upload date: 2021-01-13
Format: PDF
Pages: 9
Size: 709.03 KB


5. Adversarial Search

Artificial Intelligence : Searching : Adversarial Search

Contents:
5.1. Games
5.2. Optimal Decisions in Games
5.3. Alpha-Beta Pruning
5.4. Imperfect Real-time Decisions
5.5. Stochastic Games
5.6. Monte-Carlo Methods

5.2. Optimal Decisions in Games

Optimal Solution

In normal search: the optimal solution is a sequence of actions leading to a goal state, that is, a terminal state that is a win.

In adversarial search: both MAX and MIN could have an optimal strategy. In the initial state, MAX must find a strategy that specifies MAX's move, then MAX's moves in the states resulting from every possible response by MIN, and so on.

Minimax Theorem

For a zero-sum game, the name minimax arises because each player minimizes the maximum payoff possible for the other. Because the game is zero-sum, each player thereby also minimizes their own maximum loss.

For every two-player, zero-sum game with finitely many strategies, there exists a value V and a mixed strategy for each player, such that
(a) given player 2's strategy, the best payoff possible for player 1 is V, and
(b) given player 1's strategy, the best payoff possible for player 2 is -V.

Optimal Solution in Adversarial Search

Given a game tree, the optimal strategy can be determined from the minimax value of each node, written MINIMAX(n). The minimax value of a node assumes that both players play optimally from there to the end of the game.

The minimax value of a terminal state is just its utility. MAX prefers to move to a state of maximum value; MIN prefers a state of minimum value.

    function MINIMAX(s) returns a utility value
        if TERMINAL-TEST(s) then return UTILITY(s)
        if PLAYER(s) = MAX then return max over a ∈ ACTIONS(s) of MINIMAX(RESULT(s, a))
        if PLAYER(s) = MIN then return min over a ∈ ACTIONS(s) of MINIMAX(RESULT(s, a))

Minimax Decision - A Two-player Game Tree

MAX's best move at the root is a1 (the move with the highest minimax value).
MIN's best reply at B is b1 (the move with the lowest minimax value).

[Figure: two-ply game tree. The root A is a MAX node with moves a1, a2, a3 leading to the MIN nodes B, C, D. The leaves under B (moves b1, b2, b3) are 3, 12, 8; under C (moves c1, c2, c3) they are 2, 4, 6; under D (moves d1, d2, d3) they are 14, 5, 2. Backed-up minimax values: B = 3, C = 2, D = 2, A = 3.]

Minimax Algorithm

    function MINIMAX-DECISION(state) returns an action
        return argmax over a ∈ ACTIONS(state) of MIN-VALUE(RESULT(state, a))

    function MAX-VALUE(state) returns a utility value
        if TERMINAL-TEST(state) then return UTILITY(state)
        v ← −∞
        for each a in ACTIONS(state) do
            v ← MAX(v, MIN-VALUE(RESULT(state, a)))
        return v

    function MIN-VALUE(state) returns a utility value
        if TERMINAL-TEST(state) then return UTILITY(state)
        v ← +∞
        for each a in ACTIONS(state) do
            v ← MIN(v, MAX-VALUE(RESULT(state, a)))
        return v

Properties of Minimax Decision

The minimax algorithm performs a depth-first exploration of the game tree.

Time complexity: O(b^m)
Space complexity: O(bm) if the algorithm generates all actions at once, or O(m) if it generates actions one at a time

where
b is the branching factor (the number of legal moves at each point)
m is the maximum depth of any node

Optimal Decisions in Multi-player Games

Let us examine how to extend the minimax idea to multiplayer games.

We need to replace the single value for each node with a vector of values. E.g., in a three-player game with players A, B, and C, a vector (vA, vB, vC) is associated with each node.

For terminal states, this vector gives the utility of the state from each player's viewpoint. The simplest way to implement this is to have the UTILITY function return a vector of utilities.

Multiplayer games usually involve formal or informal alliances among the players. Alliances are made and broken as the game proceeds.

[Figure: three-player game tree with players A, B, and C moving in turn. Leaf utility vectors, left to right: (1, 2, 6), (4, 2, 3), (6, 1, 2), (7, 4, 1), (5, 1, 1), (1, 5, 2), (7, 7, 1), (5, 4, 5). At each node the player to move backs up the vector with the highest value in that player's own component. The two moves at the root tie with vA = 1, and the root is labeled (1, 2, 6).]

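For the two-player case, the MINIMAX-DECISION, MAX-VALUE, and MIN-VALUE pseudocode can be tried out on the two-ply example tree. The sketch below is only one possible rendering: the dictionary encoding, the helper `is_terminal`, and the pairing of leaf values with the move names b1..d3 in left-to-right order are assumptions for illustration.

```python
import math

# Assumed encoding: a leaf is a number (its utility for MAX);
# an internal node is a dict mapping move names to child nodes.
GAME_TREE = {
    "a1": {"b1": 3, "b2": 12, "b3": 8},
    "a2": {"c1": 2, "c2": 4, "c3": 6},
    "a3": {"d1": 14, "d2": 5, "d3": 2},
}


def is_terminal(state):
    # Plays the role of TERMINAL-TEST: anything that is not a dict is a leaf.
    return not isinstance(state, dict)


def max_value(state):
    if is_terminal(state):
        return state  # UTILITY(state)
    v = -math.inf
    for child in state.values():
        v = max(v, min_value(child))
    return v


def min_value(state):
    if is_terminal(state):
        return state  # UTILITY(state)
    v = math.inf
    for child in state.values():
        v = min(v, max_value(child))
    return v


def minimax_decision(state):
    # argmax over MAX's moves of the value MIN can hold MAX to.
    return max(state, key=lambda a: min_value(state[a]))


best_move = minimax_decision(GAME_TREE)
print(best_move, max_value(GAME_TREE))  # a1 3
```

The recursion is depth-first and evaluates every leaf once, which for this tree is b^m = 3^2 = 9 leaves, in line with the O(b^m) time bound stated above.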

Source link: https://www.taowenge.com/p-4060283.html