《算法设计与分析算法设计与分析 (9).ppt》由会员分享,可在线阅读,更多相关《算法设计与分析算法设计与分析 (9).ppt(46页珍藏版)》请在淘文阁 - 分享文档赚钱的网站上搜索。
1、1Chapter 6Dynamic ProgrammingSlides by Kevin Wayne.Copyright 2005 Pearson-Addison Wesley.All rights reserved.2Algorithmic ParadigmsGreed.Build up a solution incrementally,myopically optimizing some local criterion.Divide-and-conquer.Break up a problem into two sub-problems,solve each sub-problem ind
2、ependently,and combine solution to sub-problems to form solution to original problem.Dynamic programming.Break up a problem into a series of overlapping sub-problems,and build up solutions to larger and larger sub-problems.Bellman.Pioneered the systematic study of dynamic programming in the 1950s.3D
3、ynamic Programming ApplicationsAreas.nBioinformatics.nControl theory.nInformation theory.nOperations research.nComputer science:theory,graphics,AI,systems,.Some famous dynamic programming algorithms.nViterbi for hidden Markov models.nUnix diff for comparing two files.nSmith-Waterman for sequence ali
4、gnment.nBellman-Ford for shortest path routing in networks.nCocke-Kasami-Younger for parsing context free grammars.6.1 Weighted Interval Scheduling5Weighted Interval SchedulingWeighted interval scheduling problem.nJob j starts at sj,finishes at fj,and has weight or value vj.nTwo jobs compatible if t
5、hey dont overlap.nGoal:find maximum weight subset of mutually compatible jobs.Time01234567891011fgheabcd6Unweighted Interval Scheduling ReviewRecall.Greedy algorithm works if all weights are 1.nConsider jobs in ascending order of finish time.nAdd job to subset if it is compatible with previously cho
6、sen jobs.Observation.Greedy algorithm can fail spectacularly if arbitrary weights are allowed.Time01234567891011baweight=999weight=17Weighted Interval SchedulingNotation.Label jobs by finishing time:f1 f2 .fn.Def.p(j)=largest index i j such that job i is compatible with j.Ex:p(8)=5,p(7)=3,p(2)=0.Tim
7、e01234567891011678431258Dynamic Programming:Binary ChoiceNotation.OPT(j)=value of optimal solution to the problem consisting of job requests 1,2,.,j.nCase 1:OPT selects job j.cant use incompatible jobs p(j)+1,p(j)+2,.,j-1 must include optimal solution to problem consisting of remaining compatible jo
8、bs 1,2,.,p(j)nCase 2:OPT does not select job j.must include optimal solution to problem consisting of remaining compatible jobs 1,2,.,j-1optimal substructure9Weighted Interval Scheduling:Brute Force10Weighted Interval Scheduling:Brute ForceObservation.Recursive algorithm fails spectacularly because
9、of redundant sub-problems exponential algorithms.Ex.Number of recursive calls for family of layered instances grows like Fibonacci sequence.34512p(1)=0,p(j)=j-254332212110101011Weighted Interval Scheduling:MemoizationMemoization.Store results of each sub-problem in a cache;lookup as needed.12Weighte
10、d Interval Scheduling:Running TimeClaim.Memoized version of algorithm takes O(n log n)time.nSort by finish time:O(n log n).nComputing p():O(n logn)via binary search.nM-Compute-Opt(j):each invocation takes O(1)time and either(i)returns an existing value Mj(ii)fills in one new entry Mj and makes two r
11、ecursive callsnProgress measure =#nonempty entries of M.initially =0,throughout n.(ii)increases by 1 at most 2n recursive calls.nOverall running time of M-Compute-Opt(n)is O(n).13Weighted Interval Scheduling:Finding a SolutionQ.Dynamic programming algorithms computes optimal value.What if we want th
12、e solution itself?A.Do some post-processing.n#of recursive calls n O(n).14Weighted Interval Scheduling:Bottom-UpBottom-up dynamic programming.Unwind recursion.6.3 Segmented Least Squares16Segmented Least SquaresLeast squares.nFoundational problem in statistic and numerical analysis.nGiven n points i
13、n the plane:(x1,y1),(x2,y2),.,(xn,yn).nFind a line y=ax+b that minimizes the sum of the squared error:Solution.Calculus min error is achieved whenxy17Segmented Least SquaresSegmented least squares.nPoints lie roughly on a sequence of several line segments.nGiven n points in the plane(x1,y1),(x2,y2),
14、.,(xn,yn)with nx1 x2 .xn,find a sequence of lines that minimizes f(x).Q.Whats a reasonable choice for f(x)to balance accuracy and parsimony?xygoodness of fitnumber of lines18Segmented Least SquaresSegmented least squares.nPoints lie roughly on a sequence of several line segments.nGiven n points in t
15、he plane(x1,y1),(x2,y2),.,(xn,yn)with nx1 x2 .0.xy19Dynamic Programming:Multiway ChoiceNotation.nOPT(j)=minimum cost for points p1,p2,.,pj.ne(i,j)=minimum sum of squares for points pi,pi+1,.,pj.To compute OPT(j):nLast segment uses points pi,pi+1,.,pj for some i.nCost=e(i,j)+c+OPT(i-1).20Segmented Le
16、ast Squares:AlgorithmRunning time.O(n3).nBottleneck=computing e(i,j)for O(n2)pairs,O(n)per pair using previous formula.6.4 Knapsack Problem22Knapsack ProblemKnapsack problem.nGiven n objects and a knapsack.nItem i weighs wi 0 and has value vi 0.nKnapsack has capacity of W.nGoal:fill knapsack so as t
17、o maximize total value.Ex:3,4 has value 40.Greedy:repeatedly add item with maximum ratio vi/wi.Ex:5,2,1 achieves only value=35 greedy not optimal.1Value1822281Weight56627Item13452W=1123Dynamic Programming:Adding a New VariableDef.OPT(i,w)=max profit subset of items 1,i with weight limit w.nCase 1:OP
18、T does not select item i.OPT selects best of 1,2,i-1 using weight limit w nCase 2:OPT selects item i.new weight limit=w wiOPT selects best of 1,2,i1 using this new weight limit24Knapsack Problem:Bottom-Up25Knapsack Algorithmn+11Value1822281Weight56627Item13452 1,2 1,2,3 1,2,3,4 1 1,2,3,4,5 000000010
19、11111206661630777174077717507181811860719221227072424128807252812990725291341007252913411072540140W+1W=11OPT:4,3 value=22+18=4026Knapsack Problem:Running TimeRunning time.(n W).nNot polynomial in input size!nPseudo-polynomial.nDecision version of Knapsack is NP-complete.Chapter 8Knapsack approximati
20、on algorithm.There exists a polynomial algorithm that produces a feasible solution that has value within 0.01%of optimum.Section 11.86.5 RNA Secondary Structure28RNA Secondary StructureRNA.String B=b1b2bn over alphabet A,C,G,U.Secondary structure.RNA is single-stranded so it tends to loop back and f
21、orm base pairs with itself.This structure is essential for understanding behavior of molecule.GUCAGAAGCGAUGAUUAGACAACUGAGUCAUCGGGCCGEx:GUCGAUUGAGCGAAUGUAACAACGUGGCUACGGCGAGAcomplementary base pairs:A-U,C-G29RNA Secondary StructureSecondary structure.A set of pairs S=(bi,bj)that satisfy:nWatson-Crick
22、.S is a matching and each pair in S is a Watson-Crick complement:A-U,U-A,C-G,or G-C.nNo sharp turns.The ends of each pair are separated by at least 4 intervening bases.If(bi,bj)S,then i j-4.nNon-crossing.If(bi,bj)and(bk,bl)are two pairs in S,then we cannot have i k j l.Free energy.Usual hypothesis i
23、s that an RNA molecule will form the secondary structure with the optimum total free energy.Goal.Given an RNA molecule B=b1b2bn,find a secondary structure S that maximizes the number of base pairs.approximate by number of base pairs30RNA Secondary Structure:ExamplesExamples.CGGCAGUUUAAU G U GGCCA UG
24、GCAGUUAAU GGGCA UCGGCAUGUUAAG U U GGCCA Usharp turncrossingokGG4base pair31RNA Secondary Structure:SubproblemsFirst attempt.OPT(j)=maximum number of base pairs in a secondary structure of the substring b1b2bj.Difficulty.Results in two sub-problems.nFinding secondary structure in:b1b2bt-1.nFinding se
25、condary structure in:bt+1bt+2bn-1.1tnmatch bt and bnOPT(t-1)need more sub-problems32Dynamic Programming Over IntervalsNotation.OPT(i,j)=maximum number of base pairs in a secondary structure of the substring bibi+1bj.nCase 1.If i j-4.OPT(i,j)=0 by no-sharp turns condition.nCase 2.Base bj is not invol
26、ved in a pair.OPT(i,j)=OPT(i,j-1)nCase 3.Base bj pairs with bt for some i t j-4.non-crossing constraint decouples resulting sub-problemsOPT(i,j)=1+maxt OPT(i,t-1)+OPT(t+1,j-1)take max over t such that i t j-4 andbt and bj are Watson-Crick complements33Bottom Up Dynamic Programming Over IntervalsQ.Wh
27、at order to solve the sub-problems?A.Do shortest intervals first.Running time.O(n3).0000002341i6789j34Dynamic Programming SummaryRecipe.nCharacterize structure of problem.nRecursively define value of optimal solution.nCompute value of optimal solution.nConstruct optimal solution from computed inform
28、ation.Dynamic programming techniques.nBinary choice:weighted interval scheduling.nMulti-way choice:segmented least squares.nAdding a new variable:knapsack.nDynamic programming over intervals:RNA secondary structure.Top-down vs.bottom-up:different people have different intuitions.6.6 Sequence Alignme
29、nt36String SimilarityHow similar are two strings?nocurrancenoccurrenceocurranceccurrenceo-ocurrnceccurrnceo-ae-ocurranceccurrenceo-5 mismatches,1 gap1 mismatch,1 gap0 mismatches,3 gaps37Applications.nBasis for Unix diff.nSpeech recognition.nComputational biology.Edit distance.Levenshtein 1966,Needle
30、man-Wunsch 1970nGap penalty;mismatch penalty pq.nCost=sum of gap and mismatch penalties.2+CACGACCTACCTCTGACTACATTGACCTACCTCTGACTACAT-TCCCTC+GT+AG+2CA-Edit Distance38Goal:Given two strings X=x1 x2.xm and Y=y1 y2.yn find alignment of minimum cost.Def.An alignment M is a set of ordered pairs xi-yj such
31、 that each item occurs in at most one pair and no crossings.Def.The pair xi-yj and xi-yj cross if i j.Ex:CTACCG vs.TACATG.Sol:M=x2-y1,x3-y2,x4-y3,x5-y4,x6-y6.Sequence AlignmentCTACC-TACAT-GGy1y2y3y4y5y6x2x3x4x5x1x639Sequence Alignment:Problem StructureDef.OPT(i,j)=min cost of aligning strings x1 x2.
32、xi and y1 y2.yj.nCase 1:OPT matches xi-yj.pay mismatch for xi-yj +min cost of aligning two stringsx1 x2.xi-1 and y1 y2.yj-1 nCase 2a:OPT leaves xi unmatched.pay gap for xi and min cost of aligning x1 x2.xi-1 and y1 y2.yj nCase 2b:OPT leaves yj unmatched.pay gap for yj and min cost of aligning x1 x2.
33、xi and y1 y2.yj-1 40Sequence Alignment:AlgorithmAnalysis.(mn)time and space.6.7 Shortest Paths42Shortest PathsShortest path problem.Given a directed graph G=(V,E),with edge weights cvw,find shortest path from node s to node t.Ex.s3t267451018-169 615-8 30 204416116196allow negative weights43Shortest
34、Paths:Failed AttemptsDijkstra.Can fail if negative edge costs.Re-weighting.Adding a constant to every edge weight can fail.utsv2 13-6st2 32-335566044Shortest Paths:Negative Cost CyclesNegative cost cycle.Observation.If some path from s to t contains a negative cost cycle,there does not exist a short
35、est s-t path;otherwise,there exists one that is simple.stWc(W)0-6 7-445Shortest Paths:Dynamic ProgrammingDef.OPT(i,v)=length of shortest v-t path P using at most i edges.nCase 1:P uses at most i-1 edges.OPT(i,v)=OPT(i-1,v)nCase 2:P uses exactly i edges.if(v,w)is first edge,then OPT uses(v,w),and then selects best w-t path using at most i-1 edgesRemark.By previous observation,if no negative cycles,thenOPT(n-1,v)=length of shortest v-t path.46Shortest Paths:ImplementationAnalysis.(mn)time,(n2)space.Finding the shortest paths.Maintain a successor for each table entry.
限制150内