Introduction to Machine Learning (7): Support Vector Machine
Welcome to Introduction to Machine Learning!
(*Images come from the Internet.)

Coffee Time: What makes a good experimental report? By TA: Chenyang Wang

Topic 8: Support Vector Machine
Min Zhang

Outline
- Background
- Linear support vector machine
  - Max margin linear classifier
  - Dual problem formulation
  - Linearly non-separable case
- Kernel support vector machine
- Appendix

Classification
- Sentiment: Pos: "This skirt is so beautiful!" / Neg: "It looks ugly on my body." / Neutral: "It is pure cotton."
- Music genre: Rock: Michael Jackson, "Beat It" / Hip Hop: Eminem, "Lose Yourself" / Blues: Muddy Waters, "I Can't Be Satisfied"

Classification methods
- Decision tree: attributes of instances are nominal data; the objective function is discrete.
- K-nearest neighbor: instances are points in a (e.g. Euclidean) space; the objective function can be discrete or continuous.
- Support vector machine: instances are points in a (e.g. Euclidean) space; the objective function can be discrete or continuous. (A short sketch contrasting the three follows this list.)
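The slides contain no code; the following is a minimal scikit-learn sketch contrasting the three classifier families on the same task. The synthetic dataset and hyperparameters are illustrative assumptions, not from the lecture.

```python
# Hypothetical comparison of the three classifier families on toy 2-D data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for clf in (DecisionTreeClassifier(random_state=0),
            KNeighborsClassifier(n_neighbors=5),
            SVC(kernel="linear")):
    clf.fit(X_tr, y_tr)
    print(f"{type(clf).__name__}: test accuracy = {clf.score(X_te, y_te):.2f}")
```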
Background message
- The present form of the support vector machine (SVM) was largely developed at AT&T Bell Laboratories by Vapnik and co-workers.
- Known as a maximum margin classifier.
- Originally proposed for classification, and soon applied to regression and time-series prediction.
- One of the most efficient supervised learning methods.
- Has been used as a strong baseline for text processing approaches.
Problem
Given a set of training samples $(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$ with $y_i \in \{-1, +1\}$, find a function $f(x, w)$, where $w$ denotes the parameters, that classifies the samples correctly, i.e. $y_i f(x_i, w) > 0$ for all $i$.
- For a testing sample $x$, we predict its label by $\operatorname{sign} f(x, w)$.
- $f(x, w) = 0$ is called the separation hyperplane.

Linear classifiers
Consider the linearly separable case: there are infinitely many hyperplanes that can do the job.
- Any of these lines would be fine... but which is the best one?
- How would you classify this data?
Margin of a linear classifier
- Take two hyperplanes parallel to the separation hyperplane, one on each side of it, and move them away from it. When each first hits a data point, the distance between the two is called the margin of the linear classifier.
- Margin: the width that the boundary could be increased by before hitting a data point.

Maximum margin linear classifier
- Definition: the linear classifier with the maximum margin.
- Support vectors: those data points that the margin pushes up against.

Problem formulation
To formulate the margin, we further require that for all samples
$$\langle w, x_i \rangle + b \ge 1 \ \text{if}\ y_i = +1, \qquad \langle w, x_i \rangle + b \le -1 \ \text{if}\ y_i = -1,$$
or, equivalently,
$$y_i(\langle w, x_i \rangle + b) \ge 1.$$
We have introduced two additional hyperplanes $\langle w, x \rangle + b = \pm 1$, parallel to the separation hyperplane $\langle w, x \rangle + b = 0$.
Problem formulation
- What is the margin? The distance between the two new hyperplanes.
- What is its expression? Denote by $d_+$ the minimum distance between the hyperplane $\langle w, x \rangle + b = 1$ and the origin, and by $d_-$ the minimum distance between the hyperplane $\langle w, x \rangle + b = -1$ and the origin. The margin is $|d_+ - d_-|$.
- How to calculate $d_+$ and $d_-$? Write a point along the normal direction as $x = d \cdot w/\|w\|$, where $w/\|w\|$ is the unit normal vector. Requiring this point to lie on each hyperplane gives $d_+ = (1-b)/\|w\|$ and $d_- = (-1-b)/\|w\|$, so the margin is $|d_+ - d_-| = 2/\|w\|$ (the algebra is collected below).
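Collected as one worked derivation, consistent with the definitions above:

```latex
% Worked derivation of the margin (notation as above).
\begin{align*}
  \Big\langle d\,\tfrac{w}{\|w\|},\, w \Big\rangle + b = 1
    &\;\Longrightarrow\; d\,\|w\| + b = 1
     \;\Longrightarrow\; d_{+} = \frac{1-b}{\|w\|},\\
  \Big\langle d\,\tfrac{w}{\|w\|},\, w \Big\rangle + b = -1
    &\;\Longrightarrow\; d_{-} = \frac{-1-b}{\|w\|},\\
  \text{margin} = |d_{+} - d_{-}|
    &= \left|\frac{(1-b) - (-1-b)}{\|w\|}\right| = \frac{2}{\|w\|}.
\end{align*}
```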
Problem formulation
The optimization problem:
$$\max_{w,b}\; \frac{2}{\|w\|} \quad \text{s.t.} \quad y_i(\langle w, x_i \rangle + b) \ge 1,\; i = 1, \ldots, N,$$
or, equivalently,
$$\min_{w,b}\; \frac{1}{2}\|w\|^2 \quad \text{s.t.} \quad y_i(\langle w, x_i \rangle + b) \ge 1,\; i = 1, \ldots, N.$$
Although it seems that the margin is decided only by $w$, $b$ also affects the margin implicitly via its impact on $w$ in the constraints. (A minimal solver sketch follows.)
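This quadratic program can be handed directly to a generic convex solver. A minimal sketch, assuming cvxpy is available and using hypothetical well-separated toy data (not from the lecture):

```python
# Solve the hard-margin primal QP: min 1/2 ||w||^2  s.t.  y_i(<w, x_i> + b) >= 1.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated Gaussian blobs, so the hard-margin problem is feasible.
X = np.vstack([rng.normal(+3.0, 1.0, size=(20, 2)),
               rng.normal(-3.0, 1.0, size=(20, 2))])
y = np.array([+1.0] * 20 + [-1.0] * 20)

w = cp.Variable(2)
b = cp.Variable()
constraints = [cp.multiply(y, X @ w + b) >= 1]
cp.Problem(cp.Minimize(0.5 * cp.sum_squares(w)), constraints).solve()

print("w =", w.value, " b =", b.value)
print("margin = 2/||w|| =", 2 / np.linalg.norm(w.value))
```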
Dual problem formulation
- Primal problem:
$$\min_{w,b}\; \frac{1}{2}\|w\|^2 \quad \text{s.t.} \quad y_i(\langle w, x_i \rangle + b) \ge 1,\; i = 1, \ldots, N.$$
- Lagrange function:
$$L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{N} \alpha_i \big( y_i(\langle w, x_i \rangle + b) - 1 \big), \qquad \alpha_i \ge 0.$$
- KKT conditions (Karush-Kuhn-Tucker); see the block below.
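For reference, the standard KKT conditions for this primal, in the lecture's notation:

```latex
% KKT conditions of the hard-margin SVM (standard form).
\begin{align*}
  \frac{\partial L}{\partial w} = 0
    &\;\Longrightarrow\; w = \sum_{i=1}^{N} \alpha_i y_i x_i,\\
  \frac{\partial L}{\partial b} = 0
    &\;\Longrightarrow\; \sum_{i=1}^{N} \alpha_i y_i = 0,\\
  \alpha_i \ge 0,\qquad
  y_i(\langle w, x_i \rangle + b) - 1 \ge 0,
    &\qquad
  \alpha_i\big(y_i(\langle w, x_i \rangle + b) - 1\big) = 0
  \quad \text{(complementary slackness).}
\end{align*}
```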
Dual problem formulation
Substituting these results into $L(w, b, \alpha)$ gives (try it yourself) the dual problem:
$$\max_{\alpha}\; \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j \langle x_i, x_j \rangle
\quad \text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0,\; \alpha_i \ge 0.$$

Support vectors
- According to the KKT conditions, $\alpha_i$ is nonzero only if $y_i(\langle w, x_i \rangle + b) = 1$, i.e., $x_i$ lies on the boundaries of the margin. These $x_i$'s are the support vectors (SVs).
- Most $\alpha_i$'s are zero; then $(x_i, y_i)$ has no impact on $f(x)$: a sparse solution.

Solution to the primal problem (by the dual problem)
- Normal vector: $w = \sum_{i=1}^{N} \alpha_i y_i x_i$.
- Bias: $b = y_s - \langle w, x_s \rangle$ for any support vector $(x_s, y_s)$.
- Hyperplane: $\langle w, x \rangle + b = \sum_{i=1}^{N} \alpha_i y_i \langle x_i, x \rangle + b = 0$.
- Note that $\alpha$ is sparse: the hyperplane is determined only by the SVs! (See the sketch below.)
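A minimal sketch of this sparsity with scikit-learn (assumed available; the toy data are hypothetical). With a very large C, SVC approximates the hard-margin classifier, and its dual coefficients give exactly the quantities above:

```python
# Recover w, b and the support vectors from the dual solution found by SVC.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(+3.0, 1.0, size=(20, 2)),
               rng.normal(-3.0, 1.0, size=(20, 2))])
y = np.array([+1] * 20 + [-1] * 20)

clf = SVC(kernel="linear", C=1e10).fit(X, y)  # huge C ~ hard margin

# dual_coef_ holds alpha_i * y_i for the support vectors only (alpha is sparse).
w = clf.dual_coef_ @ clf.support_vectors_     # w = sum_i alpha_i y_i x_i
b = clf.intercept_[0]

print("support vector indices:", clf.support_)  # typically few of the 40 points
print("w =", w.ravel(), " b =", b)
```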
Summary so far
SVM in the linearly separable case:
- Maximize the margin.
- SVs: the samples whose corresponding $\alpha_i > 0$.
- Primal problem: $\min_{w,b} \frac{1}{2}\|w\|^2$ s.t. $y_i(\langle w, x_i \rangle + b) \ge 1$.
- Dual problem: $\max_{\alpha} \sum_i \alpha_i - \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j \langle x_i, x_j \rangle$ s.t. $\sum_i \alpha_i y_i = 0$, $\alpha_i \ge 0$.
Linearly non-separable case
- Recall that in the linearly separable case, the constraints $y_i(\langle w, x_i \rangle + b) \ge 1$ ensure zero training classification error.
- In the non-separable case there must be errors, so we minimize $\|w\|$ as well as the training classification error:
$$\min_{w,b}\; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \ell(x_i, y_i),$$
where $C > 0$ is a constant that balances the two terms.

Loss functions: 0/1 loss and hinge loss
[Figure: the 0/1 loss and the hinge loss plotted against $y f(x)$.]
Recall that a prediction is correct when $y_i f(x_i) > 0$. Define, for each sample:
- 0/1 loss: $\ell_{0/1} = 1$ if $y_i f(x_i) \le 0$, else $0$.
- Hinge loss: $\ell_{\text{hinge}} = \max(0,\, 1 - y_i f(x_i))$; it linearly penalizes samples with $y_i f(x_i) < 1$.
(PS: the separation hyperplane is $\langle w, x \rangle + b = 0$, i.e. $f = 0$.)

More loss functions
Three common loss functions to replace the 0/1 loss (see the sketch after this subsection):
- Hinge loss: $\max(0, 1 - y f(x))$
- Exponential loss: $\exp(-y f(x))$
- Logistic loss: $\log(1 + \exp(-y f(x)))$

Formulation with hinge loss
Introduce slack variables $\xi_i \ge 0$. The problem becomes
$$\min_{w,b,\xi}\; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \xi_i
\quad \text{s.t.} \quad y_i(\langle w, x_i \rangle + b) \ge 1 - \xi_i,\; \xi_i \ge 0,$$
since minimizing the hinge loss $\max(0, 1 - y_i f(x_i))$ is equivalent to minimizing $\xi_i$ subject to $\xi_i \ge 1 - y_i f(x_i)$ and $\xi_i \ge 0$.
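A minimal numpy sketch of these losses as functions of the functional margin $z = y\,f(x)$ (the helper names are mine, not the lecture's):

```python
# The four losses discussed above, as functions of z = y * f(x).
import numpy as np

def zero_one_loss(z):
    return (z <= 0).astype(float)      # 1 on a wrong (or boundary) prediction

def hinge_loss(z):
    return np.maximum(0.0, 1.0 - z)    # linear penalty whenever z < 1

def exponential_loss(z):
    return np.exp(-z)

def logistic_loss(z):
    return np.log1p(np.exp(-z))

z = np.linspace(-2.0, 2.0, 5)
for fn in (zero_one_loss, hinge_loss, exponential_loss, logistic_loss):
    print(f"{fn.__name__:>16}: {fn(z)}")
```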
Compare with the separable case: the only new variables are the slacks $\xi_i$.

Soft margin
Still want to find the maximum margin hyperplane, but this time:
- we allow some training examples to be misclassified;
- we allow some training examples to fall within the margin region.

Soft margin
- For $\xi_i = 0$, the data point falls on the boundary of the region of separation, or outside the region of separation and on the right side of the decision surface.
- For $0 < \xi_i \le 1$, the data point falls inside the region of separation, but still on the right side of the decision surface.
- For $\xi_i > 1$, the data point falls on the wrong side of the separating hyperplane and introduces a wrong decision.

Soft margin
The positive constant $C$ controls the balance between a large margin and a small misclassification error (structural risk vs. empirical risk); see the sketch below:
- large $C$: prefer small error;
- small $C$: prefer large margin.

Dual problem
The dual problem in the non-separable case keeps the same objective, now with a box constraint on $\alpha$ (the source text is cut off here; this is the standard result):
$$\max_{\alpha}\; \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j \langle x_i, x_j \rangle
\quad \text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0,\; 0 \le \alpha_i \le C.$$
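A minimal sketch of the C trade-off on hypothetical overlapping toy data (scikit-learn assumed; not from the lecture):

```python
# Small C -> large margin, more violations; large C -> small training error.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# Overlapping blobs: the data are not linearly separable.
X = np.vstack([rng.normal(+1.0, 1.0, size=(50, 2)),
               rng.normal(-1.0, 1.0, size=(50, 2))])
y = np.array([+1] * 50 + [-1] * 50)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    w = (clf.dual_coef_ @ clf.support_vectors_).ravel()
    print(f"C={C:>6}: margin={2 / np.linalg.norm(w):.3f}, "
          f"train error={1 - clf.score(X, y):.3f}, #SV={len(clf.support_)}")
```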