Artificial Intelligence, Lecture 11: Supervised Learning
Chapter 11: Supervised Learning

Remaining course schedule
- This week is week 8; all lectures end on Thursday.
- The Lab 3 acceptance check takes place at the end of the second session of week 8.
- The exam is in week 9, on 11.01.
- The Lab 3 report is due on Sunday evening of week 8.

Outline
- Classification: logistic regression, support vector machines (SVM), neural networks, bagging and boosting
- Regression: self-study
- Course paper 3: an SVR case study

Logistic regression: basic theory and method
- Logistic regression studies the relationship between the probability P = P(y = 1) that some event occurs and a set of explanatory factors.
- A probability lies between 0 and 1, whereas a linear combination of features can take values in an arbitrary range, and a label is a scalar with a few discrete states; the logistic transform bridges this gap.
- Logistic regression model: the model works with the odds ratio p / (1 - p) and predicts the probability p as a function of multiple factors.
- Plotting P against a single feature x1 gives an S-shaped curve, with a range where success is most likely and a range where it is least likely. (The original plot is not recoverable from the slides.)

LR model formula (binary classification), in its standard form:
  P(y = 1 | x) = 1 / (1 + exp(-(w^T x + b)))

Multinomial logistic regression (MLR) generalizes this to labels with more than two classes:
  P(y = k | x) = exp(w_k^T x) / sum_j exp(w_j^T x)

Note: in practical classification, feature vectors tend to be high-dimensional and labels tend to have several classes, so the MLR model is preferred.

Support vector machines (SVM)
In machine learning, the support vector machine (SVM) is a supervised learning model commonly used for pattern recognition, classification, and regression analysis.

Linear classifiers
- Binary classification can be viewed as the task of separating classes in feature space.
- Which of the possible linear separators is optimal?

Maximum margin classification
- Maximizing the classification margin is good according to both intuition and PAC theory.
- It implies that only the support vectors matter for classification; the other training samples can be ignored.
- Mathematically (standard form): minimize ||w||^2 / 2 subject to y_i (w^T x_i + b) >= 1 for every training sample (x_i, y_i).

Nonlinear SVMs
- Map the data into a higher-dimensional feature space in which the classes become linearly separable.

In-class discussion.

Neural networks
- Problem solving by decomposing a task into linear subproblems
- Activation functions
- Graphical description of the solution process
- Deep neural networks

Bagging
Bagging is a method for improving the accuracy of a learning algorithm: it constructs a series of prediction functions and then combines them, in a fixed way, into a single prediction function.

Basic idea:
1. We are given a weak learning algorithm and a training set.
2. A single run of the weak learner has low accuracy.
3. Run the learner multiple times to obtain a sequence of prediction functions, and let them vote.
4. The accuracy of the combined result is improved.

Bagging algorithm:
1. For t = 1, 2, ..., T: draw a sample from the dataset S with replacement and train a model Ht on it. To classify an unknown sample X, let each model Ht output a class; the class with the most votes is the prediction for X.
2. For continuous targets, the average of the models' outputs can be used instead of voting.

Required base classifier: bagging needs an "unstable" classification method, one for which a small change in the dataset causes a significant change in the result, for example decision trees or neural networks.

Boosting
- Development history
- Boosting and AdaBoost

Regression
- Polynomial least-squares regression
- Support vector regression (SVR)
- Neural networks and deep neural networks

Case study: Taobao maternal-and-infant age prediction

Age groups:
Group | Age range
1 | 1 to 6 months
2 | 6 to 12 months
3 | 1 to 3 years
4 | 3 to 6 years
5 | older than 6 years

Overall framework of the prediction model.

Feature extraction (example item records):
Item ID | Top-level category | Leaf category | Attribute 1 | Attribute 2
1 | Clothing | Pants | Applicable age: 2 years | Size: L
2 | Food | Milk powder | Applicable age: 3 to 6 months |
3 | Supplies | Baby bottle | Applicable age: 2 years | Capacity: 250 ml

Classifier choice: MLR.
Evaluation metric: Micro F1.
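The Micro F1 evaluation metric pools true positives, false positives, and false negatives over all classes before computing a single F1 value. A minimal pure-Python sketch, with illustrative labels rather than the real Taobao data:

```python
def micro_f1(y_true, y_pred, labels):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1."""
    tp = fp = fn = 0
    for c in labels:
        for t, p in zip(y_true, y_pred):
            if p == c and t == c:
                tp += 1          # predicted c, truly c
            elif p == c:
                fp += 1          # predicted c, truly something else
            elif t == c:
                fn += 1          # truly c, predicted something else
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Illustrative age-group labels 1..5 (invented, not the experimental data)
y_true = [1, 2, 3, 3, 4, 5, 2, 1]
y_pred = [1, 2, 3, 4, 4, 5, 2, 2]
print(micro_f1(y_true, y_pred, labels=range(1, 6)))  # → 0.75
```

Note that for single-label multiclass problems every false positive for one class is a false negative for another, so Micro F1 reduces to overall accuracy; this is consistent with reporting a single overall accuracy figure alongside the per-group Micro F1 values.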
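The bagging procedure described above (bootstrap sampling with replacement, training one model per sample, then majority voting) can be sketched in pure Python. The decision-stump base learner and the 1-D toy dataset below are illustrative stand-ins, not from the course material:

```python
import random
from collections import Counter

def bagging_train(S, train, T=10, seed=0):
    """Train T models, each on a bootstrap sample (drawn with replacement) of S."""
    rng = random.Random(seed)
    return [train([rng.choice(S) for _ in range(len(S))]) for _ in range(T)]

def bagging_predict(models, x):
    """Each model Ht votes for a class; the class with the most votes wins."""
    return Counter(m(x) for m in models).most_common(1)[0][0]

def train_stump(sample):
    """Toy 'unstable' base learner: a one-feature decision stump whose
    threshold is the mean of the sampled points (illustrative only)."""
    mean = sum(x for x, _ in sample) / len(sample)
    left = [y for x, y in sample if x <= mean]
    right = [y for x, y in sample if x > mean]
    left_lbl = Counter(left).most_common(1)[0][0] if left else 0
    right_lbl = Counter(right).most_common(1)[0][0] if right else 1
    return lambda x: left_lbl if x <= mean else right_lbl

# Toy 1-D dataset: (feature, label)
S = [(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
models = bagging_train(S, train_stump, T=25)
print(bagging_predict(models, 0.15), bagging_predict(models, 0.85))
```

Each stump sees a slightly different bootstrap sample and therefore places its threshold differently, which is exactly the "instability" the slides say bagging requires; the vote averages that variance away.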
Experimental results: Micro F1 per age group (reconstructed from the chart data).

Model | Group 1 | Group 2 | Group 3 | Group 4 | Group 5
3M-LR | 0.659 | 0.480 | 0.732 | 0.480 | 0.185
3M-MLR | 0.665 | 0.517 | 0.733 | 0.499 | 0.260
1Y-MLR | 0.702 | 0.597 | 0.778 | 0.577 | 0.368
1Y-TMLR | 0.749 | 0.751 | 0.844 | 0.684 | 0.557
1Y-3TBP-MLR | 0.759 | 0.780 | 0.875 | 0.683 | 0.621

Overall accuracy: 78.2%.

Micro F1 per age group at 70% coverage:

Model | Group 1 | Group 2 | Group 3 | Group 4 | Group 5
1Y-TMLR (100% coverage) | 0.749 | 0.751 | 0.844 | 0.684 | 0.557
1Y-3TBP-MLR (100% coverage) | 0.759 | 0.780 | 0.875 | 0.683 | 0.621
1Y-3TBP-MLR (70% coverage) | 0.852 | 0.841 | 0.918 | 0.787 | 0.716
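The MLR classifier named in the case study can be illustrated with a minimal pure-Python multinomial (softmax) logistic regression trained by stochastic gradient descent. The two-feature, three-class toy data below is invented for the sketch and is unrelated to the item features in the experiment:

```python
import math

def softmax(z):
    m = max(z)                          # subtract max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def train_mlr(X, y, n_classes, lr=0.5, epochs=200):
    """Multinomial logistic regression via stochastic gradient descent.
    W[c][j] is the weight of feature j for class c; a bias feature 1.0 is appended."""
    n_feat = len(X[0]) + 1
    W = [[0.0] * n_feat for _ in range(n_classes)]
    Xb = [x + [1.0] for x in X]
    for _ in range(epochs):
        for x, t in zip(Xb, y):
            p = softmax([sum(w * v for w, v in zip(W[c], x)) for c in range(n_classes)])
            for c in range(n_classes):
                g = p[c] - (1.0 if c == t else 0.0)   # gradient of cross-entropy
                for j in range(n_feat):
                    W[c][j] -= lr * g * x[j]
    return W

def predict(W, x):
    x = x + [1.0]
    scores = [sum(w * v for w, v in zip(row, x)) for row in W]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy 2-feature, 3-class dataset (illustrative only)
X = [[0.0, 0.0], [0.1, 0.2], [1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.2, 0.9]]
y = [0, 0, 1, 1, 2, 2]
W = train_mlr(X, y, n_classes=3)
print([predict(W, x) for x in X])
```

In practice a library implementation with regularization would be used instead; this sketch only makes the softmax formulation from the logistic regression section concrete.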