Machine Learning (机械学习ー.pptx)
Slide text of 机械学习ー.pptx (81 slides), shared by a member of 淘文阁.
Slide 1: Machine Learning
岡田孝 (Takashi Okada)
TUT 2000/06/07

Slide 2:
- Knowledge discovery, intelligent analysis
- Paper [...], cans [...]
- Large volume
- Diversification of targets; on the Web

Slide 3: Knowledge discovery
Diagram labels: core DB, external DB, extraction, transformation, integration, evaluation, visualization, knowledge.

Slide 4: Techniques
- Statistics
- Recognition
- Decision trees
- Rough set
- Association rules
- Graph Based Induction
- Inductive logic programming
- Variable selection
- Visualization

Slide 5: What is supervised learning?
- Input instances contain class attributes and explanatory attributes.
- Generate rules that describe the classes inductively: IF conditions THEN class.
- Learning from examples; incorporation of background knowledge.
cf. regression, discriminant analysis, neural networks, nearest neighbor

Slide 6: Typical applications
- Knowledge acquisition for use in plant-operating expert systems
- Action prediction of opponent teams in sports matches
- Diagnosis from medical tests
- Discovery of active motifs in chemical compounds from structure-activity relationship datasets

Slide 7: Classification of Problems

Slide 8: Streams in learning research I. Classification
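Slide 9: Decision tree method

Slide 10: Decision tree
Figure: a tree that splits on hair colour and eye colour (branch values: brown, blue, red, black).
Leaf examples as extracted (height, hair colour, eye colour): tall, red, blue; short, black, blue; tall, black, blue; tall, black, brown; short, blue; tall, blue; tall, brown; short, brown.

Slide 11: Variable selection by average information (entropy)
- Average information
- Before classification

Slide 12: Average information after classification, and the gain
- Classification by height: 0.003 bit
- Classification by hair colour: 0.454 bit
- Classification by eye colour: 0.347 bit

Slide 13: Diabetes diagnosis tree using combinations of numeric attributes

Slide 14: Progress in Decision Tree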
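The gains quoted on slide 12 can be reproduced with a short calculation. The sketch below is not taken from the slides: it assumes the example set is Quinlan's well-known eight-example toy data, that the hair colour missing from four of the extracted leaf examples is "blond", and that the class labels are as listed (the extract preserves neither). Under those assumptions the computed gains agree with the slide's 0.003, 0.454 and 0.347 bit to within rounding.

```python
from collections import Counter
from math import log2

# Assumed data (height, hair, eyes, class). The attribute triples match the
# leaf examples on slide 10; "blond" and the +/- labels are assumptions.
EXAMPLES = [
    ("short", "blond", "blue", "+"),
    ("tall", "blond", "brown", "-"),
    ("tall", "red", "blue", "+"),
    ("short", "black", "blue", "-"),
    ("tall", "black", "blue", "-"),
    ("tall", "blond", "blue", "+"),
    ("tall", "black", "brown", "-"),
    ("short", "blond", "brown", "-"),
]
ATTRIBUTES = {"height": 0, "hair": 1, "eyes": 2}


def entropy(labels):
    """Average information (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(examples, column):
    """Entropy before the split minus the weighted entropy after it."""
    before = entropy([row[-1] for row in examples])
    after = 0.0
    for value in {row[column] for row in examples}:
        subset = [row[-1] for row in examples if row[column] == value]
        after += len(subset) / len(examples) * entropy(subset)
    return before - after


gains = {name: information_gain(EXAMPLES, col) for name, col in ATTRIBUTES.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:.3f} bit")
```

Hair colour gives the largest gain, which is consistent with the tree on slide 10 splitting on hair colour first and on eye colour below it.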
- Variables with continuous values
- Entropy gain ratio, Gini index
- Sampling
- Pruning
- Bagging, Boosting
- User interface
  - Interactive expansion of a tree
  - Visualization
  - Rules

Slide 15: Gini index vs. Entropy
Gini index $= \sum_i P_i (1 - P_i) = 1 - \sum_i P_i^2$

Slide 16: Decision tree method
- 秋葉, 金田: Toward applications of learning-from-examples techniques, 情報処理学会誌, Vol. 39, No. 2, pp. 145-151; No. 3, pp. 245-251 (1998).
- Breiman, L., Friedman, J.H., Olshen, R.A. & Stone, C.J.: Classification and Regression Trees, The Wadsworth & Brooks/Cole (1984). CART
- Quinlan, J.R.: C4.5: Programs for Machine Learning, Morgan Kaufmann (1993). Japanese translation by 古川: Data Analysis by AI (1995).

Slide 17: Streams in learning research III. Rough set
- Characteristics
  - Non-exploratory
  - Methodology for decision tables
  - Analysis of variable dependencies
  - NP-hard in the number of attributes and values
- References
  - Pawlak, Z.: Rough Sets: Theoretical Aspects of Reasoning about Data, Kluwer Academic Publishers (1991).
  - W. Ziarko: Review of Basics of Rough Sets in the Context of Data Mining, Proc. Fourth International Workshop on Rough Sets, Fuzzy Sets, and Machine Discovery, pp. 447-457, Tokyo (1996).
  - Datalogic/R: Reduct Systems Inc.

Slide 18: Rough set
Figure labels: positive region, boundary region, negative region.

Slide 19: Computation process: discretization and classification
- Reduct1 = {Size, Height, Energy}
- Reduct2 = {Size, Height, Current}
- Core = {Size, Height}

Slide 20: Explanatory variables P, target variable Q
- P = {Size, Height, Energy, Current}
- Q = {Temperature}
- Reduct1(P,Q) = {Height, Energy}
- Reduct2(P,Q) = {Height, Current}
- Core(P,Q) = {Height}

Slide 21: Computation process: deriving rules from the decision matrix
- $B_1^1 = ((S,1) \lor (E,2) \lor (C,1)) \land ((H,0) \lor (E,2) \lor (C,1)) \land ((E,2) \lor (C,1)) = (E,2) \lor (C,1)$
- $B_2^1 = ((H,2) \lor (C,1)) \land ((S,0) \lor (H,2) \lor (C,1)) \land ((S,0) \lor (H,2) \lor (C,1)) = (H,2) \lor (C,1)$
- $B_3^1 = ((S,1) \lor (H,2)) \land (H,2) \land (H,2) = (H,2)$
- $B_4^1 = ((S,1) \lor (H,2) \lor (E,2) \lor (C,1)) \land ((H,2) \lor (E,2) \lor (C,1)) \land ((H,2) \lor (E,2) \lor (C,1)) = (H,2) \lor (E,2) \lor (C,1)$
- $B_5^1 = ((E,2) \lor (C,1)) \land ((S,0) \lor (H,0) \lor (E,2) \lor (C,1)) \land ((S,0) \lor (E,2) \lor (C,1)) = (E,2) \lor (C,1)$
- (Energy=2) → (Temperature=1)
- (Current=1) → (Temperature=1)
- (Height=2) → (Temperature=1)

Slide 22: Variable Precision Rough Set Model
Figure labels: positive region, boundary region, negative region.

Slide 23: Variable Dependency Analysis
Necessary and Sufficient Variable Sets
Figure labels: Core, Reduct 1, Reduct 2, Reduct 3, Reduct 4, Reduct 5.

Slide 24: Cars example
Reducts:
- (1) cyl, fuelsys, comp, power, weight
- (2) size, fuelsys, comp, power, weight
- (3) size, fuelsys, displace, weight
- (4) size, cyl, fuelsys, power, weight
- (5) cyl, turbo, fuelsys, displace, comp, trans, weight
- (6) size, cyl, fuelsys, comp, weight
- (7) size, cyl, turbo, fuelsys, trans, weight
Core: fuelsys, weight
Ziarko: The discovery, analysis, and representation of data dependencies in databases, Knowledge Discovery in Databases, pp. 195-209, Piatetsky-Shapiro & Frawley ed., AAAI Press (1991).

Slide 25: Reduct & Core Effects to Sum of Squares
Chart: Net-power (axis from 0 to 12) plotted against the variables Size, cyl, turbo, fuelsys, displace, comp, power, trans, weight.

Slide 26: Rough Set Method as a Tool of Data Analysis
- Very good rules for understanding
Despite
  - Too many reducts
  - Number of reducts changes with the confidence value in VPRSM
  - Disregard of frequencies

Slide 27: Rough set
- Pawlak, Z.: Rough Sets: Theoretical Aspects of Reasoning about Data, Kluwer Academic Publishers (1991).
- W. Ziarko: Review of Basics of Rough Sets in the Context of Data Mining, Proc. Fourth International Workshop on Rough Sets, Fuzzy Sets, and Machine Discovery, pp. 447-457, Tokyo (1996).
- Datalogic/R: Reduct Systems Inc.
- Characteristics as a methodology
  - A methodology for discrete representations
  - Knowledge can be acquired from co-occurrence distributions
  - Computational cost: number of [...] N; number of attributes and attribute values: exp(N)

Slide 28: Streams in learning research II. Characteristic Rules
- Evaluation by usefulness
- Patterns with accuracy & support
- Statistical estimation of generality and accuracy
  鈴木 (1999): Discovery of characteristic [...]: a method for simultaneous reliability evaluation of generality and accuracy, 人工知能学会誌, 14, 139-147.
- Exceptions as interestingness
  鈴木, 志村 (1997): Discovery of exceptional knowledge using an information-theoretic method, 人工知能学会誌, 12, 305-312.
- Rating usefulness by human estimation; rule generation by genetic algorithm
  Terano, T. and Ishino, Y. (1996): Interactive knowledge discovery from marketing questionnaire using simulated breeding and inductive learning methods, Proc. KDD-96, 279-282.
- Market basket analysis

Slide 29: Extraction of association rules
Association rules mining

Slide 30: Apriori algorithm
Figure label: candidate sets.

Slide 31: Time-series analysis (1)
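Slide 30 names the Apriori algorithm and its candidate sets, but the figure itself is not preserved. As a minimal sketch of the general technique (not a reconstruction of the slide), the code below generates candidates level by level, joining frequent k-itemsets and pruning any candidate that has an infrequent subset. The function name `apriori`, the basket contents and the support threshold are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for itemsets with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    # Level-1 candidates: every single item that occurs anywhere.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: unions of two frequent k-itemsets that have k + 1 items.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k - 1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical market baskets, for illustration only.
baskets = [
    {"bread", "milk", "jam"},
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]
frequent = apriori(baskets, min_support=3)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The prune step is what keeps the candidate sets small: an itemset can only be frequent if all of its subsets are, so most combinations are discarded before any counting pass over the data.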
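Slide 32: Time-series analysis (2)

Slide 33: Introducing a classification hierarchy
Figure labels: beverages, soft drinks, weak alcoholic drinks, strong alcoholic drinks.

Slide 34: Handling numeric attributes
- Discretization
- Merging of ranges, bounded by Max-support; multiple ranges
- Computation of frequent itemsets, derivation of rules
- Pruning by rule interest
- The concept of partial completeness: setting intervals, ensuring soundness
Srikant, R. & Agrawal, R.: Mining Quantitative Association Rules in Large Relational Tables, Proc. ACM SIGMOD, pp. 1-12 (1996).
Table captions: People; Frequent itemset (part).

Slide 35: Factor analysis by introducing virtual [...]
沼尾, 清水: [...] in the distribution industry, 人工知能学会誌, Vol. 12, No. 4, pp. 528-535 (1997).

Slide 36: Time-series case
- Symbolization and recognition of time series
- Supervised inductive learning
Figure labels: pressure, temperature, occurrence of abnormality, maximum delay time.
IF pressure rises AND temperature falls THEN abnormality occurs: probability 80%
佐藤: Application of [...] technology, 情報処理学会関西支部 workshop.

Slide 37: Extension to structured [...]: history analysis
猪口 et al.: 人工知能学会基礎論研究会 SIG-FAI-9801-10, pp. 55-60 (1998).

Slide 38: Association search

Slide 39: Graph Based Induction: stepwise pair expansion
- Input
- Step 1: rewrite the input
- Step 2: count [...] in the input
- Step 3: select
- Extraction
吉田, 元田: Inductive inference based on stepwise pair expansion, 人工知能学会誌, Vol. 12, pp. 58-67 (1997).

Slide 40: Application of GBI to operation-history analysis
Figure labels: emacs, lpr, dvi2ps, xdvi, latex, paper.ps, paper.dvi, paper.tex, command, file.

Slide 41: Characteristics of Graph Based Induction
- Fast; structured [...] can be analysed
- Applicable to concept acquisition, classification-rule learning, and speeding up inference alike
- Application to sequences (DNA, protein); expressing negative conditions requires some ingenuity
- Limited to ordered graphs
- Rules and concepts are limited to connected [...]
- Duplication [...] obstacle; complex [...] difficult to handle

Slide 42: Inductive logic programming: the simplest execution example
- Background knowledge
  parent(1,2). parent(1,3).
- Positive examples (all others are negative)
  grandparent(1,4). grandparent(1,5).
- Result
  grandparent(X,Y) :- parent(X,Z), parent(Z,Y).

Slide 43: Hypothesis search in the version space
Grandparent(X,Y) :- ?
- Covering set
- Turning newly added variables into constants
- Hypothesis selection by the minimum description length principle
- FOIL: Quinlan (1990), entropy-based best-first search
- Progol: Muggleton (1995), shrinks the search space by inverse entailment
Figure labels: adopted hypotheses, rejected hypotheses, positive examples, negative examples.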
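The grandparent example and the adopted/rejected distinction can be illustrated by testing candidate clauses against the examples. This is only a coverage check, not FOIL or Progol. The fact list on the slide is incomplete in this extract, so `parent(2,4)` and `parent(3,5)` are added here purely as assumptions, and the "too general" clause is a hypothetical rejected hypothesis chosen for contrast.

```python
# Background knowledge. parent(1,2) and parent(1,3) come from the slide;
# parent(2,4) and parent(3,5) are assumptions added for this sketch.
PARENT = {(1, 2), (1, 3), (2, 4), (3, 5)}

# Positive examples from the slide; every other pair is treated as negative.
POSITIVE = {(1, 4), (1, 5)}
people = {p for pair in PARENT for p in pair}
NEGATIVE = {(x, y) for x in people for y in people} - POSITIVE


def learned_clause(parent_facts):
    """grandparent(X,Y) :- parent(X,Z), parent(Z,Y)."""
    return {
        (x, y)
        for (x, z1) in parent_facts
        for (z2, y) in parent_facts
        if z1 == z2
    }


def too_general_clause(parent_facts):
    """Hypothetical rejected hypothesis: grandparent(X,Y) :- parent(X,Y)."""
    return set(parent_facts)


def coverage(derived):
    """Return (covered positives, covered negatives) for a set of derived pairs."""
    return derived & POSITIVE, derived & NEGATIVE


adopted_pos, adopted_neg = coverage(learned_clause(PARENT))
rejected_pos, rejected_neg = coverage(too_general_clause(PARENT))
print("learned clause:", sorted(adopted_pos), sorted(adopted_neg))
print("too general clause:", sorted(rejected_pos), sorted(rejected_neg))
```

A clause is a candidate for adoption when it covers the positive examples and none of the negative ones; the too-general clause fails on both counts, which is the kind of hypothesis the search discards.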
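Slide 44: Identification of mutagenic substances with Progol
- 230 compounds: Ames test positive 138 / negative 92, Debnath et al.: J. Med. Chem. 34:786-797 (1991).
- [...] compounds: multiple regression analysis carried out
- Progol: analysed as two subsets, 188 (12 hr) / 42 (6 hr)
- atm(compound, atom, element, type, charge). bond(compound, atom1, atom2, bondtype).
- 9 rules; similar classification accuracy
- Automatic discovery of indicator variables
Phenan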