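Iris Data Discriminant Analysis

1. Problem Statement

The Iris data published by R. A. Fisher in 1936 record the sepal length and width and the petal length and width of a plant. x1: sepal length, x2: sepal width, x3: petal length, x4: petal width. The samples come from 3 species G1, G2, G3, with 50 samples per species and 150 samples in total. The data are shown in the table below.

| No. | Class | x1 | x2 | x3 | x4 |
|---|---|---|---|---|---|
| 1 | 1 | 60 | 33 | 14 | 2 |
| 2 | 3 | 64 | 28 | 56 | 22 |
| 3 | 2 | 65 | 28 | 46 | 15 |
| 4 | 3 | 67 | 31 | 56 | 24 |
| 5 | 3 | 63 | 28 | 51 | 15 |
| 6 | 1 | 46 | 34 | 14 | 3 |
| 7 | 3 | 69 | 31 | 51 | 23 |
| 8 | 2 | 62 | 22 | 45 | 15 |
| 9 | 2 | 59 | 32 | 48 | 18 |
| 10 | 1 | 46 | 36 | 10 | 2 |
| 11 | 2 | 61 | 30 | 46 | 14 |
| 12 | 2 | 60 | 27 | 51 | 16 |
| 13 | 3 | 65 | 30 | 52 | 20 |
| 14 | 2 | 56 | 25 | 39 | 11 |
| 15 | 3 | 65 | 30 | 55 | 18 |
| 16 | 3 | 58 | 27 | 51 | 19 |
| 17 | 3 | 68 | 32 | 59 | 23 |
| 18 | 1 | 51 | 33 | 17 | 5 |
| 19 | 2 | 57 | 28 | 45 | 13 |
| 20 | 3 | 62 | 34 | 54 | 23 |
| 21 | 3 | 77 | 38 | 67 | 22 |
| 22 | 2 | 63 | 33 | 47 | 16 |
| 23 | 3 | 67 | 33 | 57 | 25 |
| 24 | 3 | 76 | 30 | 66 | 21 |
| 25 | 3 | 49 | 25 | 45 | 17 |
| 26 | 1 | 55 | 35 | 13 | 2 |
| 27 | 3 | 67 | 30 | 52 | 23 |
| 28 | 2 | 70 | 32 | 47 | 14 |
| 29 | 2 | 64 | 32 | 45 | 15 |
| 30 | 2 | 61 | 28 | 40 | 13 |
| 31 | 1 | 48 | 31 | 16 | 2 |
| 32 | 3 | 59 | 30 | 51 | 18 |
| 33 | 2 | 55 | 24 | 38 | 11 |
| 34 | 3 | 63 | 25 | 50 | 19 |
| 35 | 3 | 64 | 32 | 53 | 23 |
| 36 | 1 | 52 | 34 | 14 | 2 |
| 37 | 1 | 49 | 36 | 14 | 1 |
| 38 | 2 | 54 | 30 | 45 | 15 |
| 39 | 3 | 79 | 38 | 64 | 20 |
| 40 | 1 | 44 | 32 | 13 | 2 |
| 41 | 3 | 67 | 33 | 57 | 21 |
| 42 | 1 | 50 | 35 | 16 | 6 |
| 43 | 2 | 58 | 26 | 40 | 12 |
| 44 | 1 | 44 | 30 | 13 | 2 |
| 45 | 3 | 77 | 28 | 67 | 20 |
| 46 | 3 | 63 | 27 | 49 | 18 |
| 47 | 1 | 47 | 32 | 16 | 2 |
| 48 | 2 | 55 | 26 | 44 | 12 |
| 49 | 2 | 50 | 23 | 33 | 10 |
| 50 | 3 | 72 | 32 | 60 | 28 |
| 51 | 1 | 48 | 30 | 14 | 3 |
| 52 | 1 | 51 | 38 | 16 | 2 |
| 53 | 3 | 61 | 30 | 49 | 18 |
| 54 | 1 | 48 | 34 | 19 | 2 |
| 55 | 1 | 50 | 30 | 16 | 2 |
| 56 | 1 | 50 | 32 | 12 | 2 |
| 57 | 3 | 61 | 26 | 56 | 14 |
| 58 | 3 | 64 | 28 | 56 | 21 |
| 59 | 1 | 43 | 30 | 11 | 1 |
| 60 | 1 | 58 | 40 | 12 | 2 |
| 61 | 1 | 51 | 38 | 19 | 4 |
| 62 | 2 | 67 | 31 | 44 | 14 |
| 63 | 3 | 62 | 28 | 48 | 18 |
| 64 | 1 | 49 | 30 | 14 | 2 |
| 65 | 1 | 51 | 35 | 14 | 2 |
| 66 | 2 | 56 | 30 | 45 | 15 |
| 67 | 2 | 58 | 27 | 41 | 10 |
| 68 | 1 | 50 | 34 | 16 | 4 |
| 69 | 1 | 46 | 32 | 14 | 2 |
| 70 | 2 | 60 | 29 | 45 | 15 |
| 71 | 2 | 57 | 26 | 35 | 10 |
| 72 | 1 | 57 | 44 | 15 | 4 |
| 73 | 1 | 50 | 36 | 14 | 2 |
| 74 | 3 | 77 | 30 | 61 | 23 |
| 75 | 3 | 63 | 34 | 56 | 24 |
| 76 | 3 | 58 | 27 | 51 | 19 |
| 77 | 2 | 57 | 19 | 42 | 13 |
| 78 | 3 | 72 | 30 | 58 | 16 |
| 79 | 1 | 54 | 34 | 15 | 4 |
| 80 | 1 | 52 | 42 | 15 | 1 |
| 81 | 3 | 71 | 30 | 59 | 21 |
| 82 | 3 | 64 | 31 | 55 | 18 |
| 83 | 3 | 60 | 30 | 48 | 18 |
| 84 | 3 | 63 | 29 | 56 | 18 |
| 85 | 2 | 49 | 24 | 33 | 10 |
| 86 | 2 | 56 | 27 | 42 | 13 |
| 87 | 2 | 57 | 30 | 42 | 12 |
| 88 | 1 | 55 | 42 | 14 | 2 |
| 89 | 1 | 49 | 31 | 15 | 2 |
| 90 | 3 | 77 | 26 | 69 | 23 |
| 91 | 3 | 60 | 22 | 50 | 15 |
| 92 | 1 | 54 | 39 | 17 | 4 |
| 93 | 2 | 66 | 29 | 46 | 13 |
| 94 | 2 | 52 | 27 | 39 | 14 |
| 95 | 2 | 60 | 34 | 45 | 16 |
| 96 | 1 | 50 | 34 | 15 | 2 |
| 97 | 1 | 44 | 19 | 14 | 2 |
| 98 | 2 | 50 | 20 | 35 | 10 |
| 99 | 2 | 55 | 24 | 37 | 10 |
| 100 | 2 | 58 | 27 | 39 | 12 |
| 101 | 1 | 47 | 32 | 13 | 2 |
| 102 | 1 | 46 | 31 | 15 | 2 |
| 103 | 3 | 69 | 32 | 57 | 23 |
| 104 | 2 | 62 | 29 | 43 | 13 |
| 105 | 3 | 74 | 28 | 61 | 19 |
| 106 | 2 | 59 | 30 | 42 | 15 |
| 107 | 1 | 51 | 34 | 15 | 2 |
| 108 | 1 | 50 | 35 | 13 | 3 |
| 109 | 3 | 56 | 28 | 49 | 20 |
| 110 | 2 | 60 | 22 | 40 | 10 |
| 111 | 3 | 73 | 29 | 63 | 18 |
| 112 | 3 | 67 | 25 | 58 | 18 |
| 113 | 1 | 49 | 31 | 15 | 1 |
| 114 | 2 | 67 | 31 | 47 | 15 |
| 115 | 2 | 63 | 23 | 44 | 13 |
| 116 | 1 | 54 | 37 | 15 | 2 |
| 117 | 2 | 56 | 30 | 41 | 13 |
| 118 | 2 | 63 | 25 | 49 | 15 |
| 119 | 2 | 61 | 28 | 47 | 12 |
| 120 | 2 | 64 | 29 | 43 | 13 |
| 121 | 2 | 51 | 25 | 30 | 11 |
| 122 | 2 | 57 | 28 | 41 | 13 |
| 123 | 3 | 65 | 30 | 58 | 22 |
| 124 | 3 | 69 | 31 | 54 | 21 |
| 125 | 1 | 54 | 39 | 13 | 4 |
| 126 | 1 | 51 | 35 | 14 | 3 |
| 127 | 3 | 72 | 36 | 61 | 25 |
| 128 | 3 | 65 | 32 | 51 | 20 |
| 129 | 2 | 61 | 29 | 47 | 14 |
| 130 | 2 | 56 | 29 | 36 | 13 |
| 131 | 2 | 69 | 31 | 49 | 15 |
| 132 | 3 | 64 | 27 | 53 | 19 |
| 133 | 3 | 68 | 30 | 55 | 21 |
| 134 | 2 | 55 | 25 | 40 | 13 |
| 135 | 1 | 48 | 34 | 16 | 2 |
| 136 | 1 | 48 | 30 | 14 | 1 |
| 137 | 1 | 45 | 23 | 13 | 3 |
| 138 | 3 | 57 | 25 | 50 | 20 |
| 139 | 1 | 57 | 38 | 17 | 3 |
| 140 | 1 | 51 | 38 | 15 | 3 |
| 141 | 2 | 55 | 23 | 40 | 13 |
| 142 | 2 | 66 | 30 | 44 | 14 |
| 143 | 2 | 68 | 28 | 48 | 14 |
| 144 | 1 | 54 | 34 | 17 | 2 |
| 145 | 1 | 51 | 37 | 15 | 4 |
| 146 | 1 | 52 | 35 | 15 | 2 |
| 147 | 3 | 58 | 28 | 51 | 24 |
| 148 | 2 | 67 | 30 | 50 | 17 |
| 149 | 3 | 63 | 33 | 60 | 25 |
| 150 | 1 | 53 | 37 | 15 | 2 |

(1) Carry out Bayes discrimination, and verify the classification results by resubstitution and by cross-validation;
(2) Compute the posterior probability that each sample belongs to each class;
(3) Carry out stepwise discrimination, and verify the classification results by resubstitution and by cross-validation.

2. Discriminant Analysis

Using the distance discriminant method, assume that the covariance matrices of the populations G1, G2, G3 are equal, $\Sigma_1 = \Sigma_2 = \Sigma_3 = \Sigma$. Compute the matrix formed by the squared Mahalanobis distances $d^2(G_i, G_j)$ between the populations, where

$$d_{ij}^2 = d^2(G_i, G_j) = (\bar{x}^{(i)} - \bar{x}^{(j)})^T S^{-1} (\bar{x}^{(i)} - \bar{x}^{(j)})$$

The linear discriminant functions are

$$W_1(x) = 2.364x_1 + 1.834x_2 - 1.524x_3 - 1.521x_4 - 78.767$$
$$W_2(x) = 1.510x_1 + 0.558x_2 + 0.665x_3 + 0.419x_4 - 70.541$$
$$W_3(x) = 1.167x_1 + 0.320x_2 + 1.417x_3 + 1.747x_4 - 101.501$$

2.1 Bayes Discrimination

Assume $\Sigma_1 = \Sigma_2 = \Sigma_3 = \Sigma$. The prior probabilities are allocated in proportion to the group sizes, that is, $p_1 = p_2 = p_3 = 50/150 = 1/3$. The coefficients of the variables x1 to x4 and the constant terms in the resulting linear discriminant functions $W_1(x)$, $W_2(x)$, $W_3(x)$ are all identical to the results above.

Generalized squared distance function:

$$d_j^2(x) = (x - \bar{x}^{(j)})^T S_j^{-1} (x - \bar{x}^{(j)}) - 2\ln p_j, \quad j = 1, 2, 3$$

Posterior probability:

$$P(G_j \mid x) = \frac{\exp(-0.5\, d_j^2(x))}{\sum_{i=1}^{3} \exp(-0.5\, d_i^2(x))}, \quad j = 1, 2, 3$$

The SPSS discriminant analysis output follows.

Analysis Case Processing Summary

| Unweighted cases | N | Percent |
|---|---|---|
| Valid | 150 | 100.0 |
| Excluded: missing or out-of-range group codes | 0 | .0 |
| Excluded: at least one missing discriminating variable | 0 | .0 |
| Excluded: both missing or out-of-range group codes and at least one missing discriminating variable | 0 | .0 |
| Excluded: total | 0 | .0 |
| Total | 150 | 100.0 |

Group Statistics

| Class | Variable | Mean | Std. deviation | Valid N (listwise), unweighted | Valid N (listwise), weighted |
|---|---|---|---|---|---|
| 1 | x1 | 50.26 | 3.795 | 50 | 50.000 |
| 1 | x2 | 34.10 | 4.339 | 50 | 50.000 |
| 1 | x3 | 14.62 | 1.737 | 50 | 50.000 |
| 1 | x4 | 2.46 | 1.054 | 50 | 50.000 |
| 2 | x1 | 59.36 | 5.162 | 50 | 50.000 |
| 2 | x2 | 27.50 | 3.364 | 50 | 50.000 |
| 2 | x3 | 42.60 | 4.699 | 50 | 50.000 |
| 2 | x4 | 13.26 | 1.978 | 50 | 50.000 |
| 3 | x1 | 65.88 | 6.359 | 50 | 50.000 |
| 3 | x2 | 29.74 | 3.225 | 50 | 50.000 |
| 3 | x3 | 55.52 | 5.519 | 50 | 50.000 |
| 3 | x4 | 20.46 | 2.936 | 50 | 50.000 |
| Total | x1 | 58.50 | 8.253 | 150 | 150.000 |
| Total | x2 | 30.45 | 4.571 | 150 | 150.000 |
| Total | x3 | 37.58 | 17.653 | 150 | 150.000 |
| Total | x4 | 12.06 | 7.718 | 150 | 150.000 |

Tests of Equality of Group Means

| Variable | Wilks' Lambda | F | df1 | df2 | Sig. |
|---|---|---|---|---|---|
| x1 | .393 | 113.314 | 2 | 147 | .000 |
| x2 | .638 | 41.676 | 2 | 147 | .000 |
| x3 | .059 | 1180.161 | 2 | 147 | .000 |
| x4 | .075 | 902.504 | 2 | 147 | .000 |

Pooled Within-Groups Matrices (a)

| | | x1 | x2 | x3 | x4 |
|---|---|---|---|---|---|
| Covariance | x1 | 27.159 | 9.783 | 16.709 | 4.225 |
| Covariance | x2 | 9.783 | 13.514 | 5.610 | 3.464 |
| Covariance | x3 | 16.709 | 5.610 | 18.519 | 4.571 |
| Covariance | x4 | 4.225 | 3.464 | 4.571 | 4.547 |
| Correlation | x1 | 1.000 | .511 | .745 | .380 |
| Correlation | x2 | .511 | 1.000 | .355 | .442 |
| Correlation | x3 | .745 | .355 | 1.000 | .498 |
| Correlation | x4 | .380 | .442 | .498 | 1.000 |

a. The covariance matrix has 147 degrees of freedom.

Covariance Matrices (a)

| Class | | x1 | x2 | x3 | x4 |
|---|---|---|---|---|---|
| 1 | x1 | 14.400 | 10.973 | 1.509 | .939 |
| 1 | x2 | 10.973 | 18.827 | 1.304 | .994 |
| 1 | x3 | 1.509 | 1.304 | 3.016 | .607 |
| 1 | x4 | .939 | .994 | .607 | 1.111 |
| 2 | x1 | 26.643 | 9.000 | 18.290 | 5.578 |
| 2 | x2 | 9.000 | 11.316 | 8.388 | 4.173 |
| 2 | x3 | 18.290 | 8.388 | 22.082 | 7.310 |
| 2 | x4 | 5.578 | 4.173 | 7.310 | 3.911 |
| 3 | x1 | 40.434 | 9.376 | 30.329 | 6.158 |
| 3 | x2 | 9.376 | 10.400 | 7.138 | 5.224 |
| 3 | x3 | 30.329 | 7.138 | 30.459 | 5.797 |
| 3 | x4 | 6.158 | 5.224 | 5.797 | 8.621 |
| Total | x1 | 68.104 | -3.050 | 125.849 | 51.862 |
| Total | x2 | -3.050 | 20.893 | -31.831 | -11.530 |
| Total | x3 | 125.849 | -31.831 | 311.628 | 131.066 |
| Total | x4 | 51.862 | -11.530 | 131.066 | 59.574 |

a. The total covariance matrix has 149 degrees of freedom.

Variables Entered/Removed (a, b, c, d)

| Step | Entered | Wilks' Lambda | df1 | df2 | df3 | Exact F | F df1 | F df2 | Sig. |
|---|---|---|---|---|---|---|---|---|---|
| 1 | x3 | .059 | 1 | 2 | 147.000 | 1180.161 | 2 | 147.000 | .000 |
| 2 | x2 | .039 | 2 | 2 | 147.000 | 297.900 | 4 | 292.000 | .000 |
| 3 | x4 | .027 | 3 | 2 | 147.000 | 243.502 | 6 | 290.000 | .000 |
| 4 | x1 | .025 | 4 | 2 | 147.000 | 191.133 | 8 | 288.000 | .000 |

At each step, the variable that minimizes the overall Wilks' Lambda is entered.
a. The maximum number of steps is 8.
b. The minimum partial F to enter is 3.84.
c. The maximum partial F to remove is 2.71.
d. F level, tolerance, or VIN insufficient for further computation.

Variables in the Analysis

| Step | Variable | Tolerance | F to remove | Wilks' Lambda |
|---|---|---|---|---|
| 1 | x3 | 1.000 | 1180.161 | |
| 2 | x3 | .874 | 1129.588 | .638 |
| 2 | x2 | .874 | 37.484 | .059 |
| 3 | x3 | .729 | 41.949 | .043 |
| 3 | x2 | .781 | 44.975 | .044 |
| 3 | x4 | .671 | 29.889 | .039 |
| 4 | x3 | .379 | 44.010 | .040 |
| 4 | x2 | .648 | 17.172 | .031 |
| 4 | x4 | .660 | 22.391 | .033 |
| 4 | x1 | .369 | 6.615 | .027 |

Variables Not in the Analysis

| Step | Variable | Tolerance | Min. tolerance | F to enter | Wilks' Lambda |
|---|---|---|---|---|---|
| 0 | x1 | 1.000 | 1.000 | 113.314 | .393 |
| 0 | x2 | 1.000 | 1.000 | 41.676 | .638 |
| 0 | x3 | 1.000 | 1.000 | 1180.161 | .059 |
| 0 | x4 | 1.000 | 1.000 | 902.504 | .075 |
| 1 | x1 | .445 | .445 | 32.824 | .040 |
| 1 | x2 | .874 | .874 | 37.484 | .039 |
| 1 | x4 | .752 | .752 | 23.296 | .044 |
| 2 | x1 | .375 | .375 | 12.776 | .033 |
| 2 | x4 | .671 | .671 | 29.889 | .027 |
| 3 | x1 | .369 | .369 | 6.615 | .025 |

Wilks' Lambda

| Step | Number of variables | Lambda | df1 | df2 | df3 | Exact F | F df1 | F df2 | Sig. |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | .059 | 1 | 2 | 147 | 1180.161 | 2 | 147.000 | .000 |
| 2 | 2 | .039 | 2 | 2 | 147 | 297.900 | 4 | 292.000 | .000 |
| 3 | 3 | .027 | 3 | 2 | 147 | 243.502 | 6 | 290.000 | .000 |
| 4 | 4 | .025 | 4 | 2 | 147 | 191.133 | 8 | 288.000 | .000 |

Classification Processing Summary

| | N |
|---|---|
| Processed | 150 |
| Excluded: missing or out-of-range group codes | 0 |
| Excluded: at least one missing discriminating variable | 0 |
| Used in output | 150 |

Prior Probabilities for Groups

| Class | Prior | Cases used in analysis, unweighted | Cases used in analysis, weighted |
|---|---|---|---|
| 1 | .333 | 50 | 50.000 |
| 2 | .333 | 50 | 50.000 |
| 3 | .333 | 50 | 50.000 |
| Total | 1.000 | 150 | 150.000 |

The results of Bayes discrimination (resubstitution) are shown in the table below.

Classification Results (a)

| | Class | Predicted 1 | Predicted 2 | Predicted 3 | Total |
|---|---|---|---|---|---|
| Original, count | 1 | 50 | 0 | 0 | 50 |
| Original, count | 2 | 0 | 50 | 0 | 50 |
| Original, count | 3 | 0 | 0 | 50 | 50 |
| Original, % | 1 | 100.0 | .0 | .0 | 100.0 |
| Original, % | 2 | .0 | 100.0 | .0 | 100.0 |
| Original, % | 3 | .0 | .0 | 100.0 | 100.0 |

a. 100.0% of the original grouped cases were correctly classified.

The table below gives the results of Bayes discrimination (cross-validation).

Classification Function Coefficients

| | Class 1 | Class 2 | Class 3 |
|---|---|---|---|
| x1 | 2.364 | 1.510 | 1.167 |
| x2 | 1.834 | .558 | .320 |
| x3 | -1.524 | .665 | 1.417 |
| x4 | -1.521 | .419 | 1.747 |
| (Constant) | -78.767 | -70.541 | -101.501 |

Fisher's linear discriminant functions

Classification Results (a)

| | Class | Predicted 1 | Predicted 2 | Predicted 3 | Total |
|---|---|---|---|---|---|
| Original, count | 1 | 50 | 0 | 0 | 50 |
| Original, count | 2 | 0 | 48 | 2 | 50 |
| Original, count | 3 | 0 | 1 | 49 | 50 |
| Original, % | 1 | 100.0 | .0 | .0 | 100.0 |
| Original, % | 2 | .0 | 96.0 | 4.0 | 100.0 |
| Original, % | 3 | .0 | 2.0 | 98.0 | 100.0 |

a. 98.0% of the original grouped cases were correctly classified.

2.2 Stepwise Discrimination

The main computational steps of stepwise discrimination are as follows.

Step 1: Input the original data matrix, where $x_{kij}$ is the value of variable $j$ for sample $i$ of group $k$:

$$X = \begin{pmatrix}
x_{111} & x_{112} & \cdots & x_{11m} \\
x_{121} & x_{122} & \cdots & x_{12m} \\
\vdots & \vdots & & \vdots \\
x_{1n_11} & x_{1n_12} & \cdots & x_{1n_1m} \\
\vdots & \vdots & & \vdots \\
x_{g11} & x_{g12} & \cdots & x_{g1m} \\
x_{g21} & x_{g22} & \cdots & x_{g2m} \\
\vdots & \vdots & & \vdots \\
x_{gn_g1} & x_{gn_g2} & \cdots & x_{gn_gm}
\end{pmatrix}$$

Step 2: Compute the overall means, the group means, the total deviation matrix, and the within-group deviation matrix of the variables:

$$\bar{X}_k = (\bar{x}_{k1}, \bar{x}_{k2}, \dots, \bar{x}_{km}), \quad k = 1, 2, \dots, g$$
$$\bar{X} = (\bar{x}_{\cdot 1}, \bar{x}_{\cdot 2}, \dots, \bar{x}_{\cdot m})$$
$$W = (w_{jl})_{m \times m}, \qquad T = (t_{jl})_{m \times m}$$

Step 3: Specify the F-test threshold values (critical values) $F_1$, $F_2$ used to select variables.

Step 4: Select variables stepwise. The idea is the same as in stepwise regression. Suppose the iteration has run for $S$ steps and $r$ variables have been introduced; the indices of these $r$ variables form the set $I_r$, and the indices of the remaining $m - r$ variables form the set $I_{m-r}$.

Step 5: Obtain the discriminant functions. Suppose variable selection ends after $h$ iteration steps, with $r$ variables selected into the discriminant:

$$F_k(X) = \ln q_k + C_{0k} + \sum_{j \in I_r} C_{jk} x_j, \quad k = 1, 2, \dots, g$$

where $q_k$ is the prior probability of the $k$-th population. The discriminant coefficients are computed as

$$C_{jk} = (n - g) \sum_{i \in I_r} w_{ij}^{(h)} \bar{x}_{ki}, \quad k = 1, 2, \dots, g$$
$$C_{0k} = -\frac{1}{2} \sum_{j \in I_r} C_{jk} \bar{x}_{kj}, \quad k = 1, 2, \dots, g$$

where $\bar{x}_{ki}$ is the mean of the $i$-th variable in the $k$-th population and $w_{ij}^{(h)}$ denotes the $(i, j)$ element of $W$ after the $h$ iteration steps.

Step 6: Classify. Reclassify the known samples and compute the misclassification probability, then classify the samples to be assigned. The results are shown in the table below.

Casewise Statistics (original cases)

Column legend: p and df refer to P(D>d | G=g) for the highest group; Post. P is P(G=g | D=d); D² is the squared Mahalanobis distance to the group centroid; the "2nd" columns give the same quantities for the second highest group; Function 1 and Function 2 are the discriminant scores. An asterisk marks a predicted group that differs from the actual group.

| Case | Actual group | Predicted group | p | df | Post. P | D² | 2nd group | 2nd Post. P | 2nd D² | Function 1 | Function 2 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | .583 | 2 | 1.000 | 1.078 | 2 | .000 | 102.251 | -8.352 | .071 |
| 2 | 3 | 3 | .680 | 2 | 1.000 | .771 | 2 | .000 | 24.204 | 6.471 | .577 |
| 3 | 2 | 2 | .782 | 2 | .996 | .491 | 3 | .004 | 11.369 | 2.354 | -.416 |
| 4 | 3 | 3 | .345 | 2 | 1.000 | 2.129 | 2 | .000 | 27.387 | 6.320 | 1.779 |
| 5 | 3 | 2* | .141 | 2 | .730 | 3.922 | 3 | .270 | 5.911 | 3.691 | -.998 |
| 6 | 1 | 1 | .912 | 2 | 1.000 | .184 | 2 | .000 | 76.125 | -6.926 | .377 |
| 7 | 3 | 3 | .209 | 2 | .999 | 3.127 | 2 | .001 | 16.839 | 4.737 | 2.059 |
| 8 | 2 | 2 | .287 | 2 | .977 | 2.500 | 3 | .023 | 9.963 | 3.132 | -1.460 |
| 9 | 2 | 3* | .131 | 2 | .760 | 4.063 | 2 | .240 | 6.371 | 3.625 | .935 |
| 10 | 1 | 1 | .478 | 2 | 1.000 | 1.474 | 2 | .000 | 103.912 | -8.335 | .891 |
| 11 | 2 | 2 | .832 | 2 | .997 | .369 | 3 | .003 | 12.111 | 2.237 | -.399 |
| 12 | 2 | 3* | .162 | 2 | .832 | 3.638 | 2 | .168 | 6.841 | 4.337 | -.921 |
| 13 | 3 | 3 | .655 | 2 | .995 | .846 | 2 | .005 | 11.315 | 4.722 | .802 |
| 14 | 2 | 2 | .544 | 2 | 1.000 | 1.219 | 3 | .000 | 25.639 | .960 | -1.524 |
| 15 | 3 | 3 | .645 | 2 | .992 | .877 | 2 | .008 | 10.544 | 4.921 | -.137 |
| 16 | 3 | 3 | .812 | 2 | .998 | .416 | 2 | .002 | 12.959 | 5.261 | -.039 |
| 17 | 3 | 3 | .449 | 2 | 1.000 | 1.599 | 2 | .000 | 27.548 | 6.550 | 1.342 |
| 18 | 1 | 1 | .443 | 2 | 1.000 | 1.627 | 2 | .000 | 62.661 | -6.086 | .528 |
| 19 | 2 | 2 | .779 | 2 | .998 | .499 | 3 | .002 | 12.702 | 2.375 | -1.015 |
| 20 | 3 | 3 | .243 | 2 | 1.000 | 2.833 | 2 | .000 | 24.430 | 5.714 | 2.192 |
| 21 | 3 | 3 | .421 | 2 | 1.000 | 1.728 | 2 | .000 | 28.011 | 6.580 | 1.384 |
| 22 | 2 | 2 | .293 | 2 | .979 | 2.452 | 3 | .021 | 10.181 | 2.409 | .679 |
| 23 | 3 | 3 | .118 | 2 | 1.000 | 4.267 | 2 | .000 | 32.730 | 6.553 | 2.343 |
| 24 | 3 | 3 | .207 | 2 | 1.000 | 3.147 | 2 | .000 | 29.768 | 7.168 | -.308 |
| 25 | 3 | 3 | .328 | 2 | .935 | 2.232 | 2 | .065 | 7.573 | 4.468 | -.467 |
| 26 | 1 | 1 | .564 | 2 | 1.000 | 1.146 | 2 | .000 | 103.265 | -8.360 | .489 |
| 27 | 3 | 3 | .450 | 2 | 1.000 | 1.598 | 2 | .000 | 18.663 | 5.275 | 1.735 |
| 28 | 2 | 2 | .733 | 2 | 1.000 | .621 | 3 | .000 | 17.990 | 1.388 | -.021 |
| 29 | 2 | 2 | .512 | 2 | .999 | 1.339 | 3 | .001 | 14.940 | 1.731 | .426 |
| 30 | 2 | 2 | .647 | 2 | 1.000 | .872 | 3 | .000 | 23.337 | .853 | -.407 |
| 31 | 1 | 1 | .488 | 2 | 1.000 | 1.433 | 2 | .000 | 68.257 | -6.533 | -.680 |
| 32 | 3 | 3 | .561 | 2 | .980 | 1.157 | 2 | .020 | 8.928 | 4.558 | .229 |
| 33 | 2 | 2 | .499 | 2 | 1.000 | 1.390 | 3 | .000 | 26.151 | .944 | -1.612 |
| 34 | 3 | 3 | .555 | 2 | .986 | 1.177 | 2 | .014 | 9.738 | 4.809 | -.235 |
| 35 | 3 | 3 | .352 | 2 | 1.000 | 2.089 | 2 | .000 | 21.763 | 5.541 | 1.957 |
| 36 | 1 | 1 | .919 | 2 | 1.000 | .169 | 2 | .000 | 90.235 | -7.729 | .153 |
| 37 | 1 | 1 | .827 | 2 | 1.000 | .380 | 2 | .000 | 94.309 | -7.940 | .183 |
| 38 | 2 | 2 | .405 | 2 | .947 | 1.809 | 3 | .053 | 7.595 | 2.904 | -.077 |
| 39 | 3 | 3 | .764 | 2 | .999 | .539 | 2 | .001 | 15.627 | 5.211 | 1.138 |
| 40 | 1 | 1 | .857 | 2 | 1.000 | .309 | 2 | .000 | 75.989 | -6.973 | -.212 |
| 41 | 3 | 3 | .825 | 2 | 1.000 | .385 | 2 | .000 | 18.224 | 5.569 | 1.132 |
| 42 | 1 | 1 | .303 | 2 | 1.000 | 2.388 | 2 | .000 | 67.317 | -6.220 | 1.301 |
| 43 | 2 | 2 | .774 | 2 | 1.000 | .511 | 3 | .000 | 22.620 | 1.129 | -1.121 |
| 44 | 1 | 1 | .607 | 2 | 1.000 | .998 | 2 | .000 | 71.573 | -6.730 | -.584 |
| 45 | 3 | 3 | .065 | 2 | 1.000 | 5.452 | 2 | .000 | 31.187 | 7.302 | -1.081 |
| 46 | 3 | 3 | .274 | 2 | .849 | 2.587 | 2 | .151 | 6.036 | 4.090 | -.054 |
| 47 | 1 | 1 | .575 | 2 | 1.000 | 1.108 | 2 | .000 | 68.776 | -6.562 | -.507 |
| 48 | 2 | 2 | .571 | 2 | .999 | 1.121 | 3 | .001 | 15.158 | 2.328 | -1.604 |
| 49 | 2 | 2 | .191 | 2 | 1.000 | 3.316 | 3 | .000 | 34.340 | .133 | -1.610 |
| 50 | 3 | 3 | .009 | 2 | 1.000 | 9.390 | 2 | .000 | 47.377 | 7.639 | 2.796 |
| 51 | 1 | 1 | .670 | 2 | 1.000 | .801 | 2 | .000 | 69.948 | -6.626 | -.341 |
| 52 | 1 | 1 | .856 | 2 | 1.000 | .310 | 2 | .000 | 90.123 | -7.662 | .661 |
| 53 | 3 | 3 | .242 | 2 | .845 | 2.837 | 2 | .155 | 6.229 | 3.912 | .478 |
| 54 | 1 | 1 | .427 | 2 | 1.000 | 1.702 | 2 | .000 | 63.049 | -6.207 | -.455 |
| 55 | 1 | 1 | .439 | 2 | 1.000 | 1.648 | 2 | .000 | 69.334 | -6.598 | -.839 |
| 56 | 1 | 1 | .883 | 2 | 1.000 | .249 | 2 | .000 | 90.552 | -7.761 | -.023 |
| 57 | 3 | 3 | .018 | 2 | .931 | 7.985 | 2 | .069 | 13.181 | 5.024 | -2.255 |
| 58 | 3 | 3 | .797 | 2 | 1.000 | .453 | 2 | .000 | 21.230 | 6.225 | .275 |
| 59 | 1 | 1 | .669 | 2 | 1.000 | .804 | 2 | .000 | 82.308 | -7.344 | -.678 |
| 60 | 1 | 1 | .040 | 2 | 1.000 | 6.455 | 2 | .000 | 130.836 | -9.476 | 1.569 |
| 61 | 1 | 1 | .542 | 2 | 1.000 | 1.223 | 2 | .000 | 70.142 | -6.479 | .933 |
| 62 | 2 | 2 | .586 | 2 | 1.000 | 1.069 | 3 | .000 | 20.412 | 1.098 | .087 |
| 63 | 3 | 3 | .203 | 2 | .746 | 3.192 | 2 | .254 | 5.347 | 3.831 | .230 |
| 64 | 1 | 1 | .654 | 2 | 1.000 | .850 | 2 | .000 | 75.585 | -6.965 | -.630 |
| 65 | 1 | 1 | .905 | 2 | 1.000 | .200 | 2 | .000 | 91.111 | -7.758 | .326 |
| 66 | 2 | 2 | .486 | 2 | .973 | 1.442 | 3 | .027 | 8.600 | 2.718 | -.051 |
| 67 | 2 | 2 | .404 | 2 | 1.000 | 1.812 | 3 | .000 | 28.207 | .745 | -1.651 |
| 68 | 1 | 1 | .733 | 2 | 1.000 | .622 | 2 | .000 | 70.758 | -6.591 | .510 |
| 69 | 1 | 1 | .809 | 2 | 1.000 | .423 | 2 | .000 | 75.140 | -6.929 | -.297 |
| 70 | 2 | 2 | .655 | 2 | .991 | .845 | 3 | .009 | 10.272 | 2.467 | -.185 |
| 71 | 2 | 2 | .090 | 2 | 1.000 | 4.825 | 3 | .000 | 39.077 | -.421 | -1.183 |
| 72 | 1 | 1 | .025 | 2 | 1.000 | 7.392 | 2 | .000 | 119.382 | -8.686 | 2.572 |
| 73 | 1 | 1 | .864 | 2 | 1.000 | .292 | 2 | .000 | 92.048 | -7.786 | .499 |
| 74 | 3 | 3 | .671 | 2 | 1.000 | .799 | 2 | .000 | 24.530 | 6.417 | .866 |
| 75 | 3 | 3 | .159 | 2 | 1.000 | 3.677 | 2 | .000 | 30.250 | 6.327 | 2.285 |
| 76 | 3 | 3 | .812 | 2 | .998 | .416 | 2 | .002 | 12.959 | 5.261 | -.039 |
| 77 | 2 | 2 | .154 | 2 | .998 | 3.738 | 3 | .002 | 16.163 | 2.778 | -2.355 |
| 78 | 3 | 3 | .173 | 2 | .884 | 3.511 | 2 | .116 | 7.571 | 4.468 | -.984 |
| 79 | 1 | 1 | .894 | 2 | 1.000 | .223 | 2 | .000 | 81.567 | -7.193 | .673 |
| 80 | 1 | 1 | .228 | 2 | 1.000 | 2.956 | 2 | .000 | 112.943 | -8.717 | 1.227 |
| 81 | 3 | 3 | .908 | 2 | 1.000 | .193 | 2 | .000 | 19.724 | 6.022 | .404 |
| 82 | 3 | 3 | .697 | 2 | .993 | .722 | 2 | .007 | 10.598 | 4.892 | .036 |
| 83 | 3 | 3 | .190 | 2 | .784 | 3.320 | 2 | .216 | 5.895 | 3.774 | .576 |
| 84 | 3 | 3 | .617 | 2 | .999 | .965 | 2 | .001 | 13.984 | 5.458 | -.460 |
| 85 | 2 | 2 | .209 | 2 | 1.000 | 3.134 | 3 | .000 | 33.948 | .105 | -1.437 |
| 86 | 2 | 2 | .975 | 2 | 1.000 | .051 | 3 | .000 | 15.602 | 1.899 | -.880 |
| 87 | 2 | 2 | .862 | 2 | 1.000 | .298 | 3 | .000 | 20.620 | 1.196 | -.612 |
| 88 | 1 | 1 | .087 | 2 | 1.000 | 4.876 | 2 | .000 | 120.483 | -8.980 | 1.680 |
| 89 | 1 | 1 | .664 | 2 | 1.000 | .819 | 2 | .000 | 73.727 | -6.856 | -.555 |
| 90 | 3 | 3 | .003 | 2 | 1.000 | 11.548 | 2 | .000 | 49.211 | 8.743 | -.767 |
| 91 | 3 | 3 | .020 | 2 | .671 | 7.798 | 2 | .329 | 9.223 | 4.468 | -2.042 |
| 92 | 1 | 1 | .509 | 2 | 1.000 | 1.349 | 2 | .000 | 86.699 | -7.340 | 1.381 |
| 93 | 2 | 2 | .993 | 2 | 1.000 | .015 | 3 | .000 | 17.368 | 1.648 | -.823 |
| 94 | 2 | 2 | .905 | 2 | .999 | .199 | 3 | .00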
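
The classification function coefficients reported in Section 2.1 can be applied directly to a sample. The sketch below is only an illustration, not the SPSS procedure itself: it evaluates the three linear functions, assigns the class with the largest score, and turns the scores into posterior probabilities with a softmax. The softmax step assumes a common covariance matrix and equal priors, so that the quadratic term of the generalized squared distance cancels between classes. The names `COEFFS`, `scores`, `posteriors`, and `classify` are hypothetical helpers introduced here.

```python
import math

# Classification function coefficients copied from the SPSS table in
# Section 2.1, ordered (x1, x2, x3, x4, constant), one entry per class.
COEFFS = {
    1: (2.364, 1.834, -1.524, -1.521, -78.767),
    2: (1.510, 0.558, 0.665, 0.419, -70.541),
    3: (1.167, 0.320, 1.417, 1.747, -101.501),
}


def scores(x):
    """Return the linear discriminant score W_j(x) for each class j."""
    result = {}
    for g, (a1, a2, a3, a4, const) in COEFFS.items():
        result[g] = a1 * x[0] + a2 * x[1] + a3 * x[2] + a4 * x[3] + const
    return result


def posteriors(x):
    """Posterior P(G_j | x) as a softmax of the scores.

    Assumption: common covariance matrix and equal priors, so the
    posterior ratio depends only on the linear scores.
    """
    w = scores(x)
    top = max(w.values())  # subtract the maximum for numerical stability
    expw = {g: math.exp(v - top) for g, v in w.items()}
    total = sum(expw.values())
    return {g: v / total for g, v in expw.items()}


def classify(x):
    """Assign x to the class with the largest discriminant score."""
    w = scores(x)
    return max(w, key=w.get)


if __name__ == "__main__":
    # Samples 1, 2 and 5 from the data table (x1, x2, x3, x4).
    for sample in [(60, 33, 14, 2), (64, 28, 56, 22), (63, 28, 51, 15)]:
        print(sample, classify(sample), posteriors(sample))
```

With these rounded coefficients, samples 1 and 2 are assigned to classes 1 and 3, and sample 5 (actual class 3) is assigned to class 2, which agrees with the asterisked entry for case 5 in the casewise table. Because the coefficients are rounded to three decimals, posterior values may differ slightly from the SPSS output.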