Excellent University Statistics Lecture Notes (statistics.ppt, 89 pages)
Quantitative Data Analysis: Statistics

Sherlock Holmes
"... while man is an insoluble puzzle, in the aggregate he becomes a mathematical certainty. You can, for example, never foretell what any one man will do, but you can say with precision what an average number will be up to. Individuals vary, but percentages remain constant. So says the statistician."

Overview
- General Statistics
- The Normal Distribution
- Z-Tests
- Confidence Intervals
- T-Tests

General Statistics

THE GOLDEN RULE
Statistics NEVER replace the judgment of the expert.

Approach to Statistical Research
1. Formulate a Hypothesis
2. State predictions of the hypothesis
3. Perform experiments
or observations
4. Interpret experiments or observations
5. Evaluate results with respect to hypothesis
6. Refine hypothesis and start again
(Basically the same as all other research)

Hypothesis Testing
H0: Null Hypothesis, status quo
HA: Alternative Hypothesis, research question
So, either: "The data does not support H0" or "We fail to reject H0"

Types of Data
- Continuous: height, age, time
- Discrete: # of days worked this week, # leaves on a tree
- Ordinal: Good, O.K., Bad
- Nominal: Yes/No, Teacher/Chemist/Haberdasher

Picturing The Data

Pie Charts
- Nominal/Ordinal
- Only suitable for data that adds up to 1
- Hard to compare values in the chart

Bar Charts
- Nominal/Ordinal
- Easier to compare values
than pie chart
- Suitable for a wider range of data

Dot Plots
- Nominal/Ordinal
- Represents all the data
- Difficult to read

Box Plots
- Nominal/Ordinal
- 1 IQR, 3 IQR
- Outliers

Scatter Plots
- Excellent for examining association between two variables

Histograms
- Continuous Data
- Divide Data into ranges

Time-Series Plots
- Time related Data
- e.g. Stock Prices

Question 1
In a telephone survey of 68 households, when asked whether they have pets, the responses were:
- 16: No Pets
- 28: Dogs
- 32: Cats
Draw the appropriate graphic to illustrate the results!

Question 1 - Solution
Total number surveyed = 68
Number with no pets = 16
=> Total with pets = (68 - 16) = 52
But total 28 dogs + 32 cats = 60
=> So some people have both cats and dogs
How many? It must be (60 - 52) = 8 people
No pets = 16
Dogs only = 20
Cats only = 24
Both = 8
Total = 68
Graphic: Pie Chart or Bar Chart

The Literary Digest Poll
- 1936 US Presidential Election
- Alf Landon (R) vs. Franklin D. Roosevelt (D)
- Literary Digest had been conducting successful presidential election polls since 1916
- They had correctly predicted the outcomes of the 1916, 1920, 1924, 1928, and 1932 elections by conducting polls.
- These polls were a lucrative venture for the magazine: readers liked them; newspapers played them up; and each "ballot" included a subscription blank.
- They sent out 10 million
ballots to two groups of people:
  - prospective subscribers, "who were chiefly upper- and middle-income people"
  - a list designed to correct for bias from the first list, consisting of names selected from telephone books and motor vehicle registries
- Response rate: approximately 24%, or 2,376,523 responses
- Result: Landon in a landslide (predicted 57% of the vote, Roosevelt predicted 40%)
- Election result: Roosevelt received approximately 60% of the vote

POSSIBLE CAUSES OF ERROR
- Selection Bias: By taking names and addresses from telephone directories, the survey systematically excluded poor voters. Republicans were markedly overrepresented: in 1936, Democrats did not have as many phones, were not as likely to drive cars, and did not read the Literary Digest.
- "Sampling Frame" is the actual population of individuals from which a sample is drawn. Selection bias results when the sampling frame is not representative of the population of interest.
- Non-response Bias: Because only about 24% of the 10 million people returned surveys, non-respondents may have had different preferences from respondents. Indeed, respondents favored Landon. Greater response rates reduce the odds of biased samples.

Terminology
- Population: a set of entities concerning which statistical inferences are to be drawn.
- Sample: a number of independent observations from the same probability distribution.
- Parameter: a quantity that indexes a family of probability distributions. The distribution of a random variable is treated as belonging to such a family, whose members are distinguished from each other by the values of a finite number of parameters.
- Bias: a factor that causes a statistical sample of a population to have some examples of the population less represented than others.

Outliers (and their treatment)
- An outlier is an observation that does not fit the pattern in the rest of the data
- Check the data
- Check with the measurer
- If there is reason to believe it is NOT real, change it if possible, otherwise leave it out (but note).
- If there is reason to believe it is real, leave it in and note.

The Mean
- The Mean (Arithmetic)
- The mean is defined as the sum of all the elements, divided by the number of elements.
- The statistical mean of a set of observations is the average of the measurements in a set of data

The Variance
- But there can be a lot of variance in individual elements, e.g. teacher salaries
  Average = 22,000
  Lowest = 12,000
  Difference = 12,000 - 22,000 = -10,000
- Sum of (Sample - Average) = 0, thus we need to define variance.
- The variance of a set of data is a cumulative measure of the squares of the difference of all the data values from the mean, divided by sample size minus one:
  $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$

Standard Deviation
- The standard deviation of a set of data is the positive square root of the variance:
  $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$

Question 2
- Find the mean
and variance of the following sample values: 36, 41, 43, 44, 46

Question 2
- Mean: (36 + 41 + 43 + 44 + 46) / 5 = 42
- Variance:

  Value - Mean    Difference    Square
  36 - 42         -6            36
  41 - 42         -1             1
  43 - 42          1             1
  44 - 42          2             4
  46 - 42          4            16
  ------------------------------------
  Sum                           58

  58 / (5 - 1) = 58 / 4 = 14.5

The Normal Distribution
Density Curves: Properties
- The graph has a single peak at the center; this peak occurs at the mean
- The graph is symmetrical about the mean
- The graph never touches the horizontal axis
- The area under the graph is equal to 1

Characterization
- A normal distribution is bell-shaped and symmetric.
- The distribution is determined by the mean mu, μ, and the standard deviation sigma, σ.
- The mean mu controls the center and sigma controls the spread.
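Returning to Question 2 above, the arithmetic can be checked with a few lines of Python. This is only a sketch under the definitions given on the slides (sum of squared differences divided by sample size minus one); the variable names are illustrative and not part of the original deck.

```python
import math
import statistics

# Sample values from Question 2
sample = [36, 41, 43, 44, 46]

# Mean: sum of all the elements divided by the number of elements
mean = sum(sample) / len(sample)

# Differences from the mean always sum to zero, which is why they are squared
differences = [x - mean for x in sample]
sum_of_differences = sum(differences)

# Variance: sum of squared differences divided by (sample size - 1)
sum_of_squares = sum(d ** 2 for d in differences)
variance = sum_of_squares / (len(sample) - 1)

# Standard deviation: positive square root of the variance
std_dev = math.sqrt(variance)

print("mean =", mean)                       # 42.0
print("sum of differences =", sum_of_differences)  # 0.0
print("sum of squares =", sum_of_squares)   # 58.0
print("variance =", variance)               # 14.5
print("standard deviation =", round(std_dev, 3))

# Cross-check against the standard library's sample variance (also n - 1)
library_variance = statistics.variance(sample)
```

The standard library's `statistics.variance` uses the same n - 1 denominator, so it should agree with the hand calculation; `statistics.pvariance` divides by n instead and would give a different number.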
The Normal Distribution
- If a variable is normally distributed, then:
  - within one standard deviation of the mean there will be approximately 68% of the data
  - within two standard deviations of the mean there will be approximately 95% of the data
  - within three standard deviations of the mean there will be approximately 99.7% of the data
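The three percentages above can be reproduced numerically. The sketch below relies on a standard identity that is not stated on the slides: for a normal variable, the probability of falling within k standard deviations of the mean equals erf(k / sqrt(2)). The function name `prob_within` is a hypothetical helper chosen for illustration.

```python
import math

def prob_within(k: float) -> float:
    """Probability that a normal variable lies within k standard
    deviations of its mean, computed via the error function."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")
# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviations(s) would print 0.9973 in the same format
```

The result does not depend on the particular mean or standard deviation, which is why the rule can be stated once for every normal distribution.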