Multiple Regression: Estimation and Hypothesis Testing
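As a companion to the estimation material in these slides, here is a minimal sketch of OLS for a multiple regression in plain Python. It solves the k+1 first-order conditions (the normal equations), computes R-squared as 1 - RSS/TSS, and checks the "partialling out" property numerically. The data and the helper names `solve` and `ols` are made up for illustration and are not taken from the slides or from Wooldridge.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(y, xs):
    """OLS of y on a constant plus the regressor columns in xs.

    Returns (coefficients with intercept first, residuals, R-squared).
    """
    n = len(y)
    cols = [[1.0] * n] + [[float(v) for v in x] for x in xs]
    # Normal equations (X'X) b = X'y: one first-order condition per coefficient.
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(n)]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    ybar = sum(y) / n
    tss = sum((yi - ybar) ** 2 for yi in y)
    rss = sum(e ** 2 for e in resid)
    return beta, resid, 1.0 - rss / tss


# Hypothetical data: six observations, two regressors.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [9.1, 7.9, 19.2, 17.8, 29.0, 27.1]

beta, resid, r2 = ols(y, [x1, x2])

# Partialling out: regress x1 on x2, then regress y on those residuals.
_, x1_resid, _ = ols(x1, [x2])
beta_partial, _, _ = ols(y, [x1_resid])

print("coefficients:", beta)
print("R-squared:", r2)
print("slope on x1, full vs. partialled out:", beta[1], beta_partial[1])
```

The two slopes printed on the last line should agree up to rounding error. The residuals also sum to zero and are orthogonal to each regressor, which is exactly what the first-order conditions require.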
1. Parallels with Simple Regression
- $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_k X_{ik} + u_i$
- $\beta_0$ is still the intercept.
- $\beta_1$ to $\beta_k$ are all called slope parameters, or partial regression coefficients: each $\beta_j$ gives the change in $Y$ for a one-unit change in $X_j$, holding all the other independent variables fixed.
- $u$ is still the error term (or disturbance).
- We are still minimizing the sum of squared residuals, so there are $k+1$ first-order conditions.

2. Obtaining OLS Estimates

3. Obtaining OLS Estimates, cont.
- The estimated equation $\hat{Y}_i = \hat\beta_0 + \hat\beta_1 X_{i1} + \dots + \hat\beta_k X_{ik}$ is called the OLS regression line, or the sample regression function (SRF).
- It is an estimate, not the true equation. The true equation is the population regression line, which we do not know and can only estimate. A different sample would therefore give a different estimated line.
- The population regression line is $E(Y \mid X_1, \dots, X_k) = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k$.

4. Interpreting Multiple Regression

5. An Example (Wooldridge, p. 76)
- The determination of the wage (dollars per hour), wage, from:
  - years of education, educ
  - years of labor market experience, exper
  - years with the current employer, tenure
- The relationship between wage and educ, exper, tenure: $wage = \beta_0 + \beta_1\, educ + \beta_2\, exper + \beta_3\, tenure + u$
- The estimated equation: $\widehat{wage} = \hat\beta_0 + \hat\beta_1\, educ + \hat\beta_2\, exper + \hat\beta_3\, tenure$

6. A "Partialling Out" Interpretation

7. A "Partialling Out" Interpretation

8. "Partialling Out" continued
- The previous equation implies that regressing $Y$ on $X_1$ and $X_2$ gives the same effect of $X_1$ as regressing $Y$ on the residuals from a regression of $X_1$ on $X_2$.
- This means that only the part of $X_{i1}$ that is uncorrelated with $X_{i2}$ is being related to $Y_i$, so we are estimating the effect of $X_1$ on $Y$ after $X_2$ has been "partialled out".

9. The wage determination
- The estimated equation: $\widehat{wage} = \hat\beta_0 + \hat\beta_1\, educ + \hat\beta_2\, exper + \hat\beta_3\, tenure$
- Now we first regress educ on exper and tenure to partial out the effects of exper and tenure. Then we regress wage on the residuals from that regression. Do we get the same result?
- Regress educ on exper and tenure and denote the residuals resid; then regress wage on resid.
- We can see that the coefficient on resid is the same
as the coefficient on the variable educ in the first estimated equation. The same holds in the second equation.

10. Goodness-of-Fit: $R^2$

11. Goodness-of-Fit

12. Goodness-of-Fit (continued)
- How well does our sample regression line fit our sample data?
- We can compute the fraction of the total sum of squares (TSS) that is explained by the model, and call this the R-squared of the regression: $R^2 = ESS/TSS = 1 - RSS/TSS$

13. Goodness-of-Fit (continued)

14. More about R-squared
- $R^2$ can never decrease when another independent variable is added to a regression, and usually will increase.
- Because $R^2$ will usually increase with the number of independent variables, it is not a good way to compare models.

15. An Example
- The wage determination model can be used to show that adding a new independent variable increases the value of $R^2$.

16. Adjusted R-Squared
- $R^2$ is simply an estimate of how much of the variation in $Y$ is explained by $X_1, X_2, \dots, X_k$.
- Recall that $R^2$ never decreases as more variables are added to the model.
- The adjusted $R^2$ takes into account the number of variables in a model, and may decrease.

17. Adjusted R-Squared (cont)
- Most packages will give you both $R^2$ and adj-$R^2$.
- You can compare the fit of two models (with the same $Y$) by comparing adj-$R^2$, for example wage on educ and exper versus wage on educ and tenure.
- You cannot use adj-$R^2$ to compare models with different dependent variables (e.g. $Y$ vs. $\ln(Y)$), for example wage on educ and exper versus log(wage) on educ and exper. Because the variances of the dependent variables differ, the comparison between them makes no sense.

18. Assumptions for Unbiasedness

19. Assumptions for Unbiasedness
- The population model is linear in parameters: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + u$
- We can use a sample of size $n$, $\{(X_{i1}, X_{i2}, \dots, X_{ik}, Y_i): i = 1, 2, \dots, n\}$, from the population model, so that the sample model is $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_k X_{ik} + u_i$
- $\mathrm{Cov}(u, X_i) = 0$, $E(u X_i) = 0$, $i = 1, 2, \dots, n$.
- $E(u \mid X_1, X_2, \dots, X_k) = 0$, implying that all of the explanatory variables are exogenous. Equivalently $E(u \mid X) = 0$, where $X = (X_1, X_2, \dots, X_k)$, which reduces to $E(u) = 0$ if
the independent variables $X$ are not random variables.
- None of the $X$s is constant, and there are no exact linear relationships among them. This is the new, additional assumption.

20. About multicollinearity
- The assumption does allow the independent variables to be correlated; they just cannot be perfectly linearly correlated.
- Student performance: $colGPA = \beta_0 + \beta_1\, hsGPA + \beta_2\, ACT + \beta_3\, skipped + u$
- Consumption function: $consum = \beta_0 + \beta_1\, inc + \beta_2\, inc^2 + u$
- But the following is invalid: $\log(consum) = \beta_0 + \beta_1 \log(inc) + \beta_2 \log(inc^2) + u$. Because $\log(inc^2) = 2\log(inc)$, the two regressors are perfectly collinear, and we cannot estimate the regression coefficients $\beta_1$ and $\beta_2$.

21. Unbiasedness of OLS estimation
- Under the assumptions above, the OLS estimators are unbiased: $E(\hat\beta_j) = \beta_j$, $j = 0, 1, \dots, k$.

22. Too Many or Too Few Variables

23. Too Many or Too Few Variables
- What happens if we include variables in our specification that don't belong? The estimates of the other parameters are not biased by this: OLS remains unbiased.
- What if we exclude a variable from our specification that does belong? OLS will usually be biased.

24. Omitted Variable Bias

25. Omitted Variable Bias (cont)

26. Omitted Variable Bias (cont)

27. Omitted Variable Bias (cont)
- There are two cases where the estimated parameter is unbiased:
  - If $\beta_2 = 0$, so that $X_2$ does not appear in the true model.
  - If $\tilde\delta_1 = 0$, then $\tilde\beta_1$ is unbiased for $\beta_1$.

28. Summary of Direction of Bias

                   Corr(X1, X2) > 0    Corr(X1, X2) < 0
  $\beta_2 > 0$    Positive bias       Negative bias

56. One-Sided Alternatives (cont)
- $H_0: \beta_j = 0$ and $H_1: \beta_j > 0$
- [Figure: t distribution with critical value $c$; fail to reject below $c$ (area $1-\alpha$), reject above $c$ (area $\alpha$).]

57. An Example: Hourly Wage Equation
- Wage determination (Wooldridge, p. 123): log(wage) is regressed on educ, exper and tenure. Standard errors (intercept first): (0.104) (0.007) (0.0017) (0.003); $n = 526$.
- Is the return to exper, controlling for educ and tenure, zero in the population, against the alternative that it is positive? $H_0: \beta_{exper} = 0$ vs. $H_1: \beta_{exper} > 0$
- The t statistic is $t = \hat\beta_{exper} / se(\hat\beta_{exper})$.
- The degrees of freedom: $df = n - k - 1 = 526 - 3 - 1 = 522$.
- The 5% one-sided critical value is 1.645. The t statistic exceeds it, so we reject the null hypothesis and conclude that $\beta_{exper}$ is positive.
- [Figure: t distribution; fail to reject below 1.645 (area $1-\alpha$), reject above 1.645 (area 5%).]

58. Another example: Student Performance and School Size
- Does school size have an effect on student
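performance?
- math10: math test scores, measuring student performance
- totcomp: average annual teacher compensation
- staff: the number of staff per one thousand students
- enroll: student enrollment, measuring school size
- The model equation: $math10 = \beta_0 + \beta_1\, totcomp + \beta_2\, staff + \beta_3\, enroll + u$, with $H_0: \beta_3 = 0$, $H_1: \beta_3 < 0$
- The estimated equation: $\widehat{math10} = 2.274 + 0.00046\, totcomp + 0.048\, staff - 0.00020\, enroll$, with standard errors (6.113) (0.00010) (0.040) (0.00022); $n = 408$.
- $df = 408 - 3 - 1 = 404$ and $t = -0.00020/0.00022 \approx -0.91$. The 5% critical value is $c = -1.645$; since $t > c$, we cannot reject the null hypothesis.
- [Figure: t distribution; rejection region to the left of $-1.645$; the observed $t = -0.91$ lies in the fail-to-reject region.]

59. One-sided vs Two-sided
- Because the t distribution is symmetric, testing $H_1: \beta_j < 0$ is straightforward. The critical value is just the negative of the one used before.
- We reject the null if the t statistic $< -c$; if the t statistic $> -c$, we fail to reject the null.
- For a two-sided test, we set the critical value based on $\alpha/2$ and reject $H_0: \beta_j = 0$ if the absolute value of the t statistic $> c$.

60. Two-Sided Alternatives
- $y_i = \beta_0 + \beta_1 X_{i1} + \dots + \beta_k X_{ik} + u_i$, with $H_0: \beta_j = 0$, $H_1: \beta_j \neq 0$
- [Figure: t distribution; reject below $-c$ (area $\alpha/2$), fail to reject between $-c$ and $c$ (area $1-\alpha$), reject above $c$ (area $\alpha/2$).]

61. Summary for $H_0: \beta_j = 0$
- Unless otherwise stated, the alternative is assumed to be two-sided.
- If we reject the null, we typically say "$X_j$ is statistically significant at the $100\alpha$% level".
- If we fail to reject the null, we typically say "$X_j$ is statistically insignificant at the $100\alpha$% level".

62. An Example: Determinants of College GPA (Wooldridge, p. 128)
- Variables:
  - colGPA: college GPA
  - skipped: the average number of lectures missed per week
  - ACT: achievement test score
  - hsGPA: high school GPA
- The estimated model: $\widehat{colGPA} = 1.39 + 0.412\, hsGPA + 0.015\, ACT - 0.083\, skipped$, with standard errors (0.33) (0.094) (0.011) (0.026); $n = 141$.
- $H_0: \beta_{skipped} = 0$, $H_1: \beta_{skipped} \neq 0$
- $df = n - k - 1 = 137$; the 5% two-sided critical value is $t_{137} = 1.96$.
- The t statistic is $|-0.083/0.026| = 3.19 > t_{137} = 1.96$, so we reject the null hypothesis: $\beta_{skipped}$ is significantly different from zero.
- [Figure: t distribution; reject below $-1.96$ and above $1.96$; the observed $t = -3.19$ lies in the lower rejection region.]

63. Testing other hypotheses
- A more general form of the t statistic recognizes that we may want to test something like $H_0: \beta_j = a_j$.
- In this case, the appropriate t statistic is $t = (\hat\beta_j - a_j) / se(\hat\beta_j)$.

64. An Example: Campus Crime and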
Enrollment (Wooldridge, p. 129)
- Variables:
  - crime: the annual number of crimes on college campuses
  - enroll: student enrollment, measuring the size of the college
- The regression model: $\log(crime) = \beta_0 + \beta_1 \log(enroll) + u$
- Is $\beta_1 = 1$? That is, $H_0: \beta_1 = 1$, $H_1: \beta_1 > 1$.
- $\widehat{\log(crime)} = -6.63 + 1.27 \log(enroll)$, with standard errors (1.03) (0.11); $n = 97$.
- $df = n - k - 1 = 95$; the critical value at 5% is $t_{95}$.
- $t = (1.27 - 1)/0.11 \approx 2.45 > t_{95}$.
- So we reject the null hypothesis; the evidence supports $\beta_1 > 1$.

65. Confidence Intervals
- Another way to use classical statistical testing is to construct a confidence interval using the same critical value as was used for a two-sided test.
- A $100(1-\alpha)$% confidence interval is defined as $\hat\beta_j \pm c \cdot se(\hat\beta_j)$, where $c$ is the $1 - \alpha/2$ percentile of the $t_{n-k-1}$ distribution.

66. Computing p-values for t tests
- An alternative to the classical approach is to ask, "what is the smallest significance level at which the null would be rejected?"
- So, compute the t statistic, and then look up what percentile it is in the appropriate t distribution; this is the p-value.
- The p-value is the probability that we would observe a t statistic at least as extreme as the one we did, if the null were true.

67. Stata and p-values, t tests, etc.
- Most computer packages will compute the p-value for you, assuming a two-sided test.
- If you really want a one-sided alternative, just divide the two-sided p-value by 2 (provided the estimate has the sign specified by the alternative).
- Stata provides the t statistic, p-value, and 95% confidence interval for $H_0: \beta_j = 0$ for you, in columns labeled "t", "P>|t|" and "[95% Conf. Interval]", respectively.

68. Testing a Linear Combination
- Suppose that instead of testing whether $\beta_1$ is equal to a constant, you want to test whether it is equal to another parameter, that is $H_0: \beta_1 = \beta_2$, or $\beta_1 - \beta_2 = 0$.
- Use the same basic procedure for forming a t statistic.

69. Testing Linear Combination (cont)

70. Testing a Linear Combo (cont)
- So, to use the formula, we need $s_{12}$, the estimated covariance between $\hat\beta_1$ and $\hat\beta_2$, which standard output does not include.
- Many packages will have an option to get it, or will just perform the test for you.
- In Stata, after reg Y X1 X2 ... Xk you would type test X1 = X2 to get a p-value for the test.
- More generally, you can always restate the problem to get the test you
want.

71. Example
- Suppose you are interested in the effect of campaign expenditures on outcomes.
- The model is $voteA = \beta_0 + \beta_1 \log(expendA) + \beta_2 \log(expendB) + \beta_3\, prtystrA + u$
- $H_0: \beta_1 = -\beta_2$, or $H_0: \theta_1 = \beta_1 + \beta_2 = 0$
- $\beta_1 = \theta_1 - \beta_2$, so substitute in and rearrange: $voteA = \beta_0 + \theta_1 \log(expendA) + \beta_2 [\log(expendB) - \log(expendA)] + \beta_3\, prtystrA + u$

72. Example (cont)
- This is the same model as originally, but now you get a standard error for $\beta_1 + \beta_2 = \theta_1$ directly from the basic regression.
- Any linear combination of parameters could be tested in a similar manner.
- Other examples of hypotheses about a single linear combination of parameters: $\beta_1 = 1 + \beta_2$; $\beta_1 = 5\beta_2$; $\beta_1 = -\tfrac{1}{2}\beta_2$; etc.

73. Multiple Linear Restrictions
- Everything we have done so far has involved testing a single linear restriction (e.g. $\beta_1 = 0$ or $\beta_1 = \beta_2$).
- However, we may want to jointly test multiple hypotheses about our parameters.
- A typical example is testing "exclusion restrictions": we want to know if a group of parameters are all equal to zero.

74. Testing Exclusion Restrictions
- Now the null hypothesis might be something like $H_0: \beta_{k-q+1} = 0, \dots, \beta_k = 0$.
- The alternative is just $H_1$: $H_0$ is not true.
- We can't just check each t statistic separately, because we want to know if the $q$ parameters are jointly significant at a given level; it is possible for none to be individually significant at that level.

75. Exclusion Restrictions (cont)
- To do the test we need to estimate the "restricted model" without $X_{k-q+1}, \dots, X_k$ included, as well as the "unrestricted model" with all $X$s included.
- Intuitively, we want to know if the change in RSS is big enough to warrant inclusion of $X_{k-q+1}, \dots, X_k$:
  $F = \dfrac{(RSS_r - RSS_{ur})/q}{RSS_{ur}/(n-k-1)}$, where $r$ denotes the restricted and $ur$ the unrestricted model.

76. The F statistic
- The F statistic is never negative, since the RSS from the restricted model can't be less than the RSS from the unrestricted model.
- Essentially the F statistic measures the relative increase in RSS when moving from the unrestricted to the restricted model.
- $q$ = number of restrictions, or $df_r - df_{ur}$; $n - k - 1 = df_{ur}$.

77. The F statistic (cont)
- To decide if the increase in RSS when we move to a restricted model is "big enough" to reject the
exclusions, we need to know about the sampling distribution of our F statistic.
- Not surprisingly, $F \sim F_{q,\, n-k-1}$, where $q$ is referred to as the numerator degrees of freedom and $n - k - 1$ as the denominator degrees of freedom.

78. The F statistic (cont)
- Reject $H_0$ at the $\alpha$ significance level if $F > c$.
- [Figure: density $f(F)$ of the F distribution; fail to reject below $c$ (area $1-\alpha$), reject above $c$ (area $\alpha$).]

79. Example: the determinants of major league baseball players' salaries (Wooldridge, p. 143)
- The regression model: $\log(salary) = \beta_0 + \beta_1\, years + \beta_2\, gamesyr + \beta_3\, bavg + \beta_4\, hrunsyr + \beta_5\, rbisyr + u$
  - salary: the 1993 total salary
  - years: years in the league
  - gamesyr: average games played per year
  - bavg: career batting average
  - hrunsyr: home runs per year
  - rbisyr: runs batted in per year
- The null hypothesis is $H_0: \beta_3 = 0, \beta_4 = 0, \beta_5 = 0$, which is called a multiple hypotheses test or joint hypotheses test.
- The alternative hypothesis is $H_1$: $H_0$ is not true.
- The unrestricted model: log(salary) on years, gamesyr, bavg, hrunsyr and rbisyr. Standard errors (intercept first): (0.29) (0.0689) (0.0026) (0.00110) (0.0161) (0.0072); $n = 353$, $RSS = 183.186$.
- The restricted model: log(salary) on years and gamesyr. Standard errors (intercept first): (0.11) (0.0125) (0.0013); $n = 353$, $RSS = 198.311$.

80. Example: the determinants of major league baseball players' salaries, cont.
- The number of restrictions is $q = 3$.
- The degrees of freedom of the unrestricted model are $353 - 5 - 1 = 347$.
- Then the F statistic is $F = \dfrac{(198.311 - 183.186)/3}{183.186/347} \approx 9.55$.

81. The $R^2$ form of the F statistic
- Because the RSSs may be large and unwieldy, an alternative form of the formula is useful.
- We use the fact that $RSS = TSS(1 - R^2)$ for any regression, so we can substitute in for $RSS_r$ and $RSS_{ur}$:
  $F = \dfrac{(R^2_{ur} - R^2_r)/q}{(1 - R^2_{ur})/(n-k-1)}$

82. Example: Parents' Education in a Birth Weight Equation (Wooldridge, p. 150)
- Variables:
  - bwght: birth weight in ounces
  - cigs: average number of cigarettes the mother smoked per day during pregnancy
  - parity: the birth order of this child
  - faminc: annual family income
  - motheduc: years of schooling for the mother
  - fatheduc: years of schooling for the father
- Model: $bwght = \beta_0 + \beta_1\, cigs + \beta_2\, parity + \beta_3\, faminc + \beta_4\, motheduc + \beta_5\, fatheduc + u$
- Does the parents' education have any effect on birth weight? This is stated as $H_0: \beta_4 = 0, \beta_5 = 0$, so
$q = 2$.
- Unrestricted model: bwght on cigs, parity, faminc, motheduc and fatheduc. Standard errors (intercept first): (3.728) (0.110) (0.659) (0.037) (0.320) (0.283); $n = 1191$, $R^2 = 0.0387$.
- Restricted model: bwght on cigs, parity and faminc. Standard errors (intercept first): (1.656) (0.109) (0.658) (0.032); $n = 1191$, $R^2 = 0.0364$.
- The F statistic is $F = \dfrac{(0.0387 - 0.0364)/2}{(1 - 0.0387)/(1191 - 5 - 1)} = 1.42$, which is below the critical value of the $F_{2,\,1185}$ distribution.
- We fail to reject $H_0$. In other words, motheduc and fatheduc are jointly insignificant in the birth weight equation.

83. Overall Significance
- A special case of exclusion restrictions is to test $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$.
- Since the $R^2$ from a model with only an intercept will be zero, the F statistic is simply $F = \dfrac{R^2/k}{(1 - R^2)/(n-k-1)}$.

84. General Linear Restrictions
- The basic form of the F statistic will work for any set of linear restrictions.
- First estimate the unrestricted model and then estimate the restricted model.
- In each case, make note of the RSS.
- Imposing the restrictions can be tricky: we will likely have to redefine variables again.

85. Example
- Use the same voting model as before.
- The model is $voteA = \beta_0 + \beta_1 \log(expendA) + \beta_2 \log(expendB) + \beta_3\, prtystrA + u$
- Now the null is $H_0: \beta_1 = 1, \beta_3 = 0$.
- Substituting in the restrictions: $voteA = \beta_0 + \log(expendA) + \beta_2 \log(expendB) + u$, so
- Use $voteA - \log(expendA) = \beta_0 + \beta_2 \log(expendB) + u$ as res