Structural Equation Modeling, Lecture 2 (结构方程模型第二讲.ppt)
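SEM: Structural Equation Modeling, Lecture 2

1. Model and assumptions

Measurement model: $y = \Lambda_y \eta + \varepsilon$, $x = \Lambda_x \xi + \delta$
Structural model: $\eta = B\eta + \Gamma\xi + \zeta$

Assumptions:
- $\varepsilon$ is uncorrelated with $\eta$; $\delta$ with $\xi$; $\zeta$ with $\xi$;
- $\varepsilon$, $\delta$, $\zeta$ are pairwise uncorrelated and have mean 0;
- covariance matrices: cov($\xi$) = $\Phi$; cov($\zeta$) = $\Psi$; cov($\varepsilon$) = $\Theta_\varepsilon$; cov($\delta$) = $\Theta_\delta$.

2. Model estimation

- LISREL: based on the covariance matrix
- PLS (Partial Least Squares): based on principal components
- SEM (LISREL; PLS): second-generation data analysis techniques (Bagozzi and Fornell, 1982)

2.1 Covariance structure

The covariance matrix of the observed variables (y, x) has the general structure $\Sigma(\theta)$, which is why the method is also called covariance structure analysis.

2.2 Model identification

Definition: if $\Sigma(\theta_1) = \Sigma(\theta_2)$ implies $\theta_1 = \theta_2$, the structural equation model is said to be identified.

Consider the equations $\Sigma(\theta) = S$. If there are fewer equations than parameters, some parameter cannot be expressed in terms of known quantities, and the model is under-identified. (S is a p × p matrix; how many equations are there?)

- A parameter that cannot be expressed in terms of known quantities is under-identified.
- A parameter that can be expressed by exactly one expression in the known quantities is just-identified.
- A parameter that can be expressed by two or more expressions in the known quantities is over-identified.
- If every parameter is identified and at least one is over-identified, the model is over-identified; if at least one parameter is under-identified, the model is under-identified.

Identification rules for factor models

2.3 Parameter estimation

Goal: make the population (model-implied) covariance matrix $\Sigma(\theta)$ as close as possible to the sample covariance matrix S.

Fit function $F(S, \Sigma(\theta))$:
1) non-negative;
2) continuous;
3) $F(S, \Sigma(\theta)) = 0$ if and only if $\Sigma(\theta) = S$.

Estimation: find the $\hat\theta$ that minimizes the fit function.

1. Maximum likelihood estimation (Maximum Likelihood, ML)

Assuming the observation errors follow a multivariate normal distribution, the fit function is

$F_{ML} = \log|\Sigma(\theta)| + \mathrm{tr}\left(S\,\Sigma^{-1}(\theta)\right) - \log|S| - p$

Basic properties:
1) the ML estimator is asymptotically unbiased;
2) the ML estimator is consistent;
3) the ML estimator is asymptotically efficient;
4) the ML estimator is asymptotically normally distributed;
5) the minimized ML fit value is asymptotically chi-square distributed, that is,

$(n-1)\,F_{ML}(\hat\theta) \sim \chi^2(p^* - t)$

where $p^* = p(p+1)/2$ and t is the number of free parameters. This result can be used to test the model as a whole, $H_0: \Sigma = \Sigma(\theta)$. For a proof see Bollen (1989).

2. Unweighted least squares estimation (Unweighted Least Squares, ULS)

$F_{ULS} = \tfrac{1}{2}\,\mathrm{tr}\left[(S - \Sigma(\theta))^2\right]$

(Meaning: up to the factor 1/2, the sum of squares of all elements of the residual matrix $S - \Sigma(\theta)$.)

The ULS estimator is consistent but not asymptotically efficient; it is not scale-invariant; and the minimized ULS fit value is not asymptotically chi-square distributed.

3. Generalized least squares estimation (Generalized Least Squares, GLS)

$F_{GLS} = \tfrac{1}{2}\,\mathrm{tr}\left\{\left[(S - \Sigma(\theta))\,S^{-1}\right]^2\right\}$

(Meaning: a weighted sum of squares of all elements of the residual matrix $S - \Sigma(\theta)$, with weights given by the inverse of the sample covariance matrix.)

It can be shown that, when the errors are assumed normal, the GLS and ML estimators are asymptotically equivalent. The GLS estimator therefore has the same asymptotic properties as the ML estimator.

4. Generally weighted least squares estimation (Generally Weighted Least Squares, WLS)

$F_{WLS} = [s - \sigma(\theta)]'\,W^{-1}\,[s - \sigma(\theta)]$

where s is the vector of all lower-triangular elements (diagonal included) of the sample covariance matrix S, called the "stacked" vector and written

$s = \mathrm{Vecs}(S) = (s_{11}, s_{21}, s_{22}, s_{31}, s_{32}, s_{33}, \ldots, s_{pp})'$

$\sigma(\theta) = \mathrm{Vecs}(\Sigma(\theta))$ is built the same way, and W is a $p^* \times p^*$ positive definite matrix, $p^* = p(p+1)/2$.

In particular:
1) taking $W = S \otimes S$ ($\otimes$: Kronecker product), WLS reduces to GLS;
2) taking $W = \Sigma(\hat\theta_{ML}) \otimes \Sigma(\hat\theta_{ML})$, WLS reduces to ML.

In general, Browne (1982, 1984) suggests taking

$w_{gh,ij} = m_{ghij} - s_{gh}\,s_{ij}$

where $m_{ghij}$ is the fourth-order sample central moment. This gives an asymptotically distribution-free (ADF) estimator, which shares many of the asymptotic properties of the ML estimator.

2.4 Model evaluation

Goal: assess how well the model fits.

Methods:
- fit indices, to evaluate the model as a whole;
- coefficients of determination, to evaluate how much of the data the model explains;
- parameter tests, to evaluate the significance of individual parameters.

2.4.1 Fit indices

Fit indices, also called goodness-of-fit statistics, reflect how well the model fits.

1. Chi-square ($\chi^2$)
2. Goodness-of-fit index (GFI)
3. Adjusted goodness-of-fit index (AGFI)
4. Root mean square error of approximation (RMSEA)
5. Standardized root mean square residual (SRMR)

1. Chi-square ($\chi^2$)

$\chi^2 = (n-1)\,F(\hat\theta)$

It has been shown that, for ML, GLS and WLS estimation and under certain conditions, this statistic is asymptotically $\chi^2$ distributed with $(p^* - t)$ degrees of freedom, where $p^* = p(p+1)/2$ and t is the number of free parameters.

Decision: the smaller the $\chi^2$, the better the fit. If the chi-square test is significant (p-value < 0.1), the model fits poorly; if it is not significant, the model is acceptable.

Note 1: for ML and GLS, $\chi^2$ is correct when the input is the sample covariance matrix; if the input is a correlation matrix, the $\chi^2$ value is correct only when the model is scale-invariant. For WLS, the correct weight matrix W must also be supplied.

2. Goodness-of-fit index (GFI) and adjusted goodness-of-fit index (AGFI)

Joreskog & Sorbom (1981) give, for ML estimation with $\hat\Sigma = \Sigma(\hat\theta)$:

$GFI = 1 - \dfrac{\mathrm{tr}\left[(\hat\Sigma^{-1}S - I)^2\right]}{\mathrm{tr}\left[(\hat\Sigma^{-1}S)^2\right]}$, $\quad AGFI = 1 - \dfrac{p^*}{d}\,(1 - GFI)$

where $p^* = p(p+1)/2$ and d is the degrees of freedom of the model. A GFI above 0.9 is generally taken to indicate a good fit.

3. Root mean square error of approximation (RMSEA) (Steiger & Lind, 1980)

$RMSEA = \sqrt{\dfrac{\max(\chi^2 - df,\ 0)}{df\,(n-1)}}$

where df is the degrees of freedom of the chi-square. $\chi^2 - df$ is called the noncentrality parameter (NCP; Steiger, 1980).

Population discrepancy function (PDF):

$\hat F_0 = \max\left\{\dfrac{\chi^2 - df}{n-1},\ 0\right\}$, so that $RMSEA = \sqrt{\hat F_0 / df}$.

4. Standardized root mean square residual (SRMR)

Note 1: for other fit indices see 侯杰泰 (Hau, 2004).
Note 2: the indices above reflect only the fit of the model as a whole. A model that fits well overall does not necessarily fit every individual relationship well.

2.4.2 Coefficient of determination

Analogous to $R^2$ (the coefficient of determination) in regression analysis.

1) Coefficient of determination for the i-th equation:

$R_i^2 = 1 - \hat\psi_{ii} / s_{ii}$

where $\hat\psi_{ii}$ is the estimated residual variance of the i-th equation and $s_{ii}$ is the sample variance of the i-th variable. The equation-level coefficient evaluates how much of the data the i-th equation explains.

2) Coefficient of determination for the whole model:

$R^2 = 1 - |\Psi| / |\Sigma|$

where $|\Psi|$ is the determinant of $\Psi$ and $|\Sigma|$ is the determinant of $\Sigma$. In computation, $\Psi$ is usually replaced by its estimate, and $\Sigma$ by the fitted covariance matrix or the sample covariance matrix.

The coefficient of determination lies between 0 and 1; larger is better. Because it depends on the number of equations, it is recommended for evaluating individual equations; for evaluating the overall model, fit indices are preferable.

Model modification index (Modification Index, MI)

The modification index measures the change in chi-square caused by adding, removing, or altering free parameters. In LISREL, the MI command reports a modification index for every fixed parameter, equal to the decrease in chi-square that would result from freeing that parameter.

2.4.3 Parameter tests

Because a standard error is reported for every parameter, each parameter can be tested for significance, that is, tested for whether it is zero. For example, if the coefficient between two latent variables turns out to be non-significant, that parameter should be fixed to zero, and the model then revised and re-estimated.

3. Another estimation method for the model: PLS (Partial Least Squares)

PLS, the second major SEM technique, is designed to explain variance, i.e., to examine the significance of the relationships and their resulting $R^2$, as in linear regression. Consequently, PLS is more suited for predictive applications and theory building, in contrast to covariance-based SEM.

Conditions when you might consider using PLS

- Do you work with theoretical models that involve latent constructs?
- Do you have multicollinearity problems with variables that tap into the same issues?
- Do you want to account for measurement error?
- Do you have non-normal data?
- Do you have a small sample set?
- Do you wish to determine whether the measures you developed are valid and reliable within the context of the theory you are working in?
- Do you have formative as well as reflective measures?

The basic PLS algorithm for latent variable path analysis

- Stage 1: Iterative estimation of weights and LV scores, starting at step #4 and repeating steps #1 to #4 until convergence is obtained.
- Stage 2: Estimation of paths and loading coefficients.
- Stage 3: Estimation of location parameters.

Computer software

- LVPLS
- PLS-GUI
- PLS-Graph (Wynne W. Chin)

Considerations when choosing between PLS and LISREL

- Objectives
- Theoretical constructs: indeterminate vs. defined
- Epistemic relationships
- Theory requirements
- Empirical factors
- Computational issues: identification & speed

Objectives

- Prediction versus explanation

Theoretical constructs: indeterminate versus defined

- For PLS: the latent variables are estimated as linear aggregates or components. The latent variable scores are estimated directly. If raw data are used, scoring coefficients are estimated.
- For LISREL: indeterminacy.

Epistemic relationships

- Latent constructs with reflective indicators: LISREL & PLS
- Emergent constructs with formative indicators: PLS
- By choosing different weighting "modes", the model builder shifts the emphasis of the model from a structural causal explanation of the covariance matrix to a prediction/reconstruction forecast of the raw data matrix.

Theory requirements

- LISREL expects strong theory (confirmation mode).
- PLS is flexible.

Empirical factors

Distributional assumptions
- PLS estimation is a "rigid" technique that requires only "soft" assumptions about the distributional characteristics of the raw data.
- LISREL requires more stringent conditions.

Empirical factors (continued)

Sample size depends on power analysis, but is much smaller for PLS.
- PLS heuristic: ten times the greater of the following two (ideally use power analysis):
  - the construct with the greatest number of formative indicators
  - the construct with the greatest number of structural paths going into it
- LISREL heuristic: at least 200 cases or 10 times the number of parameters estimated.

Empirical factors (continued)

Types of measures
- PLS can use categorical through ratio measures.
- LISREL generally expects interval-level measures; otherwise PRELIS preprocessing is needed.

Computational issues: identification

Are estimates unique?
- Under recursive models, PLS is always identified.
- LISREL: depends on the model. Ideally 4 or more indicators per construct are needed to be over-determined, 3 to be just identified. Algebraic proof for identification.

Computational issues: speed

- PLS estimation is fast and avoids the problem of negative variance estimates (i.e., Heywood cases).
- PLS needs less computing time and memory. The PLS-Graph program can handle up to 400 indicators. Models with 50 to 100 are estimated in a matter of seconds.

Any Questions?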
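
Worked sketch for sections 2.3 and 2.4.1 (illustrative only). The chain from fit function to chi-square to RMSEA is easier to follow with numbers, so the Python sketch below evaluates the ML fit function log|Sigma(theta)| + tr(S Sigma(theta)^-1) - log|S| - p for a one-factor model with four indicators, then forms chi-square = (n-1)F, df = p(p+1)/2 - t and RMSEA. Everything specific here is an assumption made for illustration and is not taken from the slides: the sample matrix `S`, the loadings, the sample size n = 200, and the helper names `inverse_and_logdet`, `implied_cov`, `f_ml` and `chi_square_and_rmsea`. The parameter values are supplied by hand instead of being found by minimization, so the resulting chi-square is not the minimized test statistic.

```python
import math


def inverse_and_logdet(a):
    """Gauss-Jordan inverse and log|det| of a small positive definite matrix."""
    n = len(a)
    # Augment the matrix with the identity: [A | I].
    m = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    logdet = 0.0
    for c in range(n):
        # Partial pivoting for numerical stability.
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        pivot = m[c][c]
        logdet += math.log(abs(pivot))
        m[c] = [v / pivot for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m], logdet


def implied_cov(loadings, error_vars):
    """Sigma(theta) = lambda lambda' + diag(error_vars); factor variance fixed at 1."""
    p = len(loadings)
    return [[loadings[i] * loadings[j] + (error_vars[i] if i == j else 0.0)
             for j in range(p)] for i in range(p)]


def f_ml(s, sigma):
    """ML fit function: log|Sigma| + tr(S Sigma^-1) - log|S| - p."""
    p = len(s)
    sigma_inv, logdet_sigma = inverse_and_logdet(sigma)
    _, logdet_s = inverse_and_logdet(s)
    trace = sum(s[i][k] * sigma_inv[k][i] for i in range(p) for k in range(p))
    return logdet_sigma + trace - logdet_s - p


def chi_square_and_rmsea(f_value, n, p, t):
    """chi2 = (n-1)F, df = p(p+1)/2 - t (must be > 0), RMSEA from chi2 and df."""
    df = p * (p + 1) // 2 - t
    chi2 = (n - 1) * f_value
    rmsea = math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))
    return chi2, df, rmsea


# Hypothetical sample correlation matrix for four indicators (made-up values).
S = [[1.00, 0.58, 0.46, 0.41],
     [0.58, 1.00, 0.43, 0.34],
     [0.46, 0.43, 1.00, 0.31],
     [0.41, 0.34, 0.31, 1.00]]

# Hypothetical parameter values: 4 loadings + 4 error variances, so t = 8.
loadings = [0.8, 0.7, 0.6, 0.5]
error_vars = [1.0 - lam * lam for lam in loadings]
sigma = implied_cov(loadings, error_vars)

f_value = f_ml(S, sigma)
chi2, df, rmsea = chi_square_and_rmsea(f_value, n=200, p=4, t=8)
print(f"F_ML = {f_value:.5f}, chi2 = {chi2:.3f}, df = {df}, RMSEA = {rmsea:.4f}")
```

With p = 4 observed variables there are p* = 10 distinct variances and covariances, and the eight free parameters leave df = 2, which also answers the counting question in section 2.2. Calling `f_ml(S, S)` returns zero up to rounding, matching property 3) of a fit function. A real analysis would minimize `f_ml` over the parameters with a numerical optimizer before computing chi-square.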