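Chapter 9  Loglinear Models

Section 1  General Procedure
    9.1.1  Main Functions
    9.1.2  Worked Example
Section 2  Hierarchical Procedure
    9.2.1  Main Functions
    9.2.2  Worked Example
Section 3  Logit Procedure
    9.3.1  Main Functions
    9.3.2  Worked Example

Loglinear models are a tool for the statistical analysis of discrete data, or of count data arranged as a contingency table. In a loglinear model every classifying factor is treated as an independent variable, and the cell counts of the contingency table are the dependent variable. Contingency-table data are usually analyzed with the χ² test, but the χ² test can neither evaluate the associations among the variables systematically nor estimate the size of their interactions. Loglinear models are the best method for handling these problems.

Section 1  General Procedure

9.1.1  Main Functions

This procedure performs non-hierarchical loglinear analysis of one or more two-way contingency tables. It can fit only the saturated model, that is, the model that contains the main effect of every categorical variable together with all of their interactions.

Worked Example

…

    Parameter          Coeff.    Std. Err.      Z-Value   Lower 95 CI   Upper 95 CI
        2         .1674922915      .10130       1.65335       -.03106        .36605
        3        -.0169870288      .07921       -.21447       -.17223        .13826
        4         .0577506145      .05557       1.03925       -.05117        .16667
        5        -.0069187948      .06504       -.10637       -.13440        .12057
        6        -.0817851831      .05570      -1.46819       -.19097        .02740

AGE*SEX
    Parameter          Coeff.    Std. Err.      Z-Value   Lower 95 CI   Upper 95 CI
        1         .7059980126      .04848      14.56319        .61098        .80102
        2        -.2968871102      .03276      -9.06301       -.36109       -.23268

AGE*YEAR
    Parameter          Coeff.    Std. Err.      Z-Value   Lower 95 CI   Upper 95 CI
        1        -.1762097434      .08417      -2.09344       -.34119       -.01123
        2        -.3051792054      .10130      -3.01249       -.50374       -.10662
        3         .1339590237      .07921       1.69127       -.02129        .28920
        4         .1990874838      .05557       3.58269        .09017        .30800
        5         .1982170140      .06504       3.04744        .07073        .32570
        6        -.1646071030      .05570      -2.95499       -.27379       -.05543

SEX*YEAR
    Parameter          Coeff.    Std. Err.      Z-Value   Lower 95 CI   Upper 95 CI
        1         .0471962901      .04918        .95960       -.04920        .14360
        2        -.0778801067      .05818      -1.33868       -.19191        .03615
        3         .0827715134      .04734       1.74836       -.01002        .17556

AGE
    Parameter          Coeff.    Std. Err.      Z-Value   Lower 95 CI   Upper 95 CI
        1        -.7212868272      .04848     -14.87857       -.81630       -.62627
        2         .7999110228      .03276      24.41872        .73571        .86412

SEX
    Parameter          Coeff.    Std. Err.      Z-Value   Lower 95 CI   Upper 95 CI
        1        -.0348756276      .02856      -1.22099       -.09086        .02111

YEAR
    Parameter          Coeff.    Std. Err.      Z-Value   Lower 95 CI   Upper 95 CI
        1        -.0205234390      .04918       -.41728       -.11692        .07588
        2        -.3188195595      .05818      -5.48020       -.43285       -.20479
        3        -.0126524013      .04734       -.26725       -.10544        .08014

The system then removes effects from the saturated model, working from the highest order downward. In step 1, deleting the third-order interaction (AGE*SEX*YEAR) changes χ² by 8.615 with a probability of 0.1964, which is not below the default criterion of 0.05, so this effect is removed. In step 2, deleting any one of the second-order interactions gives a probability below 0.05, so none of them can be removed. For this example, the model with the second-order interactions (which also contains the first-order main effects) is therefore the best description.

Backward Elimination (p = .050) for DESIGN 1 with generating class
    AGE*SEX*YEAR
Likelihood ratio chi square = .00000    DF = 0    P = 1.000

If Deleted Simple Effect is      DF    L.R. Chisq Change     Prob    Iter
    AGE*SEX*YEAR                  6                8.615    .1964       4

Step 1
The best model has generating class
    AGE*SEX
    AGE*YEAR
    SEX*YEAR
Likelihood ratio chi square = 8.61546    DF = 6    P = .196

If Deleted Simple Effect is      DF    L.R. Chisq Change     Prob    Iter
    AGE*SEX                       2              310.816    .0000       2
    AGE*YEAR                      6               62.829    .0000       2
    SEX*YEAR                      3               13.024    .0046       2

Step 2
The best model has generating class
    AGE*SEX
    AGE*YEAR
    SEX*YEAR
Likelihood ratio chi square = 8.61546    DF = 6    P = .196

The final model has generating class
    AGE*SEX
    AGE*YEAR
    SEX*YEAR

The Iterative Proportional Fit algorithm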
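converged at iteration 0.
The maximum difference between observed and fitted marginal totals is .131
and the convergence criterion is .278

Because the third-order interaction has been removed, the original saturated model has become a hierarchical model. The expected counts therefore change and no longer equal the observed counts, so the residuals and standardized residuals are no longer 0. If the standardized residuals lie between -1.96 and 1.96, the model is adequate. The results below show that every standardized residual in this example lies between -1.96 and 1.96, so the hierarchical model is appropriate.

Observed, Expected Frequencies and Residuals.

Factor          Code    OBS count    EXP count    Residual    Std Resid
AGE               1
 SEX              1
  YEAR            1         55.0         59.0       -4.05         -.53
  YEAR            2         43.0         39.1        3.88          .62
  YEAR            3         89.0         88.3         .69          .07
  YEAR            4        140.0        140.5        -.50         -.04
 SEX              2
  YEAR            1         17.0         13.0        4.04         1.12
  YEAR            2          9.0         12.9       -3.88        -1.08
  YEAR            3         20.0         20.7        -.70         -.15
  YEAR            4         41.0         40.5         .53          .08
AGE               2
 SEX              1
  YEAR            1        165.0        163.0        1.99          .16
  YEAR            2        101.0         97.9        3.07          .31
  YEAR            3        104.0        112.6       -8.62         -.81
  YEAR            4        137.0        133.5        3.54          .31
 SEX              2
  YEAR            1        260.0        262.0       -1.99         -.12
  YEAR            2        233.0        236.1       -3.07         -.20
  YEAR            3        202.0        193.4        8.62          .62
  YEAR            4        278.0        281.6       -3.55         -.21
AGE               3
 SEX              1
  YEAR            1         50.0         47.9        2.06          .30
  YEAR            2         29.0         36.0       -6.95        -1.16
  YEAR            3         56.0         48.1        7.92         1.14
  YEAR            4         54.0         57.0       -3.03         -.40
 SEX              2
  YEAR            1         94.0         96.1       -2.05         -.21
  YEAR            2        115.0        108.0        6.95          .67
  YEAR            3         95.0        102.9       -7.92         -.78
  YEAR            4        153.0        150.0        3.02          .25

Goodness-of-fit test statistics
    Likelihood ratio chi square = 8.61546    DF = 6    P = .196
    Pearson chi square          = 8.54688    DF = 6    P = .201

Section 3  Logit Procedure

9.3.1  Main Functions

This procedure fits a loglinear model relating one dependent variable to one or more independent variables. If the categorical variables are not divided into dependent and independent variables, use the methods described in Sections 1 and 2 of this chapter. If the dependent variable is dichotomous and the independent variables are continuous, use logistic regression instead (see Chapter 8).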
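The fit check reported for the hierarchical example of Section 2 can be reproduced by hand. The following is a minimal sketch in Python, not the SPSS implementation: it refits the model with generating class AGE*SEX, AGE*YEAR, SEX*YEAR to the observed counts from the residual table by iterative proportional fitting, then computes the likelihood ratio chi square, the Pearson chi square, and the largest absolute standardized residual. The names `COUNTS`, `fit_no_three_way`, and `fit_statistics` are hypothetical, and the tolerance is an assumption that is much stricter than the convergence criterion shown in the listing, so fitted values may differ slightly in the last digit.

```python
import math
from itertools import product

# Observed counts copied from the residual table: COUNTS[age][sex][year].
COUNTS = [
    [[55, 43, 89, 140], [17, 9, 20, 41]],
    [[165, 101, 104, 137], [260, 233, 202, 278]],
    [[50, 29, 56, 54], [94, 115, 95, 153]],
]
CELLS = list(product(range(3), range(2), range(4)))
OBS = {(a, s, y): float(COUNTS[a][s][y]) for a, s, y in CELLS}

# Axis pairs whose two-way margins the model must reproduce:
# (AGE, SEX), (AGE, YEAR), (SEX, YEAR).
MARGINS = [(0, 1), (0, 2), (1, 2)]


def margin(table, axes):
    """Sum a cell table over every axis not listed in `axes`."""
    totals = {}
    for cell in CELLS:
        key = tuple(cell[i] for i in axes)
        totals[key] = totals.get(key, 0.0) + table[cell]
    return totals


def fit_no_three_way(obs, tol=1e-8, max_iter=500):
    """Iterative proportional fitting for the model without the 3-way term."""
    fit = {cell: 1.0 for cell in CELLS}
    for _ in range(max_iter):
        worst_gap = 0.0
        for axes in MARGINS:
            target, current = margin(obs, axes), margin(fit, axes)
            gap = max(abs(target[k] - current[k]) for k in target)
            worst_gap = max(worst_gap, gap)
            # Rescale each cell so this margin matches the observed one.
            for cell in CELLS:
                key = tuple(cell[i] for i in axes)
                fit[cell] *= target[key] / current[key]
        if worst_gap < tol:
            break
    return fit


def fit_statistics(obs, fit):
    """Return (likelihood ratio chi square, Pearson chi square, max |std resid|)."""
    g2 = x2 = worst = 0.0
    for cell in CELLS:
        o, e = obs[cell], fit[cell]
        g2 += 2.0 * o * math.log(o / e)
        x2 += (o - e) ** 2 / e
        worst = max(worst, abs(o - e) / math.sqrt(e))
    return g2, x2, worst


FIT = fit_no_three_way(OBS)
G2, X2, WORST = fit_statistics(OBS, FIT)
print(f"L.R. chi square = {G2:.3f}, Pearson chi square = {X2:.3f}")
print(f"largest |standardized residual| = {WORST:.2f}")
```

Under these assumptions the two statistics should land close to the 8.615 and 8.547 in the listing, and the largest standardized residual should stay inside the ±1.96 band used in the text to judge the model adequate.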
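9.3.2  Worked Example

Example 9.3  In a study of knowledge, attitudes, and behavior related to AIDS (KAB Study), data were collected on how well members of the public of different ages and education levels had mastered AIDS-prevention knowledge, and were arranged as a contingency table. Mastery of AIDS-prevention knowledge is clearly related to age and education level. In other words, if mastery of AIDS-prevention knowledge is taken as the dependent variable, it should be influenced by the two independent variables age and education level. The loglinear model with a dependent variable is applied below.

9.3.2.1  Data Preparation

Activate the data management window and define the variables. The observed frequency is named freq. Mastery of AIDS-prevention knowledge is named aids and coded 1, 2, 3 for good, fair, and poor. Education level is named educ and coded 1, 2, 3 for high, middle, and low. Age is named age, and the age groups 20- through 50- are coded 1 to 4 in order. After entering the raw data, choose Weight Cases... from the Data menu. In the Weight Cases dialog box activate Weight cases by, select freq from the variable list, click the arrow button to move it into the Frequency Variable box, and click OK.

9.3.2.2  Statistical Analysis

Activate the Statistics menu and choose Logit under Loglinear to open the Logit Loglinear Analysis dialog box (Figure 9.5). Select aids from the variable list on the left and click the arrow button to move it into the Dependent box. Click Define Range... to open the Logit Loglinear Analysis: Define Range dialog box and define the range of the dependent variable aids: type 1 for Minimum and 3 for Maximum, then click Continue to return to the Logit Loglinear Analysis dialog box. Select age from the variable list, click the arrow button to move it into the Factor(s) box, click Define Range, and define the range of the independent variable age as 1 to 4. In the same way, move the independent variable educ into the Factor(s) box and define its range as 1 to 3. This example requires parameter estimates for the main effects and interactions, so click Contrast to open the Logit Loglinear Analysis: Contrasts dialog box, select Display parameter estimates, and click Continue to return to the Logit Loglinear Analysis dialog box. Finally, click OK to run the analysis.

9.3.2.3  Interpreting the Results

The output window shows the following statistics.

The system reports that 1858 observations enter the analysis, which involves three variables: AIDS with 3 levels, AGE with 4 levels, and EDUC with 3 levels. Effects up to the third order, in four classes, are generated: the main effect of AIDS-prevention knowledge (because AIDS is defined as the dependent variable, the main effects of the independent variables AGE and EDUC are not analyzed); the interaction of AIDS-prevention knowledge with age; the interaction of AIDS-prevention knowledge with education level; and the three-way interaction among AIDS-prevention knowledge,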
age, and education level. The system then displays the observed counts, expected counts, residuals, standardized residuals, and adjusted residuals.

DATA Information
      36 unweighted cases accepted.
       0 cases rejected because of out-of-range factor values.
       0 cases rejected because of missing data.
    1858 weighted cases will be used in the analysis.

FACTOR Information
    Factor    Level    Label
    AIDS          3
    AGE           4
    EDUC          3

DESIGN Information
    1 Design/Model will be processed.

Correspondence Between Effects and Columns of Design/Model 1
    Starting    Ending
     Column     Column    Effect Name
        1          2      AIDS
        3          8      AIDS BY AGE
        9         12      AIDS BY EDUC
       13         24      AIDS BY AGE BY EDUC

Note: for saturated models .500 has been added to all observed cells.
This value may be changed by using the CRITERIA = DELTA subcommand.

*** ML converged at iteration 2.
Maximum difference between successive iterations = .00000.

Observed, Expected Frequencies and Residuals

Factor        Code    OBS. count & PCT.    EXP. count & PCT.    Residual    Std. Resid.    Adj. Resid.
AIDS            1
 AGE            1
  EDUC          1       53.50 (55.44)        53.50 (55.44)        .0000        .0000          .0000
  EDUC          2       67.50 (34.53)        67.50 (34.53)        .0000        .0000          .0000
  EDUC          3        2.50 (10.64)         2.50 (10.64)        .0000        .0000          .0000
 AGE            2
  EDUC          1       28.50 (53.27)        28.50 (53.27)        .0000        .0000          .0000
  EDUC          2       71.50 (22.73)        71.50 (22.73)        .0000        .0000          .0000
  EDUC          3       16.50 (28.21)        16.50 (28.21)        .0000        .0000          .0000
 AGE            3
  EDUC          1       31.50 (43.45)        31.50 (43.45)        .0000        .0000          .0000
  EDUC          2       38.50 (17.46)        38.50 (17.46)        .0000        .0000          .0000
  EDUC          3        8.50 ( 2.40)         8.50 ( 2.40)        .0000        .0000          .0000
 AGE            4
  EDUC          1       19.50 (52.00)        19.50 (52.00)        .0000        .0000          .0000
  EDUC          2        9.50 ( 4.47)         9.50 ( 4.47)        .0000        .0000          .0000
  EDUC          3        3.50 ( 1.48)         3.50 ( 1.48)        .0000        .0000          .0000
AIDS            2
 AGE            1
  EDUC          1       40.50 (41.97)        40.50 (41.97)        .0000        .0000          .0000
  EDUC          2      103.50 (52.94)       103.50 (52.94)        .0000        .0000          .0000
  EDUC          3        3.50 (14.89)         3.50 (14.89)        .0000        .0000          .0000
 AGE            2
  EDUC          1       21.50 (40.19)        21.50 (40.19)        .0000        .0000          .0000
  EDUC          2      141.50 (44.99)       141.50 (44.99)        .0000        .0000          .0000
  EDUC          3       22.50 (38.46)        22.50 (38.46)        .0000        .0000          .0000
 AGE            3
  EDUC          1       32.50 (44.83)        32.50 (44.83)        .0000        .0000          .0000
  EDUC          2       94.50 (42.86)        94.50 (42.86)        .0000        .0000          .0000
  EDUC          3       98.50 (27.79)        98.50 (27.79)        .0000        .0000          .0000
 AGE            4
  EDUC          1        6.50 (17.33)         6.50 (17.33)        .0000        .0000          .0000
  EDUC          2       66.50 (31.29)        66.50 (31.29)        .0000        .0000          .0000
  EDUC          3       76.50 (32.35)        76.50 (32.35)        .0000        .0000          .0000
AIDS            3
 AGE            1
  EDUC          1        2.50 ( 2.59)         2.50 ( 2.59)        .0000        .0000          .0000
  EDUC          2       24.50 (12.53)        24.50 (12.53)        .0000        .0000          .0000
  EDUC          3       17.50 (74.47)        17.50 (74.47)        .0000        .0000          .0000
 AGE            2
  EDUC          1        3.50 ( 6.54)         3.50 ( 6.54)        .0000        .0000          .0000
  EDUC          2      101.50 (32.27)       101.50 (32.27)        .0000        .0000          .0000
  EDUC          3       19.50 (33.33)        19.50 (33.33)        .0000        .0000          .0000
 AGE            3
  EDUC          1        8.50 (11.72)         8.50 (11.72)        .0000        .0000          .0000
  EDUC          2       87.50 (39.68)        87.50 (39.68)        .0000        .0000          .0000
  EDUC          3      247.50 (69.82)       247.50 (69.82)        .0000        .0000          .0000
 AGE            4
  EDUC          1       11.50 (30.67)        11.50 (30.67)        .0000        .0000          .0000
  EDUC          2      136.50 (64.24)       136.50 (64.24)        .0000        .0000          .0000
  EDUC          3      156.50 (66.17)       156.50 (66.17)        .0000        .0000          .0000

Goodness-of-Fit test statistics
    Likelihood Ratio Chi Square = .00000    DF = 0    P = 1.000
    Pearson Chi Square          = .00000    DF = 0    P = 1.000

The next block of output tests the goodness of fit. The system uses the Dispersion Similarity Measure, whose value lies between -1 and +1; the closer it is to |1|, the better the fit. In this example it is 0.145879. Each reported measure is the dispersion due to the model divided by the total dispersion: 173.206 / 1187.325 = .145879 for concentration and 314.875 / 1957.365 = .160867 for entropy.

Analysis of Dispersion
    Source of Variation      Entropy    Concentration      DF
    Due to Model             314.875          173.206
    Due to Residual         1642.491         1014.119
    Total                   1957.365         1187.325    3750

Measures of Association
    Entropy       = .160867
    Concentration = .145879

Finally, the system prints the parameter estimates for every effect in the loglinear model. Because the output is long, the full derivation is not repeated here (see Section 1 of this chapter). The AIDS main effect and the AIDS by EDUC interaction are worked through below as examples. The parameters of each effect sum to zero, so the parameter for the last level is 0 minus the others.

Main effect of AIDS-prevention knowledge:
    λ(AIDS good) = -0.378234829
    λ(AIDS fair) =  0.3307195684
    λ(AIDS poor) =  0 - (-0.378234829) - 0.3307195684 = 0.0475152606
The largest parameter belongs to the "fair" level, which indicates that the public's mastery of AIDS-prevention knowledge is mostly fair.

Interaction of AIDS-prevention knowledge with education level:
    λ(AIDS good, EDUC high)   =  1.097077448
    λ(AIDS good, EDUC middle) = -0.186500026
    λ(AIDS good, EDUC low)    =  0 - 1.097077448 - (-0.186500026) = -0.910577422
    λ(AIDS fair, EDUC high)   = -0.018774593
    λ(AIDS fair, EDUC middle) =  0.0930200827
    λ(AIDS fair, EDUC low)    =  0 - (-0.018774593) - 0.0930200827 = -0.0742454897
    λ(AIDS poor, EDUC high)   =  0 - 1.097077448 - (-0.018774593) = -1.078302855
    λ(AIDS poor, EDUC …)      =  0 - …

… button to move it into the Factor(s) box, then click the Define Range... button, which opens the General Loglinear
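Analysis: Define Range dialog box. Define the range of the categorical variable care, which is 1 to 2 in this example: type 1 for Minimum and 2 for Maximum, then click Continue to return to the General Loglinear Analysis dialog box. In the same way, move the variable educ into the Factor(s) box and define its range as 1 to 3. This example requires parameter estimates for the main effects and interactions of the categorical variables, so click Contrast to open the General Loglinear Analysis: Contrasts dialog box, select Display parameter estimates, and click Continue to return to the General Loglinear Analysis dialog box. Finally, click OK to run the analysis.

9.1.2.3  Interpreting the Results

The output window shows the following statistics.

The output first reports that 403 cases are analyzed and that there are two categorical variables: CARE with 2 levels and EDUC with 3 levels. Three classes of effects are analyzed: satisfaction (CARE), education level (EDUC), and their interaction (CARE BY EDUC). After 2 iterations the difference between successive estimates is no greater than the specified 0.001.

DATA Information
         unweighted cases accepted.
       0 cases rejected because of out-of-range factor values.
       0 cases rejected because of missing data.
     403 weighted cases will be used in the analysis.

FACTOR Information
    Factor    Level    Label
    CARE          2
    EDUC          3

DESIGN Information
    Design/Model will be processed.

Correspondence Between Effects and Columns of Design/Model 1
    Starting    Ending
     Column     Column    Effect Name
        1          1      CARE
        2          3      EDUC
        4          5      CARE BY EDUC

Note: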
for saturated models .500 has been added to all observed cells.
This value may be changed by using the CRITERIA = DELTA subcommand.

*** ML converged at iteration 2.
Maximum difference between successive iterations = .00000

Because no Model was defined for this example, the system uses the default saturated model. The expected counts (EXP. count) therefore equal the observed counts (OBS. count), and the residuals (Residual), standardized residuals (Std. Resid.), and adjusted residuals (Adj. Resid.) are all 0.

Observed, Expected Frequencies and Residuals

Factor        Code    OBS. count & PCT.    EXP. count & PCT.    Residual    Std. Resid.    Adj. Resid.
CARE            1
 EDUC           1       65.50 (16.13)        65.50 (16.13)        .0000        .0000          .0000
 EDUC           2      272.50 (67.12)       272.50 (67.12)        .0000        .0000          .0000
 EDUC           3       41.50 (10.22)        41.50 (10.22)        .0000        .0000          .0000
CARE            2
 EDUC           1        6.50 ( 1.60)         6.50 ( 1.60)        .0000        .0000          .0000
 EDUC           2       18.50 ( 4.56)        18.50 ( 4.56)        .0000        .0000          .0000
 EDUC           3        1.50 (  .37)         1.50 (  .37)        .0000        .0000          .0
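
The count and percentage columns of the saturated-model table can be rebuilt with a few lines of arithmetic. The sketch below is written in Python for illustration only and is not part of the SPSS output. It assumes that the raw cell counts are the listed observed counts minus the .500 that the output note says was added to every cell, and that PCT. is each adjusted count as a percentage of the adjusted grand total; `RAW`, `DELTA`, and `saturated_table` are hypothetical names.

```python
# Raw counts inferred from the listing (displayed count minus 0.5),
# keyed by (CARE level, EDUC level). This is an assumption, not source data.
RAW = {
    (1, 1): 65, (1, 2): 272, (1, 3): 41,
    (2, 1): 6, (2, 2): 18, (2, 3): 1,
}
DELTA = 0.5  # constant added to every cell for saturated models


def saturated_table(raw, delta):
    """Return {cell: (observed, percent, expected, residual)}."""
    adjusted = {cell: n + delta for cell, n in raw.items()}
    total = sum(adjusted.values())
    rows = {}
    for cell, obs in adjusted.items():
        exp = obs  # a saturated model reproduces every cell exactly
        rows[cell] = (obs, round(100 * obs / total, 2), exp, obs - exp)
    return rows


TABLE = saturated_table(RAW, DELTA)
for (care, educ), (obs, pct, exp, resid) in sorted(TABLE.items()):
    print(f"CARE {care} EDUC {educ}: {obs:7.2f} ({pct:5.2f}) "
          f"{exp:7.2f} residual {resid:.4f}")
```

The raw counts sum to the 403 weighted cases reported in DATA Information, which supports the assumption about how the .500 was applied.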