Data Analysis and Stata Software Applications (Micro-Lecture Edition): Reference Answers to the Computer Lab Exercises, Chapter 5, Analysis of the Internal Association Structure of Data and Its Stata Implementation (数据分析与Stata软件应用(微课版)-上机实训参考答案第5章 数据内部关联结构分析与Stata实现)
Reference Answers to the Computer Lab Exercises

1. Data on 10 indicators related to water resources carrying capacity were collected for 31 major cities along the middle and lower reaches of the Yellow River in 2018. The variables are:

x1    population density (persons per square kilometer)
x2    urbanization rate (%)
x3    per capita GDP (yuan)
x4    water use by primary industry (100 million cubic meters)
x5    water use by secondary industry (100 million cubic meters)
x6    total water resources (100 million cubic meters)
x7    annual precipitation (100 million cubic meters)
x8    afforestation area (hectares)
x9    sewage treatment rate (%)
x10   green coverage rate of built-up areas (%)

The data are given in Table 5-14.

Table 5-14  Lab exercise data

City           x1     x2      x3     x4    x5     x6      x7      x8      x9    x10  city
Hohhot       8923  69.80  104719   6.46  1.27  10.55   78.65   18000   99.51  40.31  呼和浩特
Baotou       2188  83.57  138168   6.40  2.86   9.60  104.64   30700   95.83  44.50  包头
Ordos        2641  74.49  217107  10.61  2.86  31.15  339.08   60800   99.87  42.39  鄂尔多斯
Wuhai        8386  95.00  103248   0.73  0.82   0.29    3.60     600   98.10  43.00  乌海
Bayannur     4763  54.86   54739  47.97  0.91  10.34  141.55   34000   98.75  36.27  巴彦淖尔
Taiyuan      3730  84.88   88272   1.71  2.87   6.02   35.84   17278   94.71  44.67  太原
Jinzhong     1251  55.37   57819   4.47  1.23  12.64   77.85   24065   97.74  37.21  晋中
Yuncheng     6412  50.20   28229  11.97  1.19   9.81   70.85   11887   95.01  37.21  运城
Xinzhou      1694  50.95   31209   4.57  0.89  22.05  143.79   40889   95.69  38.03  忻州
Linfen      10283  52.54   32066   5.59  0.18   9.12   95.95   34073  100.00  39.36  临汾
Lüliang      5558  50.59   36585   3.28  0.99  18.77  119.38  107540   94.66  40.58  吕梁
Weinan       2066  48.50   21374  11.01  1.17   8.17   55.59   34242   90.51  39.39  渭南
Yan'an       6714  62.31   66593   1.08  0.78  11.50  223.33   68972   92.87  40.76  延安
Yulin        3526  58.94  100267   4.83  2.29  24.32  245.14   49816   94.04  36.24  榆林
Zhengzhou   10937  73.40  101352   4.23  5.27   7.21   42.63    4270   98.05  40.83  郑州
Kaifeng      5321  48.80   43933   9.91  2.26   9.27   35.89    7670   95.71  38.39  开封
Luoyang      7120  57.60   67707   4.89  5.37  18.83  106.86   10550   99.31  40.71  洛阳
Xinxiang     5640  53.40   43696  12.76  2.50  10.55   50.95    8390   93.10  40.10  新乡
Jiaozuo      5721  59.40   66329   8.11  3.33   7.31   23.41    6310   98.89  41.02  焦作
Puyang       3982  45.30   45644   8.16  2.82   5.69   25.59    2380   95.54  40.59  濮阳
Sanmenxia    6770  56.30   67275   1.36  1.38  11.08   63.46   16240   96.70  36.47  三门峡
Jiyuan       3923  62.40   87761   1.14  0.70   2.64   13.91    3940   98.71  41.93  济源
Jinan        2494  72.10  106302   7.57  1.88  19.45   64.42    4902   98.43  40.73  济南
Zibo         2588  71.49  107720   4.64  3.39  17.22   54.46    2837   97.52  45.22  淄博
Dongying      657  69.04  191942   5.63  2.22  18.54   79.84    6047   97.31  41.96  东营
Jining       1785  58.85   58972  15.85  2.47  25.76   85.29   11041   97.47  41.49  济宁
Tai'an       1748  61.87   64714   6.44  1.81  17.48   63.53    5678   97.01  45.05  泰安
Dezhou       1784  57.01   58252  16.56  1.63  15.44   66.92   15311   97.38  42.50  德州
Liaocheng    2070  51.77   51935  14.49  3.03  13.03   56.72   14146   97.00  42.14  聊城
Binzhou      1110  57.64   67405  12.41  3.03  17.11   71.13   18722   97.50  44.15  滨州
Heze         2065  50.25   35184  17.63  1.63  21.68   81.04    7869   97.26  40.53  菏泽

Data source: water resources bulletins and statistical yearbooks of the five provincial-level regions of Inner Mongolia, Shaanxi, Shanxi, Henan, and Shandong.

Carry out the following analyses.

Import the data into Stata and save them as a data file named xiti4.dta.

[Screenshot: Stata Data Editor showing xiti4.dta, with the string variable city and the numeric variables x1 to x10]
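The answer key shows the finished dataset but not how it was created. As a minimal sketch, assuming the table has first been typed into a spreadsheet (the file name `table5_14.xlsx` is hypothetical, as is the assumption that its first row holds the variable names), the import step could look like this:

```stata
* Sketch only: the answer key does not show how the data were entered.
* "table5_14.xlsx" is a hypothetical file name; its first row is assumed
* to hold the variable names city, x1, x2, ..., x10.
import excel using "table5_14.xlsx", firstrow clear

* Check that 31 observations and 11 variables were read.
describe
count

save xiti4.dta, replace
```

Typing the values directly into the Data Editor and then running `save xiti4.dta` would give the same result; the spreadsheet route is only one option.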
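(1) Apply factor analysis to combine the 10 variables into a suitable set of common factors.

. factor x1 x2 x3 x4 x5 x6 x7 x8 x9 x10, pcf
(obs=31)

Factor analysis/correlation                      Number of obs    =        31
    Method: principal-component factors          Retained factors =         4
    Rotation: (unrotated)                        Number of params =        34

    ------------------------------------------------------------------------
         Factor  |   Eigenvalue   Difference        Proportion   Cumulative
    -------------+----------------------------------------------------------
        Factor1  |      2.63868      0.27726            0.2639       0.2639
        Factor2  |      2.36142      0.93089            0.2361       0.5000
        Factor3  |      1.43053      0.21203            0.1431       0.6431
        Factor4  |      1.21850      0.35315            0.1219       0.7649
        Factor5  |      0.86535      0.26902            0.0865       0.8514
        Factor6  |      0.59633      0.12139            0.0596       0.9111
        Factor7  |      0.47494      0.28440            0.0475       0.9586
        Factor8  |      0.19054      0.05709            0.0191       0.9776
        Factor9  |      0.13345      0.04317            0.0133       0.9910
       Factor10  |      0.09027            .            0.0090       1.0000
    ------------------------------------------------------------------------
    LR test: independent vs. saturated:  chi2(45) =  136.79 Prob>chi2 = 0.0000

Factor loadings (pattern matrix) and unique variances

    ----------------------------------------------------------------
    Variable |  Factor1   Factor2   Factor3   Factor4 |  Uniqueness
    ---------+----------------------------------------+-------------
          x1 |   0.0367   -0.4113    0.6390    0.5308 |     0.1394
          x2 |   0.8149    0.1425    0.3422   -0.0123 |     0.1983
          x3 |   0.6893    0.5706    0.1312    0.1184 |     0.1680
          x4 |  -0.3661    0.0572   -0.5843    0.4192 |     0.3456
          x5 |   0.5307    0.0667   -0.3036    0.0462 |     0.6196
          x6 |  -0.1172    0.8379   -0.3275    0.0162 |     0.1767
          x7 |  -0.2441    0.8837    0.2280    0.1758 |     0.0765
          x8 |  -0.4871    0.5871    0.4922   -0.0992 |     0.1659
          x9 |   0.4702    0.0662   -0.1845    0.7288 |     0.2094
         x10 |   0.7422    0.0807   -0.1371   -0.4154 |     0.2513
    ----------------------------------------------------------------

Under the eigenvalue-greater-than-1 rule, four common factors would be retained, but together they account for only 76.49% of the total variance. Five factors are needed to bring the cumulative variance contribution to 85.14%, so five common factors are extracted.

Note that the next command omits the pcf option, so Stata uses its default principal-factor method, as the header line "Method: principal factors" confirms. Under that method some eigenvalues are negative, which is why the cumulative proportion rises above 1 before returning to 1.0000.

. factor x1 x2 x3 x4 x5 x6 x7 x8 x9 x10, factor(5)
(obs=31)

Factor analysis/correlation                      Number of obs    =        31
    Method: principal factors                    Retained factors =         5
    Rotation: (unrotated)                        Number of params =        40

    ------------------------------------------------------------------------
         Factor  |   Eigenvalue   Difference        Proportion   Cumulative
    -------------+----------------------------------------------------------
        Factor1  |      2.27873      0.16149            0.3684       0.3684
        Factor2  |      2.11724      1.07311            0.3423       0.7107
        Factor3  |      1.04413      0.30347            0.1688       0.8795
        Factor4  |      0.74066      0.39928            0.1197       0.9993
        Factor5  |      0.34137      0.23056            0.0552       1.0545
        Factor6  |      0.11081      0.05944            0.0179       1.0724
        Factor7  |      0.05137      0.14134            0.0083       1.0807
        Factor8  |     -0.08997      0.05654           -0.0145       1.0661
        Factor9  |     -0.14651      0.11597           -0.0237       1.0424
       Factor10  |     -0.26248            .           -0.0424       1.0000
    ------------------------------------------------------------------------
    LR test: independent vs. saturated:  chi2(45) =  136.79 Prob>chi2 = 0.0000

Factor loadings (pattern matrix) and unique variances

    --------------------------------------------------------------------------
    Variable |  Factor1   Factor2   Factor3   Factor4   Factor5 |  Uniqueness
    ---------+--------------------------------------------------+-------------
          x1 |   0.0378   -0.3538    0.6591    0.3236    0.1986 |     0.2948
          x2 |   0.7921    0.1570    0.3066   -0.1256   -0.1583 |     0.2131
          x3 |   0.6703    0.5627    0.1221    0.0710   -0.1100 |     0.2020
          x4 |  -0.2918    0.0315   -0.2948    0.3149   -0.2089 |     0.6842
          x5 |   0.4342    0.0599   -0.2056    0.1287    0.4264 |     0.5672
          x6 |  -0.1369    0.7761   -0.3414    0.1181    0.1469 |     0.2268
          x7 |  -0.2604    0.8636    0.2365    0.1101    0.0259 |     0.1177
          x8 |  -0.4713    0.5365    0.4022   -0.2265    0.0118 |     0.2769
          x9 |   0.4100    0.0652   -0.0358    0.5633   -0.1230 |     0.4940
         x10 |   0.6710    0.0780   -0.1897   -0.3237    0.0423 |     0.4012
    --------------------------------------------------------------------------

The output contains the loading matrix for the five common factors. At this stage the substantive meaning of each factor is still unclear, so the loading matrix needs to be rotated to make the factors easier to interpret.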
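The rotated output reports "Rotation: orthogonal varimax (Kaiser off)", which is what Stata's `rotate` postestimation command produces by default after `factor`. The command line itself does not appear in this answer key, so the following is only a sketch of the rotation step under that assumption:

```stata
* Sketch only: assumes xiti4.dta from the import step is available.
use xiti4.dta, clear
quietly factor x1 x2 x3 x4 x5 x6 x7 x8 x9 x10, factors(5)

* Orthogonal varimax rotation without Kaiser normalization (rotate's default).
rotate, varimax

* Same rotation, hiding loadings below 0.5 in absolute value; the 0.5
* cutoff is an arbitrary choice for readability.
rotate, varimax blanks(0.5)
```

Blanking small loadings does not change the solution; it only makes it easier to see which variables load on which factor.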
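The varimax-rotated solution is:

Factor analysis/correlation                      Number of obs    =        31
    Method: principal factors                    Retained factors =         5
    Rotation: orthogonal varimax (Kaiser off)    Number of params =        40

    ------------------------------------------------------------------------
         Factor  |     Variance   Difference        Proportion   Cumulative
    -------------+----------------------------------------------------------
        Factor1  |      2.04744      0.02910            0.3310       0.3310
        Factor2  |      2.01834      0.93856            0.3263       0.6573
        Factor3  |      1.07979      0.30260            0.1746       0.8319
        Factor4  |      0.77719      0.17784            0.1257       0.9576
        Factor5  |      0.59935            .            0.0969       1.0545
    ------------------------------------------------------------------------
    LR test: independent vs. saturated:  chi2(45) =  136.79 Prob>chi2 = 0.0000

Rotated factor loadings (pattern matrix) and unique variances

    --------------------------------------------------------------------------
    Variable |  Factor1   Factor2   Factor3   Factor4   Factor5 |  Uniqueness
    ---------+--------------------------------------------------+-------------
          x1 |  -0.0074   -0.0856    0.8340    0.0474    0.0014 |     0.2948
          x2 |   0.8742   -0.0843    0.1237    0.0115   -0.0073 |     0.2131
          x3 |   0.7992    0.2912   -0.0866    0.2273    0.1235 |     0.2020
          x4 |  -0.3407    0.0608   -0.2048    0.3708   -0.1291 |     0.6842
          x5 |   0.2348   -0.0983   -0.0324    0.1169    0.5943 |     0.5672
          x6 |  -0.0178    0.6653   -0.4669    0.1781    0.2838 |     0.2268
          x7 |   0.0775    0.9330   -0.0486    0.0495   -0.0314 |     0.1177
          x8 |  -0.0975    0.7265    0.0779   -0.3509   -0.2380 |     0.2769
          x9 |   0.2840   -0.0475    0.1135    0.6302    0.1144 |     0.4940
         x10 |   0.6137   -0.2503   -0.2813   -0.1324    0.2507 |     0.4012
    --------------------------------------------------------------------------

After rotation, the factors can be summarized as follows:

Factor 1 mainly reflects x2, x3, and x10 (urbanization rate, per capita GDP, green coverage rate of built-up areas).
Factor 2 mainly reflects x6, x7, and x8 (total water resources, annual precipitation, afforestation area).
Factor 3 mainly reflects x1 (population density).
Factor 4 mainly reflects x4 and x9 (water use by primary industry, sewage treatment rate).
Factor 5 mainly reflects x5 (water use by secondary industry).

. predict f1 f2 f3 f4 f5, regression

Scoring coefficients (method = regression; based on varimax rotated factors)

    ----------------------------------------------------------------
    Variable |   Factor1    Factor2    Factor3    Factor4    Factor5
    ---------+------------------------------------------------------
          x1 |   0.00669    0.05932    0.64538    0.12486    0.37939
          x2 |   0.44717   -0.06417    0.05563   -0.13422   -0.29449
          x3 |   0.44236    0.05234   -0.01732    0.22621    0.10981
          x4 |  -0.05193   -0.03700   -0.08874    0.25632   -0.07132
          x5 |  -0.02447   -0.01741    0.04459   -0.07191    0.36207
          x6 |  -0.03908    0.16458   -0.25924    0.15427    0.53536
          x7 |  -0.06463    0.68721    0.13779    0.11666   -0.11846
          x8 |   0.06518    0.16967    0.06297   -0.42413   -0.26559
          x9 |   0.00831    0.00543    0.06445    0.42115   -0.15558
         x10 |   0.18850   -0.04223   -0.14629   -0.15385    0.27260
    ----------------------------------------------------------------

The regression method is used to compute the scores on the five common factors, which are saved as f1, f2, f3, f4, and f5. These five variables are the basis for the cluster analysis.

(2) Cluster the 31 cities along the middle and lower reaches of the Yellow River on the basis of the factor analysis results.

The cluster analysis starts with hierarchical clustering as a preliminary check; the number of clusters finally requested is decided from the results of that first run.

. cluster singlelinkage f1 f2 f3 f4 f5, name(clus1)
. list

  1.  city 呼和浩特   x1 8923   x2 69.8    x3 104719   x4 6.46   x5 1.27
      x6 10.55       x7 78.65   x8 18000   x9 99.51    x10 40.31
      f1 .6389137    f2 -.0676842
  2.  city 包头       x1 2188   x2 83.57   x3 138168   x4 6.4    x5 2.86
      x6 9.6         x7 104.64  x8 30700   x9 95.83    x10 44.5
      f1 1.747582    f2 -.0225765
  (output omitted)

Besides x1 to x10 and the factor scores, each listed observation ends with the three variables that cluster singlelinkage adds to the dataset: clus1_id, clus1_ord, and clus1_hgt.

. cluster dendrogram clus1, horizontal

[Figure: Dendrogram for clus1 cluster analysis, drawn horizontally; axis: L2 dissimilarity measure, with ticks at 0, 1, 2, 3]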
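By default the leaves of a dendrogram are labeled with observation numbers, which makes it hard to see which city splits off where. As a small sketch, assuming the string variable `city` and the cluster solution `clus1` are still in memory, the leaves can be labeled with city names instead:

```stata
* Sketch only: assumes clus1 from the cluster singlelinkage step is in memory.
* labels(city) puts the city names on the leaves instead of observation numbers.
cluster dendrogram clus1, horizontal labels(city)
```

With 31 leaves the labels may be crowded; adjusting the graph size or label size is a matter of taste and is not shown here.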
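The dendrogram does not point to an obvious number of clusters, so both a three-cluster and a four-cluster solution are tried.

. cluster gen type1=group(3)
. cluster gen type2=group(4)
. list city type1 type2

       city        type1   type2
   1.  呼和浩特      3       4
   2.  包头          3       4
   3.  鄂尔多斯      1       1
   4.  乌海          2       2
   5.  巴彦淖尔      3       3
       (output omitted)
  21.  三门峡        3       4
  22.  济源          3       4
  23.  济南          3       4
  24.  淄博          3       4
  25.  东营          3       4
  26.  济宁          3       4
  27.  泰安          3       4
  28.  德州          3       4
  29.  聊城          3       4
  30.  滨州          3       4
  31.  菏泽          3       4

The results show that Ordos (鄂尔多斯), Wuhai (乌海), and Bayannur (巴彦淖尔) are three relatively atypical cities, each of which can form a cluster of its own, while the other cities belong to a single cluster. Bayannur separates from the main cluster only in the four-cluster solution (type2); in the three-cluster solution (type1) it stays with the other cities.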
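Because the dendrogram alone does not settle the number of clusters, a stopping rule can serve as a cross-check. The sketch below is not part of the answer key: it assumes the factor scores f1 to f5, the solution `clus1`, and `type2` are in memory, and the names `clus2` and `type3` are hypothetical. It also fits Ward's linkage for comparison, since single linkage is known to chain observations together and peel off outliers one at a time.

```stata
* Sketch only. Assumes xiti4.dta is in memory with the factor scores f1-f5,
* the single-linkage solution clus1, and type2 from the steps above.

* Calinski-Harabasz pseudo-F for 2 to 6 groups; larger values suggest
* better-separated groupings.
cluster stop clus1, rule(calinski) groups(2/6)

* Ward's linkage for comparison; clus2 and type3 are hypothetical names.
cluster wardslinkage f1 f2 f3 f4 f5, name(clus2)
cluster stop clus2, rule(calinski) groups(2/6)
cluster generate type3 = groups(4), name(clus2)

* Cross-tabulate the two four-group solutions.
tabulate type2 type3
```

If the two linkage methods disagree sharply, that is a hint that the three single-city clusters reflect outlying observations more than a stable grouping of the remaining cities.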