Using Arellano-Bond Dynamic Panel GMM Estimators in Stata

Tutorial with Examples using Stata 9.0 (xtabond and xtabond2)

Elitza Mileva, Economics Department, Fordham University

July 9, 2007

1. The model

The following model examines the impact of capital flows on investment in a panel dataset of 22 countries for 10 years (1995-2004):

$I_{it} = \beta_1 I_{i,t-1} + \beta_2 K_{it} + \beta_3 X_{it} + u_{it}$.   (1)

In equation (1) above, $I_{it}$ is gross fixed capital formation as a percentage of GDP and $I_{i,t-1}$ is its lagged value. $K_{it}$ is a matrix of the components of foreign resource flows, namely FDI, loans and portfolio (equity and bonds), as percentage shares of GDP. $X_{it}$ is a matrix of the following control variables: lagged real GDP growth to account for the accelerator effect; the absolute value of one-step-ahead growth forecast errors as a measure of uncertainty; the change in the log terms of trade to gauge the price of imported capital goods; and, finally, the deviation
of M2 from its three-year trend as a proxy for the liquidity available to finance investment.

2. Why the Arellano-Bond GMM estimator?

Several econometric problems may arise from estimating equation (1):

1. The capital flows variables in $K_{it}$ are assumed to be endogenous. Because causality may run in both directions (from capital inflows to investment and vice versa), these regressors may be correlated with the error term.

2. Time-invariant country characteristics (fixed effects), such as geography and demographics, may be correlated with the explanatory variables. The fixed effects are contained in the error term in equation (1), which consists of the unobserved country-specific effects, $v_i$, and the observation-specific errors, $e_{it}$:

$u_{it} = v_i + e_{it}$.   (2)

3. The presence of the lagged dependent variable $I_{i,t-1}$ gives rise to autocorrelation.

4. The panel dataset has a short time dimension (T = 10) and a larger country dimension (N = 22).

To solve problem 1 (and problem 2) one would usually use fixed-effects instrumental variables estimation (two-stage least squares or 2SLS), which is what I tried first. The exogenous instruments I used were the following: the aggregate long-term capital inflows to the countries in our sample as a
group as a percentage of the sum of their cumulative GDP (I labelled these regional flows), an index of financial openness and the EBRD transition index. However, the first-stage statistics of the 2SLS regressions showed that my instruments were weak. With weak instruments the fixed-effects IV estimators are likely to be biased in the direction of the OLS estimators.

Therefore, I decided to use the Arellano-Bond (1991) difference GMM estimator, first proposed by Holtz-Eakin, Newey and Rosen (1988). Instead of using only the exogenous instruments listed above, lagged levels of the endogenous regressors in $K_{it}$ (FDI, loans and portfolio) are also added. This makes the endogenous variables pre-determined and, therefore, not correlated with the error term in equation (1).

To cope with problem 2 (fixed effects), the difference GMM uses first differences to transform equation (1) into

$\Delta I_{it} = \beta_1 \Delta I_{i,t-1} + \beta_2 \Delta K_{it} + \beta_3 \Delta X_{it} + \Delta u_{it}$.   (3)

(In general form the transformation is given by $\Delta y_{it} = \alpha \Delta y_{i,t-1} + \Delta x_{it}' \beta + \Delta u_{it}$.)

By transforming the regressors by first differencing, the fixed country-specific effect is removed, because it does not vary with time. From equation (2) we get

$\Delta u_{it} = \Delta v_i + \Delta e_{it}$, or $u_{it} - u_{i,t-1} = (v_i - v_i) + (e_{it} - e_{i,t-1}) = e_{it} - e_{i,t-1}$.

The first-differenced lagged dependent
variable (problem 3) is also instrumented with its past levels.

Finally, the Arellano-Bond estimator was designed for small-T large-N panels (problem 4). In large-T panels a shock to the country's fixed effect, which shows in the error term, will decline with time. Similarly, the correlation of the lagged dependent variable with the error term will be insignificant (see Roodman, 2006). In these cases, one does not necessarily have to use the Arellano-Bond estimator.

3. Using the Arellano-Bond difference GMM estimator in Stata

3.1 Import data into Stata

The easiest way to get panel data into Stata is to organize your Excel spreadsheet in the following way:

    ctry  ctry_dum  year     inv   growth  uncert     tot   dev_m2  fin_integr  trans_index    fdi   loans  portfolio  flows_eeca
    ALB          1  1995  18.000    8.900   8.444   0.215        .       3.000        2.333  0.861  -0.005      0.000       1.121
    ALB          1  1996  21.044    9.100   6.614  -0.112        .       3.000        2.519  0.994   0.050      0.000       1.198
    ALB          1  1997  16.829  -10.200  12.247   0.057    9.447       3.000        2.519  0.580  -0.013      0.000       1.783
    ALB          1  1998  16.296   12.700  16.874   0.019    3.281       3.024        2.519  0.480  -0.019      0.000       2.365
    ALB          1  1999  20.005   10.100   6.783   0.071    1.444       3.024        2.557  0.389  -0.035      0.000       1.826
    ALB          1  2000  24.736    7.300   3.750  -0.006    2.572       3.024        2.778  1.245  -0.009      0.000       1.488
    ALB          1  2001  29.215    7.200   4.023  -0.018    2.654       3.024        2.814  1.648  -0.031      0.000       1.263
    ALB          1  2002  26.156    3.400   0.045  -0.010    2.937       3.024        2.814  1.002   0.005      0.000       1.718
    ALB          1  2003  25.013    6.002   3.716   0.011   -0.455       3.024        2.814  1.225  -0.019      0.000       1.894
    ALB          1  2004  23.686    5.900   2.543   0.040   -1.298       3.024        2.889  2.701   0.188      0.000       3.288
    ARM          2  1995  16.154    6.900  17.601  -0.103  -10.808       2.000        2.112  0.394   0.000      0.000       1.121
    ARM          2  1996  17.885    5.865   7.872   0.302    0.261       3.000        2.444  0.309   0.000      0.033       1.198

Note that all observations (i.e. country 1 period 1; country 1 period 2; etc.) are stacked vertically and the variables are listed horizontally. Save the Excel worksheet as a text file (.txt, .csv, etc.). Open Stata and import the data by choosing File, Import, ASCII data created by spreadsheet, and click on the Browse button. Alternatively, you can type the following command in the command window, if your text file is located on the C drive:

    insheet using C:ABExampleData.txt
    (14 vars, 220 obs)

(Note that in the original document, Stata commands and their components are shown in blue.)

3.2 Set the dataset as a panel

Next, declare your dataset as a panel by selecting Statistics, Longitudinal/Panel data, Setup & Utilities, Declare dataset to be cross-sectional time series. Choose a variable that identifies the time dimension (year, in this example) and a variable that identifies the panel ID (ctry_dum, in this example). Stata needs a numerical variable for the panel ID, so the variable ctry, which is a string variable, won't work. Alternatively, you can type the following command:

    tsset ctry_dum year
           panel variable:  ctry_dum (strongly balanced)
            time variable:  year, 1995 to 2004

3.3 Stata command: xtabond

Two Arellano-Bond
estimators are available for Stata 9.0: one incorporated into Stata 9 (called xtabond) and one user-written program by Roodman (2006) (called xtabond2). The former is discussed first. (Stata 10.0 will have two Arellano-Bond estimators built in, including its version of the system estimator.)

Click on Statistics, Longitudinal/Panel data, Dynamic panel data, Arellano-Bond regression (RE). Stata displays a window in which you can easily select the dependent variable, the endogenous and exogenous independent variables, as well as the lags of the instruments.

3.4 Stata command: xtabond2

Although the above-mentioned Stata menu option is easier to use, I have found Roodman's user-written program (xtabond2) better: it is more flexible and has a better help file and "how to do xtabond2" paper (see the references). xtabond2 can do everything that xtabond does and has many additional features. See the Stata help file or the paper
for a description of the improvements offered by Roodman's program. The disadvantage of xtabond2 is that you actually have to type the command; there is no menu for it.

Since xtabond2 is not an official command of Stata 9, it has to be downloaded from the Internet (http://ideas.repec.org/c/boc/bocode/s435901.html) or installed by typing the following command:

    ssc install xtabond2

If you have to download all xtabond2-related files from the repec website, make sure you save each file in the appropriate ado folder in your Stata folder, that is, in the folder named after the first letter of the file name as it is listed on
the website. (xtabond2 may be directly available with Stata 10, or it may include a different system routine.)

The following command shows you the help file:

    help xtabond2

Below is the command I used to estimate equation (1), followed by the Stata output:

    xtabond2 inv l.inv fdi loans portfolio l.growth uncert tot dev_m2, gmm(inv fdi loans portfolio, lag(2 2)) iv(fin_integr trans_index flows_eeca l.growth uncert tot dev_m2) nolevel small

    Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
    Warning: Number of instruments may be large relative to number of observations.
    Suggested rule of thumb: keep number of instruments <= number of groups.

    [...] Prob > F = 0.000 [...] max = 8
    ------------------------------------------------------------------------------
                 |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             inv |
             L1. |   .2922856    .111738     2.62   0.010    .0715819    .5129893
             fdi |   .5202847   .2094545     2.48   0.014    .1065725     .933997
           loans |   .2789421   .1638248     1.70   0.091    -.044643    .6025271
       portfolio |  -.0086876   .3376843    -0.03   0.980   -.6756779    .6583028
          growth |
             L1. |   .1167961   .0555715     2.10   0.037    .0070319    .2265604
          uncert |   .0397982   .0673439     0.59   0.555   -.0932187     .172815
             tot |   .9193659   1.916147     0.48   0.632   -2.865388    4.704119
          dev_m2 |   .0443079   .0760188     0.58   0.561   -.1058435    .1944594
    ------------------------------------------------------------------------------
    Sargan test of overid. restrictions: chi2(31) = 36.42    Prob > chi2 = 0.231
    Arellano-Bond test for AR(1) in first differences: z = -0.01    Pr > z = 0.992
    Arellano-Bond test for AR(2) in first differences: z = -0.48    Pr > z = 0.628

As you can see, the command xtabond2 is followed by the dependent variable (inv) and the list of all right-hand-side variables:

    xtabond2 inv l.inv fdi loans portfolio l.growth uncert tot dev_m2

The lag operator is given by l. as in l.inv, or l2.inv for 2 lags of inv.

After the comma two lists of variables are given. gmm() (or gmmstyle()) lists the endogenous variables, which are instrumented with GMM-style instruments, i.e. lagged values of the variables in levels:

    gmm(inv fdi loans portfolio, lag(2 2))

With lag(2 2) I have instructed Stata to use only the second lag of the endogenous variables as instruments. Due to the small number of countries in my sample, a large number of instruments causes the Sargan test (explained below) to be weak. The rule of thumb is to keep the
number of instruments less than or equal to the number of groups. Stata warns you about that at the top of the output table. The second lag is required because it is not correlated with the current error term, while the first lag is. Generally, one can experiment with the second or deeper lags to find a good instrument, but using deeper lags reduces sample size. If the number of countries is large enough, one may use all available lags (second and deeper lags) as instruments.

The second list of explanatory variables, iv() (or ivstyle()), lists all strictly exogenous variables (l.growth, uncert, tot, dev_m2) as well as the additional instrumental variables (fin_integr, trans_index, flows_eeca), which are not part of equation (1) and, therefore, are not listed before the comma in the Stata command. What this option essentially does for the included exogenous variables is tell Stata to use the variables themselves as their own instruments.

    iv(fin_integr trans_index flows_eeca l.growth uncert tot dev_m2)

Growth is lagged in this case because of economic theory and not because it is required by the regression. Another advantage of xtabond2 is that it actually allows you to use lag operators in the instruments matrix, while xtabond does not.

nolevel (or noleveleq) tells Stata to apply the difference GMM estimator. By default xtabond2 will apply the system GMM if you don't specify nolevel. (System GMM is discussed next.)

small tells Stata to use the small-sample adjustment and report t- instead of z-statistics and the F test instead of the Wald chi-squared test.

Stata offers additional options not shown in the example above:

twostep specifies that the two-step estimator is calculated instead of the default one-step. In two-step estimation, the standard covariance matrix is robust to panel-specific autocorrelation and heteroskedasticity, but the standard errors are downward biased. Use twostep robust to get the finite-sample corrected two-step covariance matrix.

robust specifies that the resulting standard errors are consistent with panel-specific autocorrelation and heteroskedasticity in one-step estimation.

By default Stata also reports three additional tests: the Sargan test and the AR(1) and AR(2) tests. The Sargan test has a null hypothesis of "the instruments as a group are exogenous". Therefore, the higher the p-value of the Sargan statistic the better. In robust estimation Stata reports the Hansen J statistic instead of the Sargan
with the same null hypothesis.

The Arellano-Bond test for autocorrelation has a null hypothesis of no autocorrelation and is applied to the differenced residuals. The test for an AR(1) process in first differences usually rejects the null hypothesis (though not in my example), but this is expected since $\Delta e_{it} = e_{it} - e_{i,t-1}$ and $\Delta e_{i,t-1} = e_{i,t-1} - e_{i,t-2}$ both have $e_{i,t-1}$. The test for AR(2) in first differences is more important, because it will detect autocorrelation in levels.

Before closing Stata you can save the data file in .dta format, which is the Stata data format. Choose File, Save As or type:

    save "C:ABExample.dta"

When
you open that file next time, all settings, such as the panel-data setting, and any new variables you have created will have been preserved.

4. Using the Arellano-Bond system GMM estimator in Stata

Sometimes the lagged levels of the regressors are poor instruments for the first-differenced regressors. In this case, one should use the augmented version, "system GMM". The system GMM estimator uses the levels equation (e.g. equation (1) in this example) to obtain a system of two equations: one differenced and one in levels. By adding the second equation, additional instruments can be obtained. Thus the variables in levels in
the second equation are instrumented with their own first differences. This usually increases efficiency.

Below is the command and Stata output for the Arellano-Bond system GMM estimator. Note that nolevel is no longer included after the comma in the command, so Stata defaults to the system GMM. Including the equation in levels does not difference out the constant; therefore, if the model does not call for a constant, type noconst after the comma in the command.

    xtabond2 inv l.inv fdi loans portfolio l.growth uncert tot dev_m2, gmm(inv fdi loans portfolio, lag(3 3)) iv(fin_integr trans_index flows_eeca l.growth uncert tot dev_m2) small noconst

    Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
    Warning: Number of instruments may be large relative to number of observations.
    Suggested rule of thumb: keep number of instruments <= number of groups.

    [...] Prob > F = 0.000 [...] max = 9
    ------------------------------------------------------------------------------
                 |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             inv |
             L1. |    .898973    .025188    35.69   0.000    .8492693    .9486767
             fdi |   .9096135   .2207568     4.12   0.000    .4739929    1.345234
           loans |   .1813443   .1994594     0.91   0.364   -.2122499    .5749386
       portfolio |   -.697416   .4072526    -1.71   0.089    -1.50105    .1062178
          growth |
             L1. |   .1028205   .0514219     2.00   0.047    .0013493    .2042917
          uncert |   .1431728   .0564829     2.53   0.012    .0317148    .2546307
             tot |  -2.131275   2.446291    -0.87   0.385   -6.958554    2.696004
          dev_m2 |   .0076649    .081528     0.09   0.925   -.1532147    .1685446
    ------------------------------------------------------------------------------
    Sargan test of overid. restrictions: chi2(55) = 43.77    Prob > chi2 = 0.862
    Arellano-Bond test for AR(1) in first differences: z = -1.88    Pr > z = 0.061
    Arellano-Bond test for AR(2) in first differences: z = -0.86    Pr > z = 0.391

As the output table above shows, using system GMM increased efficiency. There are, however, two important points to be made about using system GMM. First, because system GMM uses more instruments than the difference GMM, it may not be appropriate to use system