© 2012 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.

Chapter 9: More on Specification and Data Issues
Wooldridge: Introductory Econometrics: A Modern Approach, 5e

Multiple Regression Analysis: Specification and Data Issues

Tests for functional form misspecification

One can always test whether explanatory variables should appear as squares or higher-order terms by testing whether such terms can be excluded.
Otherwise, one can use general specification tests such as RESET.

Regression specification error test (RESET)

The idea of RESET is to include squares and possibly higher powers of the fitted values in the regression (similar to the reduced White test).
Test for the exclusion of these terms. If they cannot be excluded, this is evidence of omitted higher-order terms and interactions, i.e. of misspecification of functional form.

Example: Housing price equation

[Estimated equations not preserved in the source. Slide annotations: "Evidence for misspecification" for one specification and "Less evidence for misspecification" for the other.]

Discussion

One may also include higher powers of the fitted values, which implies complicated interactions and higher-order terms of all explanatory variables.
RESET provides little guidance as to where misspecification comes from.
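The RESET procedure described above can be sketched in a few lines. This is a minimal illustration on simulated data, not the housing price example from the slides: the data-generating processes, the variable names, and the helper `reset_f_stat` are assumptions made here. The statistic is the usual F statistic for excluding the squared and cubed fitted values.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and the residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)

def reset_f_stat(X, y):
    """F statistic for adding yhat^2 and yhat^3 to a regression of y on X.

    X must already contain a constant column (hypothetical helper).
    """
    n, k = X.shape
    beta, ssr_restricted = ols(X, y)
    yhat = X @ beta
    X_aug = np.column_stack([X, yhat**2, yhat**3])
    _, ssr_unrestricted = ols(X_aug, y)
    q = 2                      # number of added terms
    df = n - k - q             # residual degrees of freedom, augmented model
    return ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / df)

# Simulated data (assumed for illustration only)
rng = np.random.default_rng(0)
n = 500
x1 = rng.uniform(1, 5, n)
x2 = rng.uniform(1, 5, n)
u = rng.normal(0, 1, n)
X = np.column_stack([np.ones(n), x1, x2])

y_linear = 1 + 2 * x1 - x2 + u                     # linear model is correct
y_quadratic = 1 + 2 * x1 - x2 + 1.5 * x1**2 + u    # x1^2 is wrongly omitted

f_linear = reset_f_stat(X, y_linear)
f_quadratic = reset_f_stat(X, y_quadratic)
print("RESET F, correctly specified model:", f_linear)
print("RESET F, omitted quadratic term:   ", f_quadratic)
```

With 2 numerator degrees of freedom and a large sample, the 5% critical value is roughly 3.0, so the second statistic should reject clearly while the first usually should not. As the slides note, a rejection does not say which term is missing.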

Testing against nonnested alternatives

Which specification is more appropriate?
Model 1: [equation not preserved in the source]
Model 2: [equation not preserved in the source]
Define a general model that contains both models as subcases and test: [equation not preserved in the source]

Discussion

This can always be done; however, a clear winner need not emerge.
It cannot be used if the models differ in their definition of the dependent variable.
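A sketch of the comprehensive-model approach follows. Because the slide equations did not survive, the two competing models used here (regressors in levels versus in logs) are an assumption chosen for illustration, as are the simulated data and the helper names. Each model is tested by checking whether the other model's regressors can be excluded from the general model.

```python
import numpy as np

def ssr(X, y):
    """Residual sum of squares from an OLS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def exclusion_f(X_restricted, X_full, y):
    """F statistic for excluding the columns in X_full that X_restricted lacks."""
    n = len(y)
    q = X_full.shape[1] - X_restricted.shape[1]
    ssr_r, ssr_ur = ssr(X_restricted, y), ssr(X_full, y)
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - X_full.shape[1]))

# Simulated data in which the log specification is the true one (assumption)
rng = np.random.default_rng(1)
n = 400
x1 = rng.uniform(1, 20, n)
x2 = rng.uniform(1, 20, n)
y = 1 + 2 * np.log(x1) + 3 * np.log(x2) + rng.normal(0, 0.5, n)

const = np.ones(n)
X_levels = np.column_stack([const, x1, x2])                    # "Model 1"
X_logs = np.column_stack([const, np.log(x1), np.log(x2)])      # "Model 2"
X_general = np.column_stack([const, x1, x2, np.log(x1), np.log(x2)])

f_model1 = exclusion_f(X_levels, X_general, y)  # H0: logs can be dropped
f_model2 = exclusion_f(X_logs, X_general, y)    # H0: levels can be dropped
print("F for H0 'Model 1 is adequate':", f_model1)
print("F for H0 'Model 2 is adequate':", f_model2)
```

In this simulation the first hypothesis should be rejected strongly and the second usually not. With real data both or neither may be rejected, which is the "no clear winner" case mentioned in the discussion.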

Using proxy variables for unobserved explanatory variables

Example: Omitted ability in a wage equation

[Wage equation not preserved in the source.]
In general, the estimates of the returns to education and experience will be biased because one has to omit the unobservable ability variable.
Idea: find a proxy variable for ability that controls for ability differences between individuals, so that the coefficients of the other variables will not be biased. A possible proxy for ability is the IQ score or similar test scores.

General approach to using proxy variables

[Equations not preserved in the source. Slide labels: "Omitted variable, e.g. ability"; "Replace by proxy"; "Regression of the omitted variable on its proxy".]

Assumptions necessary for the proxy variable method to work

Assumption 1: The proxy is "just a proxy" for the omitted variable; it does not belong in the population regression, i.e. it is uncorrelated with the population regression error. If the error and the proxy were correlated, the proxy would actually have to be included in the population regression function.
Assumption 2: The proxy variable is a "good" proxy for the omitted variable, i.e. using other variables in addition will not help to predict the omitted variable. Otherwise x1 and x2 would have to be included in the regression for the omitted variable.
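The role of the two assumptions becomes clearer when the substitution is written out. The notation below (the omitted variable x3*, its proxy x3, the coefficients δ0 and δ3, and the error v3) is assumed here, since the slide equations were not preserved; it follows the standard textbook setup with two observed regressors.

```latex
\begin{align*}
y       &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3^{*} + u
           && \text{(population regression)} \\
x_3^{*} &= \delta_0 + \delta_3 x_3 + v_3
           && \text{(omitted variable on its proxy)} \\
y       &= (\beta_0 + \beta_3\delta_0) + \beta_1 x_1 + \beta_2 x_2
           + \beta_3\delta_3\, x_3 + (u + \beta_3 v_3)
           && \text{(after substitution)}
\end{align*}
```

The first assumption makes u uncorrelated with x1, x2 and x3; the second makes v3 uncorrelated with x1, x2 and x3. Together they imply that the composite error u + β3·v3 is uncorrelated with every regressor in the last line, and that the coefficient on the proxy is β3·δ3.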

Under these assumptions, the proxy variable method works:

[Equation not preserved in the source.]
In this regression model, the error term is uncorrelated with all explanatory variables. As a consequence, all coefficients will be correctly estimated using OLS. The coefficients of the explanatory variables x1 and x2 will be correctly identified. The coefficient of the proxy variable may also be of interest (it is a multiple of the coefficient of the omitted variable).

Discussion of the proxy assumptions in the wage example

Assumption 1: Should be fulfilled, as the IQ score is not a direct wage determinant; what matters is how able the person proves to be at work.
Assumption 2: Most of the variation in ability should be explainable by variation in the IQ score, leaving only a small remainder to educ and exper.

[Regression table not preserved in the source.]
As expected, the measured return to education decreases if IQ is included as a proxy for unobserved ability.
The coefficient of the proxy suggests that ability differences between individuals are important (e.g. +15 points in IQ score are associated with a wage increase of about 5.4 percent).
Even if the IQ score imperfectly soaks up the variation caused by ability, including it will at least reduce the bias in the measured return to education.
There is no significant interaction effect between ability and education.
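The claim that an imperfect proxy still reduces the bias can be checked with a small simulation. Everything below is an assumed setup for illustration (the coefficients, the variable names `abil`, `educ`, `iq`, and the noise levels are invented); it does not reproduce the wage data behind the slides.

```python
import numpy as np

def ols(X, y):
    """OLS coefficient vector for a regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(2)
n = 20_000
true_return = 0.06                                  # assumed return to education

abil = rng.normal(0, 1, n)                          # unobserved ability
educ = 12 + 2 * abil + rng.normal(0, 2, n)          # education depends on ability
iq = 100 + 15 * abil + rng.normal(0, 5, n)          # noisy proxy for ability
log_wage = 1.0 + true_return * educ + 0.10 * abil + rng.normal(0, 0.3, n)

const = np.ones(n)
b_omit = ols(np.column_stack([const, educ]), log_wage)[1]         # ability omitted
b_proxy = ols(np.column_stack([const, educ, iq]), log_wage)[1]    # IQ as proxy
b_full = ols(np.column_stack([const, educ, abil]), log_wage)[1]   # infeasible benchmark

print("ability omitted:   ", b_omit)
print("IQ used as proxy:  ", b_proxy)
print("ability observed:  ", b_full)
```

In this setup the estimate with ability omitted overstates the return to education, while adding the proxy moves it most of the way back towards the benchmark. Because the simulated IQ measures ability with noise, some bias remains.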

Using lagged dependent variables as proxy variables

In many cases, omitted unobserved factors may be proxied by the value of the dependent variable from an earlier time period.

Example: City crime rates

[Equation not preserved in the source.]
Including the past crime rate will at least partly control for the many omitted factors that also determine the crime rate in a given year.
Another way to interpret this equation is that one compares cities that had the same crime rate last year; this avoids comparing cities that differ very much in unobserved crime factors.
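A simulated version of the crime example illustrates the idea. The setup is hypothetical: `city` stands for the unobserved city-level factors, `expend` for a policy variable such as law enforcement spending, and all coefficients are invented for the sketch.

```python
import numpy as np

def ols(X, y):
    """OLS coefficient vector for a regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(3)
n = 20_000
true_effect = -0.5                                   # assumed effect of spending

city = rng.normal(0, 1, n)                           # unobserved crime factors
expend = city + rng.normal(0, 1, n)                  # high-crime cities spend more
crime_lag = 5 + 2 * city + rng.normal(0, 0.5, n)     # last year's crime rate
crime = 5 + true_effect * expend + 2 * city + rng.normal(0, 0.5, n)

const = np.ones(n)
b_without_lag = ols(np.column_stack([const, expend]), crime)[1]
b_with_lag = ols(np.column_stack([const, expend, crime_lag]), crime)[1]

print("without lagged crime:", b_without_lag)
print("with lagged crime:   ", b_with_lag)
```

Without the lag, spending appears to raise crime because it picks up the unobserved city factors. Conditioning on last year's crime rate compares similar cities and brings the estimate much closer to the assumed effect, although not all the way.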

Models with random slopes (= random coefficient models)

[Equations not preserved in the source. Slide labels: "Average intercept" plus "Random component"; "Average slope" plus "Random component"; "Error term".]
The model has a random intercept and a random slope.
Assumptions: The individual random components are independent of the explanatory variable.
WLS or OLS with robust standard errors will consistently estimate the average intercept and average slope in the population.
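The following sketch simulates a random coefficient model and estimates it by OLS with heteroskedasticity-robust (White, HC0) standard errors. The parameter values and names (`alpha`, `beta`, `c`, `d`) are assumptions for illustration. Since the composite error is c + d·x, its variance depends on x, which is why the robust standard errors are the relevant ones.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000
alpha, beta = 1.0, 2.0              # average intercept and slope (assumed)

x = rng.uniform(0, 4, n)
c = rng.normal(0, 1, n)             # individual deviation from average intercept
d = rng.normal(0, 1, n)             # individual deviation from average slope
y = (alpha + c) + (beta + d) * x    # each unit has its own intercept and slope

X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
coef = XtX_inv @ X.T @ y
resid = y - X @ coef

# Usual standard errors (valid only under homoskedasticity)
sigma2 = resid @ resid / (n - 2)
se_usual = np.sqrt(np.diag(sigma2 * XtX_inv))

# White (HC0) heteroskedasticity-robust standard errors
meat = X.T @ (X * resid[:, None] ** 2)
se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

print("estimated intercept, slope:", coef)
print("usual standard errors:     ", se_usual)
print("robust standard errors:    ", se_robust)
```

The point estimates should be close to the average intercept and slope, while the usual and robust standard errors differ because of the built-in heteroskedasticity.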

Properties of OLS under measurement error

Measurement error in the dependent variable

Mismeasured value = True value + Measurement error
[Equations not preserved in the source. Slide labels: "Population regression"; "Estimated regression".]

Consequences of measurement error in the dependent variable

Estimates will be less precise because the error variance is higher.
Otherwise, OLS will be unbiased and consistent (as long as the measurement error is unrelated to the values of the explanatory variables).
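A short simulation makes both consequences visible. The model, sample size, and error variances below are assumed for illustration; `slope_and_se` is a hypothetical helper for a simple regression with the usual standard error.

```python
import numpy as np

def slope_and_se(x, y):
    """Slope estimate and its usual standard error in a simple regression."""
    X = np.column_stack([np.ones(len(x)), x])
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    s2 = resid @ resid / (len(y) - 2)
    return b[1], np.sqrt(s2 * XtX_inv[1, 1])

rng = np.random.default_rng(5)
n = 2_000
x = rng.normal(0, 1, n)
y_true = 1 + 2 * x + rng.normal(0, 1, n)     # correctly measured outcome
y_obs = y_true + rng.normal(0, 2, n)         # measurement error unrelated to x

b_true, se_true = slope_and_se(x, y_true)
b_obs, se_obs = slope_and_se(x, y_obs)

print("true y:          slope", b_true, "se", se_true)
print("mismeasured y:   slope", b_obs, "se", se_obs)
```

Both slopes are centred on the assumed value of 2, but the standard error is larger when the dependent variable is mismeasured, because the measurement error is absorbed into the regression error.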

Measurement error in an explanatory variable

Mismeasured value = True value + Measurement error
[Equations not preserved in the source. Slide labels: "Population regression"; "Estimated regression".]
Classical errors-in-variables assumption: the measurement error is unrelated to the true value.
The mismeasured variable x1 is correlated with the error term!
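The classical errors-in-variables setup can be simulated directly. The sketch below uses a simple regression with assumed variances, where the well-known large-sample result is that the OLS slope converges to the true slope times var(true x) / (var(true x) + var(measurement error)). All names and numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100_000
beta1 = 2.0
var_x, var_e = 1.0, 1.0                            # assumed variances

x_true = rng.normal(0, np.sqrt(var_x), n)          # true regressor
e1 = rng.normal(0, np.sqrt(var_e), n)              # error unrelated to true value
x_obs = x_true + e1                                # mismeasured regressor
y = 1 + beta1 * x_true + rng.normal(0, 1, n)

# OLS slope from regressing y on the mismeasured regressor
slope_obs = np.cov(x_obs, y)[0, 1] / np.var(x_obs, ddof=1)

# Large-sample value implied by the classical errors-in-variables result
slope_predicted = beta1 * var_x / (var_x + var_e)

print("OLS slope with mismeasured x:", slope_obs)
print("predicted probability limit: ", slope_predicted)
```

With equal variances for the true regressor and the measurement error, the estimated slope is about half the true slope: it is pulled towards zero, not pushed in an arbitrary direction.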

Consequences of measurement error in an explanatory variable

Under the classical errors-in-variables assumption, OLS is biased and inconsistent because the mismeasured variable is endogenous.
One can show that the inconsistency is of the following form:
[Equation not preserved in the source.]
This factor (which involves the error variance of a regression of the true value of x1 on the other explanatory variables) will always be between zero and one.
The effect of the mismeasured variable suffers from attenuation bias, i.e. the magnitude of the effect will be attenuated towards zero.
In addition, the effects of the other explanatory variables will be biased.

Missing data and nonrandom samples

Missing data as sample selection

Missing data is a special case of sample selection (= nonrandom sampling), as the observations with missing information cannot be used.
If the sample selection is based on independent variables, there is no problem, as a regression conditions on the independent variables.
In general, sample selection is no problem if it is uncorrelated with the error term of a regression (= exogenous sample selection).
Sample selection is a problem if it is based on the dependent variable or on the error term (= endogenous sample selection).

Example for exogenous sample selection

[Equation not preserved in the source.]
If the sample is nonrandom in that certain age groups, income groups, or household sizes were over- or undersampled, this is not a problem for the regression because it examines savings for subgroups defined by income, age, and household size. The distribution of subgroups does not matter.

Example for endogenous sample selection

[Equation not preserved in the source.]
If the sample is nonrandom in that individuals refuse to take part in the sample survey if their wealth is particularly high or low, this will bias the regression results because these individuals may be systematically different from those who do not refuse to take part in the sample survey.
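The difference between the two kinds of selection can be illustrated with simulated data. The savings equation below has a single regressor and invented coefficients, and the selection rules are assumptions chosen to make the contrast clear: one keeps observations based on the regressor, the other based on the dependent variable.

```python
import numpy as np

def slope(x, y):
    """OLS slope in a simple regression of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

rng = np.random.default_rng(7)
n = 100_000
true_slope = 0.1

income = rng.normal(50, 10, n)
saving = 2 + true_slope * income + rng.normal(0, 2, n)

keep_exog = income > 55     # selection on the explanatory variable
keep_endog = saving < 8     # selection on the dependent variable

b_full = slope(income, saving)
b_exog = slope(income[keep_exog], saving[keep_exog])
b_endog = slope(income[keep_endog], saving[keep_endog])

print("full sample:                  ", b_full)
print("selected on income (exog.):   ", b_exog)
print("selected on saving (endog.):  ", b_endog)
```

Selecting on income leaves the slope estimate essentially unchanged, only less precise. Selecting on saving systematically removes observations with large positive errors at high incomes, so the estimated slope is clearly too small.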

Outliers and influential observations

Extreme values and outliers may be a particular problem for OLS because the method is based on squaring deviations.
If outliers are the result of mistakes that occurred when keying in the data, one should just discard the affected observations.
If outliers are the result of the data-generating process, the decision whether to discard the outliers is not so easy.

Example: R&D intensity and firm size

Example: R&D intensity and firm size (cont.)

[Regression results not preserved in the source.]
The outlier is not the result of a mistake: one of the sampled firms is much larger than the others.
The regression without the outlier makes more sense.

Least absolute deviations estimation (LAD)

The least absolute deviations estimator minimizes the sum of absolute deviations (instead of the sum of squared deviations, i.e. OLS).
It may be more robust to outliers as deviations are not squared.
The least absolute deviations estimator estimates the parameters of the conditional median (instead of the conditional mean with OLS).
The least absolute deviations estimator is a special case of quantile regression, which estimates parameters of conditional quantiles.
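To make the contrast with OLS concrete, the sketch below approximates the LAD fit by iteratively reweighted least squares. This is one simple numerical approach chosen here for illustration, not necessarily how statistical packages compute LAD (they typically use linear programming or dedicated quantile regression routines). The data, the outlier pattern, and the function names are assumptions.

```python
import numpy as np

def ols(X, y):
    """OLS coefficient vector for a regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def lad_irls(X, y, n_iter=200, eps=1e-6):
    """Approximate LAD fit via iteratively reweighted least squares.

    Each step solves a weighted least squares problem with weights
    1/|residual|, so that squared deviations mimic absolute deviations.
    """
    beta = ols(X, y)                               # start from the OLS fit
    for _ in range(n_iter):
        weights = 1.0 / np.maximum(np.abs(y - X @ beta), eps)
        sw = np.sqrt(weights)
        beta = ols(X * sw[:, None], y * sw)
    return beta

def sum_abs_dev(X, y, beta):
    """Sum of absolute deviations, the LAD objective."""
    return float(np.sum(np.abs(y - X @ beta)))

# Simulated data with a few gross outliers in y (assumed for illustration)
rng = np.random.default_rng(8)
n = 200
x = rng.uniform(0, 10, n)
y = 1 + 2 * x + rng.normal(0, 1, n)
y[:5] += 100                                       # five contaminated observations

X = np.column_stack([np.ones(n), x])
b_ols = ols(X, y)
b_lad = lad_irls(X, y)

print("OLS intercept, slope:", b_ols)
print("LAD intercept, slope:", b_lad)
print("sum of absolute deviations, OLS vs LAD:",
      sum_abs_dev(X, y, b_ols), sum_abs_dev(X, y, b_lad))
```

The LAD estimates should stay close to the values used to generate the bulk of the data, while the OLS fit is pulled towards the contaminated observations. Note that these outliers are in the dependent variable only; LAD offers much less protection against extreme values of the regressors.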