Scandinavian Journal of Statistics, Vol. 37: 382–402, 2010
doi: 10.1111/j.1467-9469.2010.00689.x
© 2010 Board of the Foundation of the Scandinavian Journal of Statistics. Published by Blackwell Publishing Ltd.

Count Data Models with Correlated Unobserved Heterogeneity

STEFAN BOES
Socioeconomic Institute, University of Zurich

ABSTRACT. As previously argued, correlation between included and omitted regressors generally causes inconsistency of standard estimators for count data models. Non-linear instrumental variables estimation of an exponential model under conditional moment restrictions is one of the proposed remedies. This approach is extended here by fully exploiting the model assumptions, thereby improving the efficiency of the resulting estimator. Empirical likelihood in particular has favourable properties in this setting compared with the two-step generalized method of moments, as demonstrated in a Monte Carlo experiment. The proposed method is applied to the estimation of a cigarette demand function.

Key words: approximating functions, instrumental variables, non-parametric likelihood, optimal instruments, Poisson model, semiparametric efficiency

1. Introduction

Regression models for count data have become a standard tool in empirical work, with applications in all areas of specialization. Examples include the number of patents applied for by a firm (Hausman et al., 1984), the number of supreme court appointments (King, 1987), the number of epileptic seizures (Thall & Vail, 1990), the number of doctor visits (Pohlmeier & Ulrich, 1995), the number of children born to a woman (Winkelmann & Zimmermann, 1994), the number of days a worker is absent from his job (Delgado & Kniesner, 1997) and the number of cubes in a tower building test as a measure of fine motor development of children (Cheung, 2002).

The basic empirical model in most applications is the Poisson regression model. The Poisson model assumes that the count variable $Y$ follows a Poisson distribution given a vector of observed variables $X$, formally $Y \mid X \sim \text{Poisson}(\lambda(X))$, with a log-linear specification of the intensity parameter $\lambda(X)$. Although certainly appropriate in many cases, the Poisson model may not always represent the true data-generating process. For example, it presumes that the researcher is able to account for the full amount of individual heterogeneity just by including $X$; additional unobserved heterogeneity is ruled out by the model assumptions.

Various generalizations have been proposed that account for such unobserved heterogeneity. The standard approaches employ mixture distributions, either parametrically by introducing, for example, Gamma-distributed
heterogeneity (the negative binomial models), or semiparametrically without specifying the form of the mixing distribution (Gurmu et al., 1998). Winkelmann (2008) gives an overview.

Mullahy (1997) extends the literature to the important case in which independence between observed and unobserved heterogeneity fails, for example, owing to endogeneity. He considers the conditional expectation function $E(Y \mid X, \nu)$, specified as the exponential of a linear predictor $X'\beta$ with multiplicative unobserved heterogeneity $\nu$. Mullahy (1997) points out that, given non-zero correlation between $X$ and $\nu$, standard estimators like Poisson pseudo-maximum likelihood (PML) or non-linear least squares will generally be inconsistent for $\beta$ because the usual residual function is not orthogonal to $X$. Also, a non-linear instrumental variables strategy based on this residual function will be inconsistent owing to the non-separability of $X$ and $\nu$. Fortunately, a simple transformation of the model yields a residual function $\rho(Y, X; \beta)$ that is additively separable in $X$ and $\nu$, and the assumption of mean independence between the latter and the vector of instruments $Z$ can be used to construct conditional moment restrictions $E[\rho(Y, X; \beta) \mid Z] = 0$. As proposed by Mullahy (1997), estimation can then be based on the generalized method of moments (GMM) using moment functions $g(Y, X, Z; \beta) = a(Z)\,\rho(Y, X; \beta)$ for some function $a(Z)$, and the resulting estimator for $\beta$ will be consistent and asymptotically normally distributed. The estimator is not necessarily efficient, though, because its asymptotic variance depends on the choice of $a(Z)$.

The aim of this article is to extend Mullahy's (1997) approach using optimal instruments $a(Z)$ that fully utilize the information given by the conditional moment restrictions. Compared with Mullahy's work, this article makes a formal statement of how the optimal instrument matrix should be chosen. This provides an important guideline for practitioners
who estimate exponential models with potentially endogenous regressors. Moreover, the article emphasizes the importance of model assumptions, in particular the assumptions on the instrument vector $Z$, first for the estimation procedure itself, and second for the properties of the resulting estimator. To the best of my knowledge, this has not been sufficiently considered in the previous literature.

The article proceeds as follows. The model and moment conditions are laid out in the next section. Special attention is given to the construction of the optimal instrument matrix $a(Z)$. Section 3 discusses the estimation methods, GMM and empirical likelihood (EL) estimation, in the given model context. Section 4 compares the properties of the estimators in a simulated data environment. The results indicate advantages of the EL estimator over the two-step GMM estimator in terms of (small sample) bias and efficiency. Section 5 applies the methods to estimate a cigarette demand function. Fully exploiting the model assumptions considerably improves efficiency. For example, approximating the optimal function $a(Z)$ for only one instrument more than doubles the t-statistic for the parameter
of interest compared with the baseline instrument specification.

2. Exponential model, heterogeneity and moment conditions

Let $Y$ denote a random variable whose support is the non-negative integers, $X$ a $k \times 1$ vector of explanatory variables (including a constant) and $Z$ a $q \times 1$ vector of instruments ($q \ge k$) with properties to be defined next. Assume that $n$ observations of $(Y, X, Z)$ form a random sample of the population, and suppose that the main objective is to estimate the effect of elements of $X$ on the conditional expectation $E(Y \mid X)$. Specifically, the data-generating process is assumed to be consistent with the conditional expectation function

$$E(Y \mid X, \nu; \beta) = \exp(X'\beta)\,\nu, \tag{1}$$

where $\beta$ is the $k \times 1$ vector of unknown parameters, and $\nu = \exp(\varepsilon) > 0$ is an unobserved random variable. The specification of the conditional expectation function explicitly accounts for observed heterogeneity (through $X$) and unobserved heterogeneity (through $\nu$). Without loss of generality, the normalization $E(\nu) = 1$ can be invoked if a constant term is included in $X$. Note that observable and unobservable characteristics are treated symmetrically in (1) because the conditional expectation function is log-linear
in both $X$ and $\varepsilon$. The specific form of the conditional expectation function might appear restrictive at first, but there is no a priori reason for $X$ and $\varepsilon$ to enter asymmetrically. Moreover, the linear index $X'\beta$ is sufficiently flexible to approximate any non-linear function in the regressors arbitrarily closely, and the exponential function ensures that (1) is positive, as required for a count dependent variable.

Strictly speaking, $Y$ need not be a count for (1) to hold. What follows is
equally relevant to any other data-generating process consistent with such an exponential conditional expectation function. An exponential function with continuous $Y$ was used, for example, in Mullahy (1997), where the dependent variable is birthweight. Exponential functional forms should also be used to estimate gravity equations (Santos Silva & Tenreyro, 2006).

The specification of the conditional expectation function implies that

$$Y = \exp(X'\beta)\,\nu + u, \tag{2}$$

where the regression error $u$ has the property $E(u \mid X, \nu) = 0$ by construction. Windmeijer & Santos Silva (1997) consider estimation of models like (2) in situations where some of the regressors may be simultaneously determined with the dependent count. In this case, there is a crucial distinction between additive and multiplicative (for that matter structural) errors, the two otherwise being observationally equivalent (Wooldridge, 1992). Grogger (1990) discusses the additive approach and testing for exogeneity of the regressors using a Hausman-type test. In the given context, it is natural to maintain the notation in (2) to distinguish between the regression error $u$ and the unobservable characteristics $\nu$, the latter not being accounted for in
the regression and potentially correlated with $X$.

Mullahy (1997) gives conditions for consistent estimation of $\beta$ in such a model. In a nutshell, if $\nu$ and $X$ are mean independent, then PML estimation of the Poisson model is consistent for $\beta$ (Gourieroux et al., 1984; Wooldridge, 1997). By contrast, if mean independence fails, then PML will generally be inconsistent, and estimation by instrumental variables based on appropriately defined residuals is suggested as an alternative. Mullahy (1997) imposes two key assumptions on the instruments $Z$:

$$E(\nu \mid Z) = E(\nu) \quad \text{and} \quad E(Y \mid X, \nu, Z) = E(Y \mid X, \nu). \tag{3}$$

The first assumption is an independence condition: $\nu$ and $Z$ must be mean independent. The second assumption imposes an exclusion restriction on the conditional expectation function, which implies for the regression error that $E(u \mid X, Z, \nu) = 0$. With the assumptions on $Z$, a conditional moment restriction can
be constructed via the residual function $\rho(Y, X; \beta) = Y \exp(-X'\beta) - 1$ as

$$E[\rho(Y, X; \beta) \mid Z] = E[Y \exp(-X'\beta) - 1 \mid Z] = 0 \tag{4}$$

by iterated expectations. As noted by Mullahy (1997), the crucial step in deriving such a residual function is that $\nu$ needs to be additively separable from $X$, which can be achieved by dividing both sides of (2) by $\exp(X'\beta)$. The conditional moment restriction is assumed to uniquely identify the true parameter value $\beta$.

Now let $a(Z)$ denote a matrix-valued function of $Z$ with dimension $s \ge k$, which in the simplest case is the identity function $a(Z) = Z$. It is common practice to derive unconditional (population) moment restrictions from (4) as

$$E[a(Z)\,\rho(Y, X; \beta)] = 0, \tag{5}$$

and the estimator of $\beta$ is obtained as the solution to the sample analogues $\sum_{i=1}^{n} a(z_i)\,\rho(y_i, x_i; \beta) = 0$, with estimation operationalized, for example, in a GMM or non-linear instrumental variables framework. Such a procedure, however, is suboptimal for at least two reasons. First, the conditional moment restriction is stronger than the unconditional ones, implying that an estimator based on the latter does not necessarily exploit all the available information. Second, the procedure is only valid under the presumption that $a(Z)$ (or in the simplest case $Z$) identifies $\beta$, which need not be the case; see Dominguez & Lobato (2004). In constructing the optimal instrument matrix $a(Z)$, both these issues need to be taken into account.

More formally, let $D(Z) = E[\partial \rho(Y, X; \beta)/\partial \beta \mid Z]$ denote the Jacobian, and let $V(Z) = E[\rho(Y, X; \beta)^2 \mid Z]$ denote the variance obtained from the conditional moment restriction in (4). Chamberlain (1987) shows that the asymptotic efficiency bound for any $\sqrt{n}$-consistent semiparametric estimator based on (4) is given by

$$I^{-1} = \left\{ E_Z\!\left[ D(Z)\,V(Z)^{-1} D(Z)' \right] \right\}^{-1}.$$

This efficiency lower bound is derived under the assumption of i.i.d. data following a multinomial distribution, in which case the usual parametric efficiency bound applies. As any distribution can be approximated arbitrarily closely
by the multinomial distribution, and the efficiency bound does not depend on the support of the distribution, the bound derived under the multinomial distribution also applies in the general semiparametric case.

An optimal GMM estimator based on the unconditional moment restrictions in (5) that attains the semiparametric efficiency bound requires instruments $a(Z) = D(Z)\,V(Z)^{-1}$ (Newey, 1993, among others). In general, such an estimator is not feasible as both expectations forming $a(Z)$ are unknown. Chamberlain (1987) shows that a GMM estimator based on a particular sequence of unconditional moment restrictions may come arbitrarily close to the semiparametric efficiency bound. Related to this idea, Donald et al. (2003) use a series of functions of $Z$ to form unconditional moment restrictions, and let the dimension $K$ of the vector of approximating functions grow with the sample size. Let
$q^K(Z)$ denote such a vector. Under relatively weak regularity conditions, mainly that second moments exist and are finite and that $K$ grows sufficiently large, the sequence of unconditional moment restrictions

$$E[q^K(Z)\,\rho(Y, X; \beta)] = 0 \tag{6}$$

is equivalent to the conditional moment restriction in (4). This is the important step in obtaining unconditional moments from the model assumptions. Semiparametric efficiency is established if linear combinations of $q^K(Z)$ can approximate $a(Z)$, with approximation error diminishing as $K$ grows, as the asymptotic variance of the optimal GMM estimator with instruments $a(Z)$ reaches the semiparametric efficiency bound (Newey, 1993).

Donald et al. (2003) suggest using splines as approximating functions. If $Z$ is univariate, the $s$th-order spline with knots $t_1, \ldots, t_{K-s-1}$ is given by

$$q^K(Z) = \left(1, Z, \ldots, Z^s, 1(Z > t_1)\,Z^s, \ldots, 1(Z > t_{K-s-1})\,Z^s\right)' \tag{7}$$

with indicator function
$1(\cdot)$. A common choice is $s = 3$ for cubic splines. For example, with three knots and cubic splines, the vector of approximating functions is

$$q^7(Z) = \left(1, Z, Z^2, Z^3, 1(Z > t_1)\,Z^3, 1(Z > t_2)\,Z^3, 1(Z > t_3)\,Z^3\right)',$$

where the knots $t_1$, $t_2$ and $t_3$ could be the 0.25-quantile, the median and the 0.75-quantile of $Z$, respectively. For multivariate $Z$, the approximating functions may be generated by products of univariate splines for each element of $Z$. See de Boor (2001) for the theoretical background and further details. The method can be easily implemented in existing procedures that utilize unconditional moment restrictions, a potential advantage over alternative approaches such as Kitamura et al. (2004) and Dominguez & Lobato (2004).
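
To illustrate how the unconditional moments in (6) might be assembled in practice, the following is a minimal sketch in Python with NumPy, not the article's implementation. It codes the residual function from (4), the cubic spline vector with three quantile knots in the form printed in (7), and a GMM quadratic form. The function names (`residual`, `spline_basis`, `sample_moments`, `gmm_objective`), the identity weighting matrix, and the simulated design (coefficients, sample size, lognormal heterogeneity) are all assumptions made for illustration. Note that the more common truncated-power spline basis uses terms of the form $1(Z > t_j)(Z - t_j)^s$; the sketch follows the form given in (7).

```python
import numpy as np


def residual(y, X, beta):
    """Residual function from (4): Y * exp(-X'beta) - 1, one value per observation."""
    return y * np.exp(-X @ beta) - 1.0


def spline_basis(z, knots, s=3):
    """Approximating functions in the form printed in (7):
    (1, Z, ..., Z^s, 1(Z > t_1) Z^s, ..., 1(Z > t_m) Z^s), one row per observation."""
    z = np.asarray(z, dtype=float)
    powers = [z ** p for p in range(s + 1)]
    knot_terms = [(z > t) * z ** s for t in knots]
    return np.column_stack(powers + knot_terms)


def sample_moments(beta, y, X, Q):
    """Sample analogue of (6): average over i of q^K(z_i) * residual_i."""
    return Q.T @ residual(y, X, beta) / len(y)


def gmm_objective(beta, y, X, Q, W=None):
    """Quadratic form g' W g; W defaults to the identity (an assumed first-step choice)."""
    g = sample_moments(beta, y, X, Q)
    W = np.eye(g.size) if W is None else W
    return float(g @ W @ g)


# Hypothetical simulated data: x is correlated with the heterogeneity, z is not.
rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)                      # instrument
e = rng.normal(size=n)                      # shock driving the heterogeneity
x = 0.7 * z + 0.5 * e + rng.normal(size=n)  # endogenous regressor
nu = np.exp(0.5 * e - 0.125)                # lognormal heterogeneity with mean one
beta_true = np.array([0.2, 0.3])            # assumed parameter values
X = np.column_stack([np.ones(n), x])
y = rng.poisson(np.exp(X @ beta_true) * nu)

# Knots at the 0.25-quantile, the median and the 0.75-quantile of z.
knots = np.quantile(z, [0.25, 0.5, 0.75])
Q = spline_basis(z, knots)
print(Q.shape)  # (2000, 7)
print(gmm_objective(beta_true, y, X, Q))
```

Passing `gmm_objective` to a numerical optimizer would give a first-step estimate; an efficient two-step version would replace the identity matrix with the inverse of the estimated moment covariance.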