Abstract

There are copula-based statistical models in the literature for regression with dependent data such as clustered and longitudinal overdispersed counts, for which parameter estimation and inference are straightforward. For situations where the main interest is in the regression and other univariate parameters and not the dependence, we propose a “weighted scores method”, which is based on weighting score functions of the univariate margins. The weight matrices are obtained initially fitting a discretized multivariate normal distribution, which admits a wide range of dependence. The general methodology is applied to negative binomial regression models. Asymptotic and small-sample efficiency calculations show that our method is robust and nearly as efficient as maximum likelihood for fully specified copula models. An illustrative example is given to show the use of our weighted scores method to analyze utilization of health care based on family characteristics.

1. INTRODUCTION

Modeling clustered and longitudinal continuous response data is straightforward based on the multivariate normal (MVN) distribution, applied to a possibly transformed response. This is not the case for discrete responses, such as counts or ordinal variables. Combining copulas (multivariate uniform distributions) with standard univariate models for discrete responses seems to be a promising solution. Exchangeable copulas could be used to model clustered data. For longitudinal discrete data, copulas with dependence decreasing in time lag can be used.

The MVN copula generated by MVN distribution inherits the useful properties of the latter and thus allows a wide range for dependence and overcomes the drawback of limited dependence of simple parametric families of copulas. Implementation of MVN copula for discrete data (discretized MVN) is possible but not easy because multidimensional integration is needed for computing MVN rectangle probabilities. Therefore, special estimation methods may be required in order to overcome the computational complexities. The use of composite likelihood methods (see Zhao and Joe, 2005; Varin, 2008) can be a potential path toward implementation of discretized MVN and other such models.

In this paper, to generalize existing methodology in the biostatistics literature for regression models with dependent data, we concentrate on inference for the univariate parameters with dependence treated as nuisance, that is, we assume that joint and conditional probabilities are not of interest. The efficiency of estimating the univariate parameters using a composite likelihood based on the sum of univariate log-likelihoods can be low when dependence within clusters is strong. We will improve the efficiency by inserting weight matrices that depend on covariances of the scores assuming a “working model”, such as discretized MVN. We call this method “weighted scores”.

The weighted scores method is an extension of the generalized estimating equations (GEEs) (Liang and Zeger, 1986) since it can also be applied to families that are not in the class of generalized linear regression models. The GEE method is a nonlikelihood approach based on a “working correlation” matrix. But for nonnormal variables, Pearson's correlations have constraints that depend on the univariate margins, which the GEE method ignores (Sabo and Chaganty, 2010). Furthermore, in general the specified working correlation matrix may not correspond to any multivariate distribution for binary or count data, so regression parameter estimates and p-values may lack a theoretical probabilistic basis. In the absence of a multivariate distribution with specified working correlation matrix, broad claims of consistency of GEE estimates are incorrect (Lee and Nelder, 2009). See Chaganty and Joe (2004), Chaganty and Joe (2006) and Lindsey and Lambert (1998) for further shortcomings of the GEE method. Our new method not only generalizes but also overcomes the theoretical flaws associated with the GEE procedure because our working model is a proper multivariate model and the parameters in the weight matrices are interpretable as latent correlations. Furthermore, we demonstrate that our weighted scores method is highly efficient when compared with the “gold standard” maximum likelihood (ML) estimates arising from proper multivariate models.

To illustrate the method of weighted scores concretely, we use negative binomial regression models to handle cluster and longitudinal count data with overdispersion. For longitudinal count data with overdispersion, Thall and Vail (1990) and Solis-Trapala and Farewell (2005) have proposed covariance models with robust estimation of the covariance matrix, but in these models, negative values for the covariances are not allowed, although multivariate discrete data can exist with some negative associations. Some realistic situations with negative dependence appeared in Chib and Winkelmann (2001). Furthermore, autocorrelation structures such as AR(1), which are very realistic in longitudinal count data, cannot be adapted. Our approach allows flexibility through the MVN copula and more than one form of marginal negative binomial regression as in Cameron and Trivedi (1998) and therefore is superior to the aforementioned methods because it provides (a) robust estimation, (b) more dependence structures, and (c) flexibility in the sense that covariates can be incorporated into both parameters of the negative binomial distribution.

The remainder of the paper proceeds as follows. Section 2 introduces the general theory of weighted scores method, and Section 3 gives details for the special case of negative binomial regression. Section 4 discusses the use of the discretized MVN working model to obtain the weights and asymptotic properties of the estimates. Section 5 studies asymptotic efficiency of our method as compared to ML using other multivariate copula models with negative binomial regression. Section 6 presents an application of our methodology to analyze dependent overdispersed count data. We conclude this article with some discussion.

2. WEIGHTED SCORES: GENERAL THEORY

The main idea of weighted scores estimating equations is to write out the score equations for independent data within clusters or panels and then generalize to estimating equations by inserting weight matrices between the matrix of covariates and the vector of scores for regression and nonregression parameters. The general theory is outlined here and made concrete for negative binomial regression models in Section 3.

For ease of exposition, let d be the dimension of a “cluster” or “panel” and n the number of clusters. The theory can be extended to varying cluster sizes. Let p be the number of covariates, that is, the dimension of a covariate vector x. Let γ of dimension q be the vector of univariate parameters that are not regression coefficients. The response Y is assumed to have density f(·;ν,γ), where ν = η(x;β) is a function of x and the p-dimensional regression vector β. Usually, ν = xTβ or η(·) is a known function.

Suppose that data are (yij,xij), j = 1,…,d, i = 1,…,n, where i is an index for individuals or clusters and j is an index for the repeated measurements or within-cluster measurements. The first component of each xij is taken as 1 for regression with an intercept. The univariate marginal model for Yij is f(·;νij,γ), where νij = η(xij;β). For a multivariate model, one would need a joint distribution of (Yi1,…,Yid). If for each i, Yi1,…,Yid are independent, then the log-likelihood is 
graphic
where (·) = logf(·). The score equations for β and γ are 
graphic
(2.1)
where Iq is an identity matrix of dimension q. Let graphic where aT = (βT,γT) is the column vector of all r = p + q univariate parameters. The score equations (2.1) can be written as 
graphic
(2.2)

where XiT = (Xi1T,…,XidT) and siT(a) = (si1T(a),…,sidT(a)). The vectors sij(a) and si(a) have dimensions (1 + q) and d(1 + q), respectively. The dimensions of Xij and Xi are (1 + qr and d(1 + qr, respectively.

For estimation of a when Yi1,…,Yid are dependent and a multivariate model is not used, we consider weighted scores equations that generalize (2.2) as 
graphic
(2.3)
where Wi are invertible d(1 + qd(1 + q) matrices. In (2.3), for initial analysis, we consider Wi as fixed. The asymptotic covariance matrix is used to obtain an optimal choice for Wi for further analysis in the subsequent sections. The asymptotic covariance matrix for the estimator that solves (2.3) is V = ( − Dg) − 1Mg( − DgT) − 1, where Mg = C o v(g) = ∑i = 1nXiTWi − 1Ωi[WiT] − 1Xi and − Dg = E(g/aT) = ∑i = 1nXiTWi − 1Ψi, for g in (2.3). Here Ψi = − E(si(a)/aT) and Ωi = C o v(si(a)). The dimensions of the matrices are as follows: Dg, Mg are r×r, Ωi is d(1 + qd(1 + q), and Ψi is d(1 + qr. Thus, 
graphic
The matrix Cauchy–Schwarz inequality (Chaganty and Joe, 2004) shows that the optimal choice of Wi satisfies XiTWi − 1 = ΨiTΩi − 1, leading to 
graphic
(2.4)

Thus, the optimal choice of Wi depends on the “true multivariate model” via Ψi and Ωi.

3. OPTIMAL WEIGHTS FOR NEGATIVE BINOMIAL REGRESSION

For overdispersed count data, the sample variance is larger than the sample mean, that is, the dispersion index (=variance/mean) is greater than 1. The negative binomial distribution NB(τ,ξ) allows for overdispersion, and its probability mass function is 
graphic

with mean μ = τξ, variance τξ(1 + ξ), and dispersion index 1 + ξ. In this parameterization, τ is the convolution parameter and ξ = 1/π − 1, where π is the Bernoulli parameter.

Cameron and Trivedi (1998) present the NBk(μ,γ) parameterization, where τ = μ2 − kγ − 1 and ξ = μk − 1γ, 1 ≤ k ≤ 2. For the NBk model, we let ν = logμ = xTβ depend on the covariates x, and consequently either or both the parameters τ and ξ of the negative binomial are covariate dependent. We will primarily use the NB1 parameterization (τ = μγ − 1,ξ = γ) and the NB2 parameterization (τ = γ − 1,ξ = μγ); the latter is the same as in Lawless (1987). For the NB1 model, τ depends on covariates while ξ = γ does not and thus the dispersion index is constant. For the NB2 model, τ = 1/γ is constant while ξ is a function of the covariates and therefore the dispersion index varies with the covariates ranging from 1 + γmin[exp(xTβ)] to 1 + γmax[exp(xTβ)]. For 1 < k < 2, one could interpolate between these 2 models using the NBk parameterization, for which the covariates affect both parameters of the negative binomial. Let (ν,γ,y) = logf(y;τ,ξ) for the re-parameterized negative binomial model.

Suppose that for each 1 ≤ in and 1 ≤ jd, Yij is distributed as negative binomial with mean μij and parameter γ. Assume that νij = log(μij) = xijTβ. In this case, η(xij;β)/β = xij and q = 1. To obtain Ψi = − E(si(a)/aT), first note that 
graphic
where = (νij,γ,yij). With Δij = − E(Dij), we have 
graphic
(3.1)

where Δi = diag(Δi1,…,Δid) is a symmetric 2d×2d matrix, as q = 1. Explicit expressions for the elements of the matrix Δij, the log-likelihood = (ν,γ,y), its derivatives /ν, /γ, and the negative expectations of the second derivatives 2/ν2, 2/νγ, 2/γ2 for NB1 and NB2 models are given in the supplementary material available at Biostatistics online.

Thus, for weighted scores equations of the form (2.3) for dependent observations, the optimal weights Wi satisfy XiTWi − 1 = ΨiTΩi − 1 = XiTΔiTΩi − 1 or Wi = ΩiΔi − 1. In the special case of independence within clusters, Δi equals Ωi and the optimal Wi is an identity matrix. In the next section, we propose a preliminary modeling step to get a good estimate for C o v(si(a)) = Ωi for dependent data.

4. ESTIMATION OF WEIGHTS USING A WORKING MODEL

This section presents an estimation method for the weights and asymptotic properties of the weighted scores estimating equations. An approach is to use a working model for the purpose of getting the weight matrices Wi, which might be near optimal for the “true joint distribution”.

We select a working model based on the discretized MVN distribution as this allows a wide range of dependence. The discretized MVN model has the following cumulative distribution function (c.d.f.): 
graphic

where Φd denote the standard MVN distribution function with correlation matrix R, Φ is the c.d.f. of the univariate standard normal, and F1,…,Fd are the univariate distributions. The MVN copula inherits the dependence structure of MVN distribution but lacks a closed-form c.d.f.; this means that likelihood inference might be difficult as d-dimensional integration is required for the multivariate probabilities (d > 3) (see Nikoloulopoulos and Karlis, 2009). However, as the optimal weight matrices Wi depend on covariances of the scores, only the bivariate marginal probabilities of Yij and Yik, jk, are needed for estimation.

The estimating equations (2.3) based on a “working” discretized MVN take the form 
graphic
(4.1)
where Wi,working1=ΔiΩi,working1=Δi(a)Ωi(a,R)1 is based on the covariance matrix of si(a) computed from the fitted discretized MVN model with estimated parameters a and R. As bivariate normal c.d.f. calculations are needed for the weight matrices (different ones for different clusters), a good approximation that can be quickly computed is important. We used the approximation given by Johnson and Kotz (1972), details of which are described in the supplementary material available at Biostatistics online. If the Wi, w o r k i n g are assumed fixed for the second stage of solving the estimating equations (4.1), then the asymptotic covariance matrix of the solution a^1 is 
graphic
(4.2)
with 
graphic
where Ωi, t r u e is the “true covariance matrix” of si(a). Alternatively, with R from the fitted MVN model assumed fixed, and using the form of the optimal Wi, we consider the following estimating equations in a: 
graphic
(4.3)

where Ωi(a,R) is the covariance matrix of si(a) based on the discretized MVN model. Solving (4.3) is numerically more intensive than (4.1) because bivariate probabilities pijk(yj,yk) = Pr(Yij = yj,Yik = yk) are needed to compute the covariance entries of Ωi(a,R) for any a in any iterative numerical method.

The asymptotics for the solution of (4.3) is similar to that of (4.1). Let aj be a component of a. Then with R fixed, we have 
graphic
(4.4)
If the univariate marginal model is correct, then E[si(a)] = 0 and only the first term on the right-hand side of (4.4) contributes to E(g2/aT). Hence, the asymptotic covariance matrix of the solution a^2 for (4.3) is 
graphic
(4.5)

with graphic.

To summarize, the steps to obtain parameter estimates and standard errors are as follows. The numerical methods used for various steps are described in the supplementary material available at Biostatistics online. 1.

  1. Use a discretized MVN model and estimate the parameters using the CL1 composite likelihood method described in Zhao and Joe (2005). Let the parameter estimates be a for the univariate parameters and R for the correlation matrix. 2.

  2. Compute the “working weight matrices” Wi,working=Ωi(a,R)Δi(a)1, where Ωi(a,R) is the covariance matrix of univariate scores based on the fitted discretized MVN model. 3.

  3. Obtain robust estimates a^ of the univariate parameters solving equation (4.1) using working weight matrices Wi, w o r k i n g or solving (4.3) using R, with a reliable nonlinear system solver. 4.

  4. The robust standard errors for a^ are obtained calculating the estimated covariance matrix V^1 of a^ based on (4.2) or V^2 based on (4.5) by plugging a^ for a and replacing Ωi, t r u e with si(a^)siT(a^). This estimate is similar to the widely used “sandwich” covariance estimator.

5. EFFICIENCY OF WEIGHTED SCORES EQUATIONS

In order to study robustness and efficiency of the weighted scores method, we will use various multivariate copula models as “true” models. A quick overview on copulas, a brief description of some parametric families of copulas that might be suitable for clustered and longitudinal count data, and their Fisher information matrix are provided in the supplementary material available at Biostatistics online.

5.1. Relative efficiency: comparison based on asymptotic variances

As a first step to understanding the asymptotic relative efficiency of the weighted scores method for negative binomial regression, we considered copula models with exchangeable and AR(1)-like dependence. For exchangeable dependence structure, the Frank copula in the Archimedean family (Joe, 1997, p 141) with Laplace transform φF(t) = − θ − 1log[1 − (1 − eθ)et] was used as the “true model”. For AR(1)-like dependence, the mixture of max-id copula (Joe and Hu, 1996) with Laplace transform φF and the bivariate Frank copula for the Cjk(m)(·;θjk) was used as the true model. For the marginals, we used both NB1 and NB2 forms of negative binomial regression.

For continuous random variables, dependence as measured by Kendall's tau τ = PcPd, the difference between the probabilities of concordance (Pc) and discordance (Pd), is associated only with the copula parameters. However, for discrete data, the marginal distributions also play a role on dependence and τ does not attain the boundary values of ±1 because the probability of ties Pt = 1 − (Pc + Pd) is positive (see Nikoloulopoulos and Karlis, 2010). Therefore, for discrete data the strength of dependence was measured using the normalized version of Kendall's tau τGK=PcPdPc+Pd given in Goodman and Kruskal (1954). Various factors related to the dimension, the strength of dependence, and overdispersion were considered in selecting parameters for the models.

A summary of the parameter choices chosen for efficiency calculations is in Table 1. For the covariates and regression parameters, we chose 3 designs: (i) p = 2,xij = (1,x1ij)T, where x1ij are taken as uniform random variables in the interval [ − 1,1], β0 = − 0.5, and β1 = 0.5; (ii) p = 3,xij = (1,x1ij,x2ij)T, where x1ij,x2ij are taken as uniform random variables in the interval [ − 1,1], β0 = − 0.5, and β1 = β2 = 0.5; and (iii) p = 3,xij = (1,x1ij,x2ij)T, where x1ij are taken as uniform random variables in the interval [ − 1,1] and x2ij are taken as Bernoulli random variables with probability of success π = 0.4, β0 = − 0.5, and β1 = β2 = 0.5.

Table 1.

Parameter choices in the computation of asymptotic relative efficiencies

Dependence strength Exchangeable (d = 2, 3, 4)
 
Longitudinal (d = 2, 3, 4)
 
Weak Moderate Strong Weak Moderate Strong 
τGK 0.2 0.5 0.8 0.2 0.5 0.8 
θ 1.1 0.1 1.35 4.5 
θjk — — — 1.8 20 
NB NB1 NB2 
Overdispersion Moderate  Large Moderate  Large 
γ 1/2  1/2  
Dependence strength Exchangeable (d = 2, 3, 4)
 
Longitudinal (d = 2, 3, 4)
 
Weak Moderate Strong Weak Moderate Strong 
τGK 0.2 0.5 0.8 0.2 0.5 0.8 
θ 1.1 0.1 1.35 4.5 
θjk — — — 1.8 20 
NB NB1 NB2 
Overdispersion Moderate  Large Moderate  Large 
γ 1/2  1/2  
Table 1.

Parameter choices in the computation of asymptotic relative efficiencies

Dependence strength Exchangeable (d = 2, 3, 4)
 
Longitudinal (d = 2, 3, 4)
 
Weak Moderate Strong Weak Moderate Strong 
τGK 0.2 0.5 0.8 0.2 0.5 0.8 
θ 1.1 0.1 1.35 4.5 
θjk — — — 1.8 20 
NB NB1 NB2 
Overdispersion Moderate  Large Moderate  Large 
γ 1/2  1/2  
Dependence strength Exchangeable (d = 2, 3, 4)
 
Longitudinal (d = 2, 3, 4)
 
Weak Moderate Strong Weak Moderate Strong 
τGK 0.2 0.5 0.8 0.2 0.5 0.8 
θ 1.1 0.1 1.35 4.5 
θjk — — — 1.8 20 
NB NB1 NB2 
Overdispersion Moderate  Large Moderate  Large 
γ 1/2  1/2  

For the above copula models and parameter and design selections, we computed the inverse of the Fisher information matrix , the matrix Vopt = (∑i = 1nΨiTΩi − 1Ψi) − 1, given in (2.4) with Ψi as in (3.1), and the matrix V2 given in (4.5) for both NB1 and NB2 regressions. Note that − 1 is the asymptotic covariance matrix of the ML estimates when the copula model is correctly specified. Since it is common practice to use parametric correlation matrices, we calculated efficiencies for structured latent correlation matrices. For exchangeable dependence, we took R as (1 − ρ)Id + ρJd, where Id is the identity matrix of order d and Jd is the d×d matrix of 1s. For AR(1) dependence, R is taken as (ρ|jk|)1 ≤ j,kd.

5.2. Findings on asymptotic relative efficiency

Representative summaries of findings on the performance of the weighted scores approach are given in Tables 2 and 3 and in the supplementary material available at Biostatistics online for three-dimensional (d = 3) copula models. We took n = 500 to get a good approximation of the asymptotic efficiency. The comparisons are made on the scaled diagonal elements, corresponding to the asymptotic variances of the univariate parameters, of the 3 matrices − 1, Vopt, and V2 with different values of ρ.

Table 2.

Asymptotic variances, scaled by n, of the marginal parameters for the Frank copula model with moderate dependence and NB2 marginals with moderate dispersion and for the weighted scores (WS) using optimal choice of weights based on the true multivariate model and working weight matrices based on the discretized MVN with exchangeable correlation matrix R with parameter ρ. Efficiencies with respect to ML are shown in parentheses

Method (covariance) nVar(β0nVar(β1nVar(γ
ML (ℐ − 11.178 (1.00) 1.804 (1.00) 4.234 (1.00) 
Optimal WS (Vopt1.206 (0.98) 1.846 (0.98) 4.339 (0.98) 
WS with discretized MVN (V2(ρ))    
    ρ = 0.0 1.210 (0.97) 2.213 (0.82) 4.346 (0.97) 
    ρ = 0.1 1.208 (0.97) 2.043 (0.88) 4.345 (0.98) 
    ρ = 0.2 1.208 (0.98) 1.938 (0.93) 4.343 (0.98) 
    ρ = 0.3 1.207 (0.98) 1.883 (0.96) 4.341 (0.98) 
    ρ = 0.4 1.208 (0.98) 1.866 (0.97) 4.340 (0.98) 
    ρ = 0.5 1.208 (0.98) 1.881 (0.96) 4.344 (0.96) 
    ρ = 0.6 1.210 (0.97) 1.923 (0.94) 4.368 (0.97) 
    ρ = 0.7 1.212 (0.97) 1.989 (0.91) 4.446 (0.95) 
    ρ = 0.8 1.218 (0.97) 2.088 (0.86) 4.704 (0.90) 
    ρ = 0.9 1.235 (0.95) 2.259 (0.80) 5.680 (0.75) 
Method (covariance) nVar(β0nVar(β1nVar(γ
ML (ℐ − 11.178 (1.00) 1.804 (1.00) 4.234 (1.00) 
Optimal WS (Vopt1.206 (0.98) 1.846 (0.98) 4.339 (0.98) 
WS with discretized MVN (V2(ρ))    
    ρ = 0.0 1.210 (0.97) 2.213 (0.82) 4.346 (0.97) 
    ρ = 0.1 1.208 (0.97) 2.043 (0.88) 4.345 (0.98) 
    ρ = 0.2 1.208 (0.98) 1.938 (0.93) 4.343 (0.98) 
    ρ = 0.3 1.207 (0.98) 1.883 (0.96) 4.341 (0.98) 
    ρ = 0.4 1.208 (0.98) 1.866 (0.97) 4.340 (0.98) 
    ρ = 0.5 1.208 (0.98) 1.881 (0.96) 4.344 (0.96) 
    ρ = 0.6 1.210 (0.97) 1.923 (0.94) 4.368 (0.97) 
    ρ = 0.7 1.212 (0.97) 1.989 (0.91) 4.446 (0.95) 
    ρ = 0.8 1.218 (0.97) 2.088 (0.86) 4.704 (0.90) 
    ρ = 0.9 1.235 (0.95) 2.259 (0.80) 5.680 (0.75) 

For this ρ value, the discretized MVN working model is close to the true model in K-L distance.

Table 2.

Asymptotic variances, scaled by n, of the marginal parameters for the Frank copula model with moderate dependence and NB2 marginals with moderate dispersion and for the weighted scores (WS) using optimal choice of weights based on the true multivariate model and working weight matrices based on the discretized MVN with exchangeable correlation matrix R with parameter ρ. Efficiencies with respect to ML are shown in parentheses

Method (covariance) nVar(β0nVar(β1nVar(γ
ML (ℐ − 11.178 (1.00) 1.804 (1.00) 4.234 (1.00) 
Optimal WS (Vopt1.206 (0.98) 1.846 (0.98) 4.339 (0.98) 
WS with discretized MVN (V2(ρ))    
    ρ = 0.0 1.210 (0.97) 2.213 (0.82) 4.346 (0.97) 
    ρ = 0.1 1.208 (0.97) 2.043 (0.88) 4.345 (0.98) 
    ρ = 0.2 1.208 (0.98) 1.938 (0.93) 4.343 (0.98) 
    ρ = 0.3 1.207 (0.98) 1.883 (0.96) 4.341 (0.98) 
    ρ = 0.4 1.208 (0.98) 1.866 (0.97) 4.340 (0.98) 
    ρ = 0.5 1.208 (0.98) 1.881 (0.96) 4.344 (0.96) 
    ρ = 0.6 1.210 (0.97) 1.923 (0.94) 4.368 (0.97) 
    ρ = 0.7 1.212 (0.97) 1.989 (0.91) 4.446 (0.95) 
    ρ = 0.8 1.218 (0.97) 2.088 (0.86) 4.704 (0.90) 
    ρ = 0.9 1.235 (0.95) 2.259 (0.80) 5.680 (0.75) 
Method (covariance) nVar(β0nVar(β1nVar(γ
ML (ℐ − 11.178 (1.00) 1.804 (1.00) 4.234 (1.00) 
Optimal WS (Vopt1.206 (0.98) 1.846 (0.98) 4.339 (0.98) 
WS with discretized MVN (V2(ρ))    
    ρ = 0.0 1.210 (0.97) 2.213 (0.82) 4.346 (0.97) 
    ρ = 0.1 1.208 (0.97) 2.043 (0.88) 4.345 (0.98) 
    ρ = 0.2 1.208 (0.98) 1.938 (0.93) 4.343 (0.98) 
    ρ = 0.3 1.207 (0.98) 1.883 (0.96) 4.341 (0.98) 
    ρ = 0.4 1.208 (0.98) 1.866 (0.97) 4.340 (0.98) 
    ρ = 0.5 1.208 (0.98) 1.881 (0.96) 4.344 (0.96) 
    ρ = 0.6 1.210 (0.97) 1.923 (0.94) 4.368 (0.97) 
    ρ = 0.7 1.212 (0.97) 1.989 (0.91) 4.446 (0.95) 
    ρ = 0.8 1.218 (0.97) 2.088 (0.86) 4.704 (0.90) 
    ρ = 0.9 1.235 (0.95) 2.259 (0.80) 5.680 (0.75) 

For this ρ value, the discretized MVN working model is close to the true model in K-L distance.

Table 3.

Asymptotic variances, scaled by n, of the marginal parameters for the trivariate mixture of max-id copula model composed by ϕF with weak dependence and NB1 marginals with large dispersion and for the weighted scores (WS) using optimal choice of weights based on the true multivariate model and working weight matrices based on the discretized MVN with AR(1) correlation matrix R with parameter ρ. Efficiencies with respect to ML are shown in parentheses

Method (covariance) nVar(β0nVar(β1nVar(β2nVar(γ
ML (ℐ − 12.481 (1.00) 2.796 (1.00) 3.520 (1.00) 17.827 (1.00) 
Optimal WS (Vopt2.467 (1.00) 2.799 (1.00) 3.526 (1.00) 17.772 (1.00) 
WS with discretized MVN (V2(ρ))     
    ρ = 0.0 2.487 (1.00) 2.856 (0.98) 3.602 (0.98) 17.777 (1.00) 
    ρ = 0.1 2.472 (1.00) 2.811 (0.99) 3.543 (0.99) 17.777 (1.00) 
    ρ = 0.2 2.469 (1.00) 2.803 (1.00) 3.531 (1.00) 17.781 (1.00) 
    ρ = 0.3 2.481 (1.00) 2.834 (0.99) 3.574 (0.99) 17.795 (1.00) 
    ρ = 0.4 2.509 (0.99) 2.908 (0.96) 3.675 (0.96) 17.833 (1.00) 
    ρ = 0.5 2.559 (0.97) 3.028 (0.92) 3.841 (0.92) 17.925 (0.99) 
    ρ = 0.6 2.635 (0.94) 3.201 (0.87) 4.079 (0.86) 18.128 (0.98) 
    ρ = 0.7 2.745 (0.90) 3.436 (0.81) 4.403 (0.80) 18.562 (0.96) 
    ρ = 0.8 2.914 (0.85) 3.759 (0.74) 4.844 (0.73) 19.517 (0.91) 
    ρ = 0.9 3.276 (0.76) 4.245 (0.66) 5.490 (0.64) 22.050 (0.81) 
Method (covariance) nVar(β0nVar(β1nVar(β2nVar(γ
ML (ℐ − 12.481 (1.00) 2.796 (1.00) 3.520 (1.00) 17.827 (1.00) 
Optimal WS (Vopt2.467 (1.00) 2.799 (1.00) 3.526 (1.00) 17.772 (1.00) 
WS with discretized MVN (V2(ρ))     
    ρ = 0.0 2.487 (1.00) 2.856 (0.98) 3.602 (0.98) 17.777 (1.00) 
    ρ = 0.1 2.472 (1.00) 2.811 (0.99) 3.543 (0.99) 17.777 (1.00) 
    ρ = 0.2 2.469 (1.00) 2.803 (1.00) 3.531 (1.00) 17.781 (1.00) 
    ρ = 0.3 2.481 (1.00) 2.834 (0.99) 3.574 (0.99) 17.795 (1.00) 
    ρ = 0.4 2.509 (0.99) 2.908 (0.96) 3.675 (0.96) 17.833 (1.00) 
    ρ = 0.5 2.559 (0.97) 3.028 (0.92) 3.841 (0.92) 17.925 (0.99) 
    ρ = 0.6 2.635 (0.94) 3.201 (0.87) 4.079 (0.86) 18.128 (0.98) 
    ρ = 0.7 2.745 (0.90) 3.436 (0.81) 4.403 (0.80) 18.562 (0.96) 
    ρ = 0.8 2.914 (0.85) 3.759 (0.74) 4.844 (0.73) 19.517 (0.91) 
    ρ = 0.9 3.276 (0.76) 4.245 (0.66) 5.490 (0.64) 22.050 (0.81) 

For this ρ value, the discretized MVN working model is close to the true model in K-L distance.

Table 3.

Asymptotic variances, scaled by n, of the marginal parameters for the trivariate mixture of max-id copula model composed by ϕF with weak dependence and NB1 marginals with large dispersion and for the weighted scores (WS) using optimal choice of weights based on the true multivariate model and working weight matrices based on the discretized MVN with AR(1) correlation matrix R with parameter ρ. Efficiencies with respect to ML are shown in parentheses

Method (covariance) nVar(β0nVar(β1nVar(β2nVar(γ
ML (ℐ − 12.481 (1.00) 2.796 (1.00) 3.520 (1.00) 17.827 (1.00) 
Optimal WS (Vopt2.467 (1.00) 2.799 (1.00) 3.526 (1.00) 17.772 (1.00) 
WS with discretized MVN (V2(ρ))     
    ρ = 0.0 2.487 (1.00) 2.856 (0.98) 3.602 (0.98) 17.777 (1.00) 
    ρ = 0.1 2.472 (1.00) 2.811 (0.99) 3.543 (0.99) 17.777 (1.00) 
    ρ = 0.2 2.469 (1.00) 2.803 (1.00) 3.531 (1.00) 17.781 (1.00) 
    ρ = 0.3 2.481 (1.00) 2.834 (0.99) 3.574 (0.99) 17.795 (1.00) 
    ρ = 0.4 2.509 (0.99) 2.908 (0.96) 3.675 (0.96) 17.833 (1.00) 
    ρ = 0.5 2.559 (0.97) 3.028 (0.92) 3.841 (0.92) 17.925 (0.99) 
    ρ = 0.6 2.635 (0.94) 3.201 (0.87) 4.079 (0.86) 18.128 (0.98) 
    ρ = 0.7 2.745 (0.90) 3.436 (0.81) 4.403 (0.80) 18.562 (0.96) 
    ρ = 0.8 2.914 (0.85) 3.759 (0.74) 4.844 (0.73) 19.517 (0.91) 
    ρ = 0.9 3.276 (0.76) 4.245 (0.66) 5.490 (0.64) 22.050 (0.81) 
Method (covariance) nVar(β0nVar(β1nVar(β2nVar(γ
ML (ℐ − 12.481 (1.00) 2.796 (1.00) 3.520 (1.00) 17.827 (1.00) 
Optimal WS (Vopt2.467 (1.00) 2.799 (1.00) 3.526 (1.00) 17.772 (1.00) 
WS with discretized MVN (V2(ρ))     
    ρ = 0.0 2.487 (1.00) 2.856 (0.98) 3.602 (0.98) 17.777 (1.00) 
    ρ = 0.1 2.472 (1.00) 2.811 (0.99) 3.543 (0.99) 17.777 (1.00) 
    ρ = 0.2 2.469 (1.00) 2.803 (1.00) 3.531 (1.00) 17.781 (1.00) 
    ρ = 0.3 2.481 (1.00) 2.834 (0.99) 3.574 (0.99) 17.795 (1.00) 
    ρ = 0.4 2.509 (0.99) 2.908 (0.96) 3.675 (0.96) 17.833 (1.00) 
    ρ = 0.5 2.559 (0.97) 3.028 (0.92) 3.841 (0.92) 17.925 (0.99) 
    ρ = 0.6 2.635 (0.94) 3.201 (0.87) 4.079 (0.86) 18.128 (0.98) 
    ρ = 0.7 2.745 (0.90) 3.436 (0.81) 4.403 (0.80) 18.562 (0.96) 
    ρ = 0.8 2.914 (0.85) 3.759 (0.74) 4.844 (0.73) 19.517 (0.91) 
    ρ = 0.9 3.276 (0.76) 4.245 (0.66) 5.490 (0.64) 22.050 (0.81) 

For this ρ value, the discretized MVN working model is close to the true model in K-L distance.

Table 2 has results for the Frank copula model with moderate dependence; see Table 1 for the values of the dependence parameters. We used design (iii) for the covariates and regression parameters, and the matrix R is taken as an exchangeable correlation matrix with parameter ρ varying from 0 to 0.9 in 0.1 increments. Table 3 has results for the mixture of max-id copula model with weak dependence composed by φ = φF and bivariate Frank copulas with the dependence parameters θ13 = 0,ω1 = ω3 = − 1,ω2 = 0 fixed, and θ, θ12, and θ23 as in Table 1. We used design (iii), large overdispersion (γ = 2), and AR(1) structured matrix R with parameter ρ ranging from 0 to 0.9 in 0.1 increments. NB1 was used for Table 3 and NB2 was used for Table 2 as the marginal model.

Conclusions from the values in the 2 tables and other computations that we have done are the following. 1.

  1. The estimating equations in (2.3) with optimal weight matrices yield estimates that are almost as good as the ML estimates. 2.

  2. When the discretized MVN is used as the working model, the weighted scores method yields highly efficient estimates when the parameter ρ is such that the discretized MVN model is quite close to the true model in Kullback–Leibler (K-L) distance. 3.

  3. For both negative binomial regressions, the efficiency of the weighted scores method using discretized MVN as a working model is high for the intercept and the dispersion parameter, for a wide range of ρ values. However, the efficiencies are not as high for the remaining regression coefficients. This indicates that incorrect choice of ρ could lead to significant loss of efficiency. 4.

  4. Efficiencies are quite flat in an interval of ρ that depends on the strength of dependence within the clusters. This provides partial justification for using (4.2) for the estimated covariance matrix of the weighted scores estimator.

5.3 Small-sample efficiency based on simulation studies

To gauge the small-sample efficiency of the weighted scores method, we performed several simulation studies using the copula models with parameter choices and design matrices as in Table 1 and Section 5.1. We report here typical results from these experiments. We randomly generated B = 104 samples of size n = 500,300,100 from the trivariate exchangeable Frank copula with moderate dependence and NB2 regression with moderate dispersion and design (i) as in Section 5.1. Table 4 contains the parameter values, the bias, variance (Var), and mean square errors of the ML estimates and weighted scores, along with the average of their theoretical variances. The theoretical variance of the ML estimate is obtained via the gradients and the Hessian computed numerically during the maximization process. The weighted scores estimates were obtained assuming that the Wi, w o r k i n g are fixed for the second stage of solving the estimating equations in (4.3). The reported theoretical variances for the weighted scores method are from V1 as in (4.2). The variances from V2 were similar. Therefore, it is adequate to use V1 for estimating the standard errors of the weighted scores estimates.

It is clear from Tables 2 and 4 (n = 500) that variances computed from the simulations are similar to the asymptotic variances for both the ML and the weighted scores method. For example, for ρ = 0.4 in Table 2, nVar(β0) is 1.208 which is approximately equal to 1.217 in Table 4. Note that for ρ = 0.4, the discretized MVN working model is close to the Frank copula model under moderate dependence with respect to the K-L distance, a measure that is useful to make theoretical likelihood comparisons between 2 models.

Table 4.

Small sample of sizes n = 500, 300, 100 simulations (104 replications) and resulted biases and mean square errors (MSE) and variances (Var), along with average theoretical variances scaled by n, for the ML of the marginal parameters for the trivariate Frank copula model with moderate dependence and NB2 marginals with moderate dispersion and for the weighted scores based on the discretized MVN with exchangeable correlation matrix R with parameter ρ as estimated by the CL1 method

  ML
 
Weighted scores
 
 n nBias nVar nMSE nV1() nBias nVar nMSE nV1() 
β0 = – 0.5 500 – 0.82 1.19 1.19 1.17 – 0.83 1.22 1.22 1.20 
 300 – 1.11 1.18 1.19 1.17 – 1.10 1.22 1.22 1.20 
 100 – 1.03 1.22 1.23 1.18 – 1.01 1.24 1.26 1.20 
β1 = 0.5 500 0.62 1.81 1.81 1.80 0.61 1.86 1.86 1.85 
 300 0.29 1.79 1.79 1.80 0.24 1.85 1.85 1.85 
 100 0.20 1.90 1.90 1.84 0.19 1.95 1.95 1.85 
γ = 0.5 500 – 0.99 4.36 4.36 4.36 – 0.74 4.46 4.46 4.42 
 300 – 0.94 4.44 4.44 4.40 – 0.71 4.55 4.55 4.42 
 100 – 0.62 4.48 4.48 4.58 – 0.42 4.61 4.61 4.43 
  ML
 
Weighted scores
 
 n nBias nVar nMSE nV1() nBias nVar nMSE nV1() 
β0 = – 0.5 500 – 0.82 1.19 1.19 1.17 – 0.83 1.22 1.22 1.20 
 300 – 1.11 1.18 1.19 1.17 – 1.10 1.22 1.22 1.20 
 100 – 1.03 1.22 1.23 1.18 – 1.01 1.24 1.26 1.20 
β1 = 0.5 500 0.62 1.81 1.81 1.80 0.61 1.86 1.86 1.85 
 300 0.29 1.79 1.79 1.80 0.24 1.85 1.85 1.85 
 100 0.20 1.90 1.90 1.84 0.19 1.95 1.95 1.85 
γ = 0.5 500 – 0.99 4.36 4.36 4.36 – 0.74 4.46 4.46 4.42 
 300 – 0.94 4.44 4.44 4.40 – 0.71 4.55 4.55 4.42 
 100 – 0.62 4.48 4.48 4.58 – 0.42 4.61 4.61 4.43 
Table 4.

Small sample of sizes n = 500, 300, 100 simulations (104 replications) and resulted biases and mean square errors (MSE) and variances (Var), along with average theoretical variances scaled by n, for the ML of the marginal parameters for the trivariate Frank copula model with moderate dependence and NB2 marginals with moderate dispersion and for the weighted scores based on the discretized MVN with exchangeable correlation matrix R with parameter ρ as estimated by the CL1 method

  ML
 
Weighted scores
 
 n nBias nVar nMSE nV1() nBias nVar nMSE nV1() 
β0 = – 0.5 500 – 0.82 1.19 1.19 1.17 – 0.83 1.22 1.22 1.20 
 300 – 1.11 1.18 1.19 1.17 – 1.10 1.22 1.22 1.20 
 100 – 1.03 1.22 1.23 1.18 – 1.01 1.24 1.26 1.20 
β1 = 0.5 500 0.62 1.81 1.81 1.80 0.61 1.86 1.86 1.85 
 300 0.29 1.79 1.79 1.80 0.24 1.85 1.85 1.85 
 100 0.20 1.90 1.90 1.84 0.19 1.95 1.95 1.85 
γ = 0.5 500 – 0.99 4.36 4.36 4.36 – 0.74 4.46 4.46 4.42 
 300 – 0.94 4.44 4.44 4.40 – 0.71 4.55 4.55 4.42 
 100 – 0.62 4.48 4.48 4.58 – 0.42 4.61 4.61 4.43 
  ML
 
Weighted scores
 
 n nBias nVar nMSE nV1() nBias nVar nMSE nV1() 
β0 = – 0.5 500 – 0.82 1.19 1.19 1.17 – 0.83 1.22 1.22 1.20 
 300 – 1.11 1.18 1.19 1.17 – 1.10 1.22 1.22 1.20 
 100 – 1.03 1.22 1.23 1.18 – 1.01 1.24 1.26 1.20 
β1 = 0.5 500 0.62 1.81 1.81 1.80 0.61 1.86 1.86 1.85 
 300 0.29 1.79 1.79 1.80 0.24 1.85 1.85 1.85 
 100 0.20 1.90 1.90 1.84 0.19 1.95 1.95 1.85 
γ = 0.5 500 – 0.99 4.36 4.36 4.36 – 0.74 4.46 4.46 4.42 
 300 – 0.94 4.44 4.44 4.40 – 0.71 4.55 4.55 4.42 
 100 – 0.62 4.48 4.48 4.58 – 0.42 4.61 4.61 4.43 

For the models of Section 5.1, the K-L distance slowly increases as the NB mean and the dependence increase and it is less than 0.1 if the NB mean is less than 4 and the dependence is less than 0.8 as measured with τGK. Our calculations show similar magnitude of K-L distance of copula models with discretized MVN for exchangeable, AR(1), and unstructured dependence. Therefore, we expect the patterns seen for exchangeable and AR(1) dependence to hold for unstructured dependence. In conclusion, the simulated variances for samples of size n = 500,300,100 show that the weighted scores method is almost as good as ML.

6. ANALYSIS OF UTILIZATION OF HEALTH CARE COUNT DATA

This section illustrates the application of the weighted scores method to Riphahn, Wambach, and Million (RWM) data (Riphahn and others, 2003), consisting of 7293 families from former West Germany and of German nationality observed from 1 to 7 times during the years 1984–1988, 1991, and 1994. The data are available for download at http://econ.queensu.ca/jae/2003-v18.4/. The focus of the original survey was to study the role of public, private, and add-on health insurance on the intensity and utilization of health care facilities. The use of health care facilities was measured by 2 primary count response variables, the number of visits to a doctor (DocVis) within the last quarter prior to the survey and the number of inpatient hospital visits (HospVis) within a given calendar year. Besides these variables the survey has a number of explanatory variables. A complete list of variables and descriptive statistics can be found in Table 1 in Greene (2008).

Our primary goal is not a complete analysis of the RWM data but to show the use of the weighted scores in regressing the count variable (DocVis) panel on the explanatory variables. For our analysis, we have selected a further subset of the RWM data consisting of 1154 families which had complete data for the 5 years 1984–1988 and then to emphasize that our method does not depend on a constant cluster size d, some observations were dropped randomly for the dependent variable. Thus, the data are unbalanced and the missing values can be assumed to be missing completely at random.

Table 5 gives the estimates of the univariate parameters, along with the dependence parameters obtained using the weighted scores method. As there is no reason to assume a structured correlation between the 5 (1984–1988) repeated measurements, in the first step of the method the unstructured correlation matrix R in the MVN copula is estimated using the CL1 method in Zhao and Joe (2005). These CL1 estimates are listed in the first rows of the table. Next, we used this estimate of R and solved the weighted scores equations (4.1) to obtain estimates a^1 of the univariate parameters for both NB1 and NB2 regressions. The parameter estimates along with robust standard errors computed using V^1 are also presented in Table 5. Since the parameters are the same for the 2 models, we can use the bivariate log-likelihood at CL1 estimates as a rough diagnostic measure for goodness of fit between the 2 NB models. This quantity was − 7835.9 for NB2 and − 7753.4 for NB1, and thus NB1 seems to be a better fit for the data.

Table 5.

Weighted scores (WS) estimates and standard errors (SEs) for the utilization of health care count data

 CL1 estimates of the dependence parameters
 
Marginal ρ12 ρ13 ρ14 ρ15 ρ23 ρ24 ρ25 ρ34 ρ35 ρ45 
NB1 0.46 0.42 0.29 0.24 0.31 0.32 0.36 0.29 0.21 0.28 
NB2 0.46 0.41 0.32 0.25 0.31 0.35 0.37 0.31 0.23 0.28 
WS estimates of the univariate parameters 
NB1 NB2   
Covariates Estimate Robust SE Z Pr > |ZEstimate Robust SE Z Pr > |Z  
Intercept 0.343 0.294 1.17 0.24 0.635 0.367 1.76 0.08   
Sex 0.373 0.091 4.10 0.00 0.248 0.107 2.28 0.02   
Age 0.018 0.004 4.14 0.00 0.018 0.005 3.55 0.00   
Hsat – 0.115 0.014 – 8.33 0.00 – 0.147 0.019 – 7.60 0.00   
Handper 0.006 0.001 5.25 0.00 0.006 0.002 3.22 0.00   
Univ – 0.439 0.305 – 1.44 0.15 – 0.707 0.306 – 2.33 0.02   
Public 0.246 0.191 1.29 0.20 0.213 0.247 0.85 0.39   
Addon – 0.089 0.247 – 0.36 0.72 – 0.371 0.228 – 1.63 0.10   
γ 3.910 0.332 11.76 0.00 1.263 0.097 13.02 0.00   
 CL1 estimates of the dependence parameters
 
Marginal ρ12 ρ13 ρ14 ρ15 ρ23 ρ24 ρ25 ρ34 ρ35 ρ45 
NB1 0.46 0.42 0.29 0.24 0.31 0.32 0.36 0.29 0.21 0.28 
NB2 0.46 0.41 0.32 0.25 0.31 0.35 0.37 0.31 0.23 0.28 
WS estimates of the univariate parameters 
NB1 NB2   
Covariates Estimate Robust SE Z Pr > |ZEstimate Robust SE Z Pr > |Z  
Intercept 0.343 0.294 1.17 0.24 0.635 0.367 1.76 0.08   
Sex 0.373 0.091 4.10 0.00 0.248 0.107 2.28 0.02   
Age 0.018 0.004 4.14 0.00 0.018 0.005 3.55 0.00   
Hsat – 0.115 0.014 – 8.33 0.00 – 0.147 0.019 – 7.60 0.00   
Handper 0.006 0.001 5.25 0.00 0.006 0.002 3.22 0.00   
Univ – 0.439 0.305 – 1.44 0.15 – 0.707 0.306 – 2.33 0.02   
Public 0.246 0.191 1.29 0.20 0.213 0.247 0.85 0.39   
Addon – 0.089 0.247 – 0.36 0.72 – 0.371 0.228 – 1.63 0.10   
γ 3.910 0.332 11.76 0.00 1.263 0.097 13.02 0.00   

Sex = indicator of female, that is, 1 for female, 0 for male; age = age in years; hsat = health satisfaction, on a scale of 0–10; handper = degree of handicap in percent, 0–100; univ = binary indicator that highest schooling degree is university; public = indicator of public health insurance; addon = indicator of add-on insurance.

Table 5.

Weighted scores (WS) estimates and standard errors (SEs) for the utilization of health care count data

 CL1 estimates of the dependence parameters
 
Marginal ρ12 ρ13 ρ14 ρ15 ρ23 ρ24 ρ25 ρ34 ρ35 ρ45 
NB1 0.46 0.42 0.29 0.24 0.31 0.32 0.36 0.29 0.21 0.28 
NB2 0.46 0.41 0.32 0.25 0.31 0.35 0.37 0.31 0.23 0.28 
WS estimates of the univariate parameters 
NB1 NB2   
Covariates Estimate Robust SE Z Pr > |ZEstimate Robust SE Z Pr > |Z  
Intercept 0.343 0.294 1.17 0.24 0.635 0.367 1.76 0.08   
Sex 0.373 0.091 4.10 0.00 0.248 0.107 2.28 0.02   
Age 0.018 0.004 4.14 0.00 0.018 0.005 3.55 0.00   
Hsat – 0.115 0.014 – 8.33 0.00 – 0.147 0.019 – 7.60 0.00   
Handper 0.006 0.001 5.25 0.00 0.006 0.002 3.22 0.00   
Univ – 0.439 0.305 – 1.44 0.15 – 0.707 0.306 – 2.33 0.02   
Public 0.246 0.191 1.29 0.20 0.213 0.247 0.85 0.39   
Addon – 0.089 0.247 – 0.36 0.72 – 0.371 0.228 – 1.63 0.10   
γ 3.910 0.332 11.76 0.00 1.263 0.097 13.02 0.00   
 CL1 estimates of the dependence parameters
 
Marginal ρ12 ρ13 ρ14 ρ15 ρ23 ρ24 ρ25 ρ34 ρ35 ρ45 
NB1 0.46 0.42 0.29 0.24 0.31 0.32 0.36 0.29 0.21 0.28 
NB2 0.46 0.41 0.32 0.25 0.31 0.35 0.37 0.31 0.23 0.28 
WS estimates of the univariate parameters 
NB1 NB2   
Covariates Estimate Robust SE Z Pr > |ZEstimate Robust SE Z Pr > |Z  
Intercept 0.343 0.294 1.17 0.24 0.635 0.367 1.76 0.08   
Sex 0.373 0.091 4.10 0.00 0.248 0.107 2.28 0.02   
Age 0.018 0.004 4.14 0.00 0.018 0.005 3.55 0.00   
Hsat – 0.115 0.014 – 8.33 0.00 – 0.147 0.019 – 7.60 0.00   
Handper 0.006 0.001 5.25 0.00 0.006 0.002 3.22 0.00   
Univ – 0.439 0.305 – 1.44 0.15 – 0.707 0.306 – 2.33 0.02   
Public 0.246 0.191 1.29 0.20 0.213 0.247 0.85 0.39   
Addon – 0.089 0.247 – 0.36 0.72 – 0.371 0.228 – 1.63 0.10   
γ 3.910 0.332 11.76 0.00 1.263 0.097 13.02 0.00   

Sex = indicator of female, that is, 1 for female, 0 for male; age = age in years; hsat = health satisfaction, on a scale of 0–10; handper = degree of handicap in percent, 0–100; univ = binary indicator that highest schooling degree is university; public = indicator of public health insurance; addon = indicator of add-on insurance.

Interestingly, the coefficient of hsat is negative and highly significant, indicating that families with high health satisfaction rating seem to be making less frequent trips to the doctors office. Further public and add-on health insurance has insignificant effect on the doctor visits as expected since families are more likely to make doctor visits based on their health care needs as opposed to the type of insurance they carry.

7. DISCUSSION

In this paper, we have studied weighted scores as an estimating equations approach based on weighting the scores of the marginal distributions to account for the dependence in repeated or clustered measurements. The weighted scores method with fixed weight matrices leads to unbiased estimating equations if the univariate model is correct, and the efficiency depends on the choice of weight matrices. On the other hand, the ML equations also lead to unbiased estimating equations if the univariate model and the dependence are modeled correctly; however, the ML estimates could be biased if the univariate model is correct but dependence is modeled incorrectly. Hence, the weighted scores method is robust to dependence if the main interest is the univariate regression parameters.

Some multivariate models are nearly indistinguishable from each other based on the moderate level of dependence in response variables and sample sizes usually seen for real data. If there is strong dependence, different copula models can be more easily discriminated and there might be a better working model than discretized MVN. In terms of the Akaike information criteria, our empirical experience is that the MVN copula model with discrete margins provides the best or nearly the best fit, so the working model based on the discretized MVN distribution to compute weight matrices can satisfactorily account for the dependence in the data.

FUNDING

Natural Sciences and Engineering Research Council Canada, Discovery Grant 8698.

Thanks to the associate editor and 2 referees for comments leading to an improved presentation. Conflict of Interest: None declared.

References

Cameron
AC
Trivedi
PK
Regression Analysis of Count Data
1998
Cambridge
Cambridge University Press
Chaganty
NR
Joe
H
Efficiency of generalized estimating equations for binary responses
Journal of the Royal Statistical Society, Series B
2004
, vol. 
66
 (pg. 
851
-
860
)
Chaganty
NR
Joe
H
Range of correlation matrices for dependent Bernoulli random variables
Biometrika
2006
, vol. 
93
 (pg. 
197
-
206
)
Chib
S
Winkelmann
R
Markov chain Monte Carlo analysis of correlated count data
Journal of Business & Economic Statistics
2001
, vol. 
19
 (pg. 
428
-
435
)
Goodman
LA
Kruskal
WH
Measures of association for cross classifications
Journal of the American Statistical Association
1954
, vol. 
49
 (pg. 
732
-
764
)
Greene
W
Functional forms for the negative binomial model for count data
Economic Letters
2008
, vol. 
99
 (pg. 
585
-
590
)
Joe
H
Multivariate Models and Dependence Concepts
1997
London
Chapman & Hall
Joe
H
Hu
T
Multivariate distributions from mixtures of max-infinitely divisible distributions
Journal of Multivariate Analysis
1996
, vol. 
57
 (pg. 
240
-
265
)
Johnson
NL
Kotz
S
Continuous Multivariate Distributions
1972
New York
Wiley
Lawless
JF
Negative binomial and mixed Poisson regression
The Canadian Journal of Statistics
1987
, vol. 
15
 (pg. 
209
-
225
)
Lee
Y
Nelder
JA
Likelihood inference for models with unobservables: another view
Statistical Science
2009
, vol. 
24
 (pg. 
255
-
269
)
Liang
KY
Zeger
SL
Longitudinal data analysis using generalized linear models
Biometrika
1986
, vol. 
73
 (pg. 
13
-
22
)
Lindsey
JK
Lambert
P
On the appropriateness of marginal models for repeated measurements in clinical trials
Statistics in Medicine
1998
, vol. 
17
 (pg. 
447
-
469
)
Nikoloulopoulos
AK
Karlis
D
Finite normal mixture copulas for multivariate discrete data modeling
Journal of Statistical Planning and Inference
2009
, vol. 
139
 (pg. 
3878
-
3890
)
Nikoloulopoulos
AK
Karlis
D
Regression in a copula model for bivariate count data
Journal of Applied Statistics
2010
, vol. 
37
 (pg. 
1555
-
1568
)
Riphahn
RT
Wambach
A
Million
A
Incentive effects in the demand for health care: a bivariate panel count data estimation
Journal of Applied Econometrics
2003
, vol. 
18
 (pg. 
387
-
405
)
Sabo
RT
Chaganty
NR
What can go wrong when ignoring correlation bounds in the use of generalized estimating equations
Statistics in Medicine
2010
, vol. 
29
 (pg. 
2501
-
2507
)
Solis-Trapala
IL
Farewell
VT
Regression analysis of overdispersed correlated count data with subject specific covariates
Statistics in Medicine
2005
, vol. 
24
 (pg. 
2557
-
2575
)
Thall
PF
Vail
SC
Some covariance models for longitudinal count data with overdispersion
Biometrics
1990
, vol. 
46
 (pg. 
657
-
671
)
Varin
C
On composite marginal likelihoods
Advances in Statistical Analysis
2008
, vol. 
92
 (pg. 
1
-
28
)
Zhao
Y
Joe
H
Composite likelihood estimation in multivariate data analysis
The Canadian Journal of Statistics
2005
, vol. 
33
 (pg. 
335
-
356
)