L. Liu, M. G. Hudgens, S. Becker-Dreps, On inverse probability-weighted estimators in the presence of interference, Biometrika, Volume 103, Issue 4, December 2016, Pages 829–842, https://doi.org/10.1093/biomet/asw047
We consider inference about the causal effect of a treatment or exposure in the presence of interference, i.e., when one individual’s treatment affects the outcome of another individual. In the observational setting where the treatment assignment mechanism is not known, inverse probability-weighted estimators have been proposed when individuals can be partitioned into groups such that there is no interference between individuals in different groups. Unfortunately this assumption, which is sometimes referred to as partial interference, may not hold, and moreover existing weighted estimators may have large variances. In this paper we consider weighted estimators that could be employed when interference is present. We first propose a generalized inverse probability-weighted estimator and two Hájek-type stabilized weighted estimators that allow any form of interference. We derive their asymptotic distributions and propose consistent variance estimators assuming partial interference. Empirical results show that one of the Hájek estimators can have substantially smaller finite-sample variance than the other estimators. The different estimators are illustrated using data on the effects of rotavirus vaccination in Nicaragua.
1. Introduction
In causal inference it is often assumed that there is no interference between individuals, i.e., that the treatment of one individual does not affect the outcome of another. However, this assumption may not hold. For instance, in infectious disease studies, the vaccination status of one individual may affect whether another individual becomes infected (Halloran & Struchiner, 1995). Similarly, encouraging one individual to vote may increase the likelihood that another individual in the same household will vote (Nickerson, 2008). Interference may also occur between students in the same classroom (Hong & Raudenbush, 2006) or between households in the same neighbourhood (Sobel, 2006), and in myriad other contexts (Rosenbaum, 2007; Luo et al., 2012; Manski, 2013).
Inference in the presence of interference is interesting, because a treatment may have multiple types of effects, but difficult, because individuals may have many potential outcomes. Recently, methods have been developed for the setting where individuals can be partitioned into groups such that there may be interference between individuals in the same group but not between individuals in different groups; this is sometimes called partial interference (Sobel, 2006). Assuming partial interference, Hudgens & Halloran (2008) defined the direct, indirect, total and overall causal effects of a treatment in randomized studies. Inference about these types of causal effects has subsequently been considered by VanderWeele & Tchetgen Tchetgen (2011), VanderWeele et al. (2012), Halloran & Hudgens (2012), Liu & Hudgens (2014) and P. M. Aronow and C. Samii in an unpublished 2013 paper (arXiv:1305.6156), among others. For observational settings where the treatment assignment mechanism is not known, Tchetgen Tchetgen & VanderWeele (2012) proposed inverse probability-weighted estimators of these causal effects based on group-level propensity scores. These weighted estimators can be viewed as a generalization of the usual inverse probability-weighted estimator of the causal effect of a treatment in the absence of interference. However, in general, weighted estimators are known to have relatively large variance. Additionally, in some settings the partial interference assumption may be dubious. In this article we consider alternative weighted-type estimators that allow for general forms of interference and tend to be less variable.
2. Preliminaries
Consider a finite population of |$n$| individuals, and suppose that each individual may receive some treatment or exposure. Let |$Z_i$| (|$i=1,\ldots,n$|) be the random variable such that |$Z_i=1$| if individual |$i$| received treatment and |$Z_i=0$| otherwise. Suppose that interference may be present between the |$n$| individuals, and define the interference set |$\chi_i=\{i_1,i_2,\ldots\}$| for individual |$i$| to be an ordered set of all other individuals whose treatment received might affect the outcome of individual |$i$|. Assume that there is no interference between individual |$i$| and individuals not in |$\chi_i$|. There may or may not be interference between individual |$i$| and individuals in |$\chi_i$|. A central goal of the inferential methods described below is to quantify the extent to which such interference is present. Let |$S_i=(Z_{i_1},Z_{i_2},\ldots)$| denote the vector of treatment indicators for individuals that possibly interfere with individual |$i$|; that is, the outcome of individual |$i$| is allowed to depend not only on |$Z_i$| but also on |$S_i$|. For example, if the outcome of individual 1 possibly depends on their own treatment status as well as that of individuals 2 and 3 but not on that of individuals |$4,\ldots,n$|, then |$\chi_1=\{2,3\}$| and |$S_1=(Z_2,Z_3)$|. The interference sets |$\chi_1,\ldots,\chi_n$| are assumed to be known a priori. Denote possible values of |$Z_i$| and |$S_i$| by |$z_i$| and |$s_i$|. Let |$y_i(z_i,s_i)$| denote the potential outcome of individual |$i$| if they receive treatment |$z_i$| and their interference set receives |$s_i$|. This potential outcome notation is general enough to encompass any possible interference structure, of which partial interference is a special case. Let |$Y_i=y_i(Z_i, S_i)$| denote the observed outcome. 
The potential outcomes |$y_i(z_i,s_i)$| are assumed to be deterministic functions of |$z_i$| and |$s_i$|, and the observed outcome |$Y_i$| is considered to be random because it depends on the random variables |$Z_i$| and |$S_i$|. Let |$\sum S_i$| be the sum over all the components of |$S_i$|, and let |$|S_i|$| denote the dimension of the vector |$S_i$|. For example, if |$S_1=(Z_2,Z_3)$|, then |$\sum S_1=Z_2+Z_3$| and |$|S_1|=2$|.
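As a concrete illustration of this notation, the following Python sketch builds |$S_1$|, |$\sum S_1$| and |$|S_1|$| for the example above. The helper `interference_vector` and the treatment values in `z` are made up for illustration and are not part of the original article.

```python
from typing import List, Sequence


def interference_vector(z: Sequence[int], chi_i: Sequence[int]) -> List[int]:
    """Return S_i, the treatment indicators of the individuals listed in chi_i.

    Individuals are labelled 1..n as in the text, so label j maps to z[j - 1].
    """
    return [z[j - 1] for j in chi_i]


# Hypothetical population of n = 5 with made-up treatment indicators Z_1..Z_5.
z = [1, 0, 1, 1, 0]
chi = {1: [2, 3]}  # chi_1 = {2, 3}, as in the example in the text

s_1 = interference_vector(z, chi[1])  # S_1 = (Z_2, Z_3)
sum_s_1 = sum(s_1)                    # sum of the components of S_1
dim_s_1 = len(s_1)                    # |S_1|, the dimension of S_1
```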
In the absence of interference, a common causal estimand is the average treatment effect, which contrasts the average outcome for the counterfactual scenario where every individual in the population is treated with that of the counterfactual scenario where every individual in the population is not treated. Similarly, in the presence of interference, causal estimands can be defined in terms of counterfactual scenarios corresponding to different treatment allocation strategies (e.g., Hong & Raudenbush, 2006; Sobel, 2006; Hudgens & Halloran, 2008; Tchetgen Tchetgen & VanderWeele, 2012). For example, the indirect effect, defined formally below, contrasts average outcomes of untreated individuals for the counterfactual scenario where one allocation strategy is adopted in the population with those for the counterfactual scenario where some other allocation strategy is adopted in the population. Such estimands quantify interference, if present, at the population level and can be used to inform policy decisions regarding a treatment or exposure. The allocation strategy of interest will in general depend on the setting.
Here we consider Bernoulli allocation strategies proposed by Tchetgen Tchetgen & VanderWeele (2012), where strategy |$\alpha$| corresponds to the counterfactual scenario in which individuals independently receive treatment with probability |$\alpha$|. It is not assumed that the observed treatment indicators |$Z_1,\ldots,Z_n$| are independent Bernoulli random variables; rather, the distribution of treatment under Bernoulli allocation is used below to define the counterfactual estimands of interest. By analogy, direct standardization of mortality rates could entail using the 2010 United States census age distribution, which may differ from the age distribution giving rise to the observed data. Corresponding to Bernoulli allocation, let |$\pi(s_i;\alpha)=\alpha^{\sum s_i}(1-\alpha)^{|s_i|-\sum s_i}$| denote the probability of the interference set for individual |$i$| receiving treatment |$s_i$| under allocation strategy |$\alpha$|. Let |$\pi(z_i;\alpha)=\alpha^{z_i}(1-\alpha)^{1-z_i}$| and |$\pi(z_i,s_i;\alpha)=\pi(z_i;\alpha)\pi(s_i;\alpha)$| denote, respectively, the probability of individual |$i$| receiving treatment |$z_i$| and the probability of individual |$i$| together with their interference set receiving joint treatment |$(z_i,s_i)$| under allocation strategy |$\alpha$|. Define |$\bar{y}_{i}(z,\alpha)=\sum_{s_i}y_{i}(z_{i}=z,s_i)\pi(s_i;\alpha)$| to be the average potential outcome of individual |$i$| under allocation strategy |$\alpha$|, where the summation is over all |$2^{|S_i|}$| possible values of |$s_i$|. Returning to the example where |$S_1=(Z_2,Z_3)$|, the average potential outcome of individual 1 is a weighted average of potential outcomes under different combinations of treatment |$Z_1=z$| and |$(Z_2, Z_3)\in\{(0,0),(0,1),(1,0),(1,1)\}$|, with the weights being the corresponding probabilities under Bernoulli allocation.
Averaging over all individuals, define the population average potential outcome as |$\bar{y}(z,\alpha)=\sum_{i=1}^{n}\bar{y}_{i}(z,\alpha)/n$|. Similarly, define the marginal average potential outcome for individual |$i$| under allocation strategy |$\alpha$| by |$\bar{y}_{i}(\alpha)=\sum_{z_i,s_{i}}y_{i}(z_i,s_{i})\pi(z_i,s_{i};\alpha)$| and define the population marginal average potential outcome as |$\bar{y}(\alpha)=\sum_{i=1}^{n}\bar{y}_{i}(\alpha)/n$|.
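The definitions of |$\pi(s_i;\alpha)$| and |$\bar{y}_i(z,\alpha)$| can be checked numerically with a short Python sketch. The function names and the linear potential outcome function `y_1` are hypothetical, chosen only to make the enumeration over the |$2^{|S_i|}$| values of |$s_i$| explicit for the example with |$S_1=(Z_2,Z_3)$|.

```python
from itertools import product
from typing import Callable, Sequence


def pi_s(s: Sequence[int], alpha: float) -> float:
    """Bernoulli allocation probability alpha^(sum s) * (1 - alpha)^(|s| - sum s)."""
    k = sum(s)
    return alpha ** k * (1.0 - alpha) ** (len(s) - k)


def avg_potential_outcome(
    y_i: Callable[[int, Sequence[int]], float], z: int, dim_s: int, alpha: float
) -> float:
    """ybar_i(z, alpha): sum of y_i(z, s) * pi(s; alpha) over all s in {0, 1}^dim_s."""
    return sum(y_i(z, s) * pi_s(s, alpha) for s in product((0, 1), repeat=dim_s))


def y_1(z: int, s: Sequence[int]) -> float:
    """Hypothetical potential outcome function for individual 1 (an assumption)."""
    return 5.0 + 3.0 * z + 2.0 * sum(s)


# Individual 1 treated (z = 1), interference vector of dimension 2, alpha = 0.5.
ybar_1 = avg_potential_outcome(y_1, z=1, dim_s=2, alpha=0.5)

# The allocation probabilities over all four values of (Z_2, Z_3) sum to one.
total_prob = sum(pi_s(s, 0.5) for s in product((0, 1), repeat=2))
```

Under this made-up outcome function the four potential outcomes 8, 10, 10 and 12 each receive weight 0.25 when |$\alpha=0{\cdot}5$|.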
Various causal effects can be defined by contrasts in the population average potential outcomes. In particular, define the direct effect of treatment under allocation strategy |$\alpha$| to be |$\overline{\mathrm{DE}}(\alpha)=g\{\bar{y}(1,\alpha),\bar{y}(0,\alpha)\}$|, where |$g(\cdot\,,\cdot)$| is some continuous contrast function. A commonly used contrast function is |$g(x_1,x_0)=x_1-x_0$|; in vaccine trials with a binary outcome it is typical to use |$g(x_1,x_0)=1-x_1/x_0$|. The direct effect compares the average potential outcomes when an individual receives treatment versus not under allocation strategy |$\alpha$|. For two allocation strategies |$\alpha_1$| and |$\alpha_0$|, let |$\overline{\mathrm{IE}}(\alpha_1,\alpha_0)=g\{\bar{y}(0,\alpha_1),\bar{y}(0,\alpha_0)\}$| be the indirect or spillover effect, which contrasts average potential outcomes when individuals do not receive treatment under different allocation strategies. In the context of vaccines, the indirect effect is sometimes referred to as herd immunity and describes the effect of the proportion of individuals vaccinated, e.g., 30% versus 50%, on the average outcome among unvaccinated individuals. An indirect effect can also be defined for when individuals receive treatment, |$z=1$|, but for simplicity we do not consider such indirect effects here. The total effect |$\overline{\mathrm{TE}}(\alpha_1,\alpha_0)=g\{\bar{y}(1,\alpha_1),\bar{y}(0,\alpha_0)\}$| incorporates both direct and indirect effects, and reflects the difference between the average potential outcomes when individuals receive treatment under one allocation strategy versus when they go without treatment under another allocation strategy. Finally, define |$\overline{\mathrm{OE}}(\alpha_1,\alpha_0)=g\{\bar{y}(\alpha_1),\bar{y}(\alpha_0)\}$| to be the overall effect, which describes the contrast in average outcomes under one allocation strategy relative to another.
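A minimal Python sketch of the four estimands, using the difference contrast |$g(x_1,x_0)=x_1-x_0$|, is given here. The three-person population, its outcome function and the helper names are assumptions made for illustration; the marginal average uses the identity |$\bar{y}(\alpha)=\alpha\bar{y}(1,\alpha)+(1-\alpha)\bar{y}(0,\alpha)$|, which follows from |$\pi(z_i,s_i;\alpha)=\pi(z_i;\alpha)\pi(s_i;\alpha)$|.

```python
from itertools import product


def pi_s(s, alpha):
    """Bernoulli allocation probability of the interference-set treatment vector s."""
    k = sum(s)
    return alpha ** k * (1 - alpha) ** (len(s) - k)


def ybar(y_funcs, dims, z, alpha):
    """Population average potential outcome ybar(z, alpha)."""
    per_individual = [
        sum(y(z, s) * pi_s(s, alpha) for s in product((0, 1), repeat=d))
        for y, d in zip(y_funcs, dims)
    ]
    return sum(per_individual) / len(per_individual)


def ybar_marginal(y_funcs, dims, alpha):
    """Population marginal average potential outcome ybar(alpha)."""
    return alpha * ybar(y_funcs, dims, 1, alpha) + (1 - alpha) * ybar(y_funcs, dims, 0, alpha)


def g(x1, x0):
    """Difference contrast."""
    return x1 - x0


# Hypothetical population: 3 individuals, each with the other two as interference set.
y_funcs = [lambda z, s: 5 + 3 * z + 2 * sum(s)] * 3
dims = [2, 2, 2]
a1, a0 = 0.5, 0.1

de = g(ybar(y_funcs, dims, 1, a1), ybar(y_funcs, dims, 0, a1))   # direct effect at a1
ie = g(ybar(y_funcs, dims, 0, a1), ybar(y_funcs, dims, 0, a0))   # indirect effect
te = g(ybar(y_funcs, dims, 1, a1), ybar(y_funcs, dims, 0, a0))   # total effect
oe = g(ybar_marginal(y_funcs, dims, a1), ybar_marginal(y_funcs, dims, a0))  # overall effect
```

With the difference contrast, the total effect in this toy population equals the direct effect at |$\alpha_1$| plus the indirect effect.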
3. Inverse probability-weighted and Hájek-type estimators
In this section we propose inverse probability-weighted and Hájek-type estimators which allow for general interference; that is, no assumption is made regarding the structure or form of interference that might be present. When there is partial interference and the groups are of the same size, the inverse probability-weighted estimators defined below reduce to those proposed by Tchetgen Tchetgen & VanderWeele (2012). Aronow and Samii (arXiv:1305.6156) considered similar estimators in the setting where interference may be present, but where treatment is assigned randomly according to a known experimental design.
If |$f(Z_i,S_i\mid l_i,l_{\chi_i})$| is known for all |$i$|, then |$E\{\hat{Y}^{\text{ipw}}(z,\alpha)\}=\bar{y}(z,\alpha)$| and |$E\{\hat{Y}^{\text{ipw}}(\alpha)\}=\bar{y}(\alpha)$|.
Note that |$\hat{n}_{2z}$|, |$\hat{n}_1$| and |$\hat{n}_{2}$| depend on |$\alpha$|, but we suppress this dependence for notational convenience. In what follows, |$\hat{Y}^{\text{haj}}_{1}(\cdot)$| and |$\hat{Y}^{\text{haj}}_{2}(\cdot)$| will be referred to as the Hájek 1 and Hájek 2 estimators.
An appealing property of |$\hat{Y}^{\text{haj}}_2(z,\alpha)$| and |$\hat{Y}^{\text{haj}}_2(\alpha)$| is the preservation of the bounds of the potential outcome |$y_i(\cdot)$|. Specifically, suppose there exist constants |$m_l$| and |$m_u$| such that |$m_l\leq y_i(\cdot)\leq m_u$| |$(i=1,\ldots,n)$|; then |$m_l\leq \hat{Y}^{\text{haj}}_2(z,\alpha)\leq m_u$| and |$m_l\leq \hat{Y}^{\text{haj}}_2(\alpha)\leq m_u$|. For example, if |$y_i(\cdot)$| is binary, then |$\hat{Y}^{\text{haj}}_2(z,\alpha), \hat{Y}^{\text{haj}}_2(\alpha)\in [0,1]$|. In contrast, preservation of the bounds is not guaranteed for |$\hat{Y}^{\text{ipw}}(\cdot)$| or |$\hat{Y}^{\text{haj}}_1(\cdot)$|.
Another attractive property of the Hájek 2 estimators is preservation of linear transformations of the outcome. In particular, suppose that the observed outcomes |$Y_i$| are transformed by the function |$\mathcal{L}(x)=ax+b$| |$(a,b\in \mathbb{R})$|. Then Hájek 2 estimators computed using the transformed responses will equal |$\mathcal{L}\{\hat{Y}^{\text{haj}}_2(z,\alpha)\}$| and |$\mathcal{L}\{\hat{Y}^{\text{haj}}_2(\alpha)\}$|, where |$\hat{Y}^{\text{haj}}_2(z,\alpha)$| and |$\hat{Y}^{\text{haj}}_2(\alpha)$| are computed on the original, untransformed observed outcomes. In contrast, the inverse probability-weighted and Hájek 1 estimators have this property only when |$b=0$|.
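The two properties just described can be illustrated with a small Python sketch. The estimator forms used here are an assumption, since only their properties are stated in this section: `ipw_estimate` divides the weighted sum of outcomes by |$n$|, whereas `hajek_estimate` divides by the sum of the same weights, which is the ratio structure suggested by |$\hat N_{2v}$| in |$\S$| 4. The outcomes and weights are made up; a weight of zero stands for an individual whose treatment differs from |$z$|.

```python
def ipw_estimate(y, w):
    """Unnormalized weighted mean: sum(w_i * y_i) / n."""
    return sum(wi * yi for wi, yi in zip(w, y)) / len(y)


def hajek_estimate(y, w):
    """Ratio-type weighted mean: sum(w_i * y_i) / sum(w_i)."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


# Made-up binary outcomes and nonnegative inverse-probability-type weights.
y = [0, 1, 1, 0, 1, 0]
w = [0.0, 4.0, 0.0, 0.5, 4.0, 0.0]

ipw = ipw_estimate(y, w)      # can fall outside [0, 1]
haj = hajek_estimate(y, w)    # a convex combination of the y_i, so stays in [0, 1]

# Linear transformation L(x) = a * x + b applied to the outcomes.
a, b = 2.0, 10.0
y_t = [a * yi + b for yi in y]

ipw_t = ipw_estimate(y_t, w)
haj_t = hajek_estimate(y_t, w)
```

Because the ratio form is a weighted average with weights summing to one, it commutes with |$\mathcal{L}$|; the unnormalized form picks up the extra term |$b\sum_i w_i/n$|, which equals |$b$| only when the weights happen to sum to |$n$|.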
Define |$\hat{\mathrm{DE}}^{\text{ipw}}(\alpha)=g\{\hat{Y}^{\text{ipw}}(1,\alpha),\hat{Y}^{\text{ipw}}(0,\alpha)\}$| to be the inverse probability-weighted estimator of the direct effect. Define |$\hat{\mathrm{IE}}^{\text{ipw}}(\alpha_1,\alpha_0)=g\{\hat{Y}^{\text{ipw}}(0,\alpha_1),\hat{Y}^{\text{ipw}}(0,\alpha_0)\}$|, |$\hat{\mathrm{TE}}^{\text{ipw}}(\alpha_1,\alpha_0)=g\{\hat{Y}^{\text{ipw}}(1,\alpha_1),\hat{Y}^{\text{ipw}}(0,\alpha_0)\}$| and |$\hat{\mathrm{OE}}^{\text{ipw}}(\alpha_1,\alpha_0)=g\{\hat{Y}^{\text{ipw}}(\alpha_1),\hat{Y}^{\text{ipw}}(\alpha_0)\}$| to be the weighted estimators of the indirect, total and overall effects. Hájek-type causal effect estimators are defined similarly. For example, define Hájek-type estimators of the direct effect by |$\hat{\mathrm{DE}}^{\text{haj}}_{h}(\alpha)=g\{\hat{Y}^{\text{haj}}_{h}(1,\alpha),\hat{Y}^{\text{haj}}_{h}(0,\alpha)\}$| |$(h=1,2)$|. If the contrast function is |$g(x_1,x_0)=x_1-x_0$|, then by the property described in the preceding paragraph, the values of Hájek 2 causal effect estimators are invariant under location shift. This is not the case for the inverse probability-weighted and Hájek 1 causal effect estimators.
4. Asymptotic distributions
In this section the large-sample properties of the inverse probability-weighted and Hájek-type estimators are derived assuming partial interference. In particular, assume that individuals can be partitioned into groups such that there is no interference between individuals in different groups. Within groups no additional structure is assumed regarding interference, so there may be interference between any two individuals within a group. That is, we assume the following.
Assumption 1. There exists a partition |$\{C_v\}_{v=1}^m$| of |$\{1,\ldots,n\}$| such that |$\chi_i = C_v \setminus \{i\}$| (|$i \in C_v$|; |$v=1,\ldots,m$|).
Let |$N_v=|C_v|$| denote the number of individuals in group |$v$|. Let |$Y_{vi}$| denote the observed outcome for individual |$i$| in group |$v$|, and write |$\tilde Y_v =(Y_{v1},\ldots,Y_{vN_v})$|. Let |$L_{vi}$| and |$Z_{vi}$| denote the observed covariates and treatment for individual |$i$| in group |$v$|, and define |$\tilde L_v$| and |$\tilde Z_v$| analogously to |$\tilde Y_v$|. Assume that |$N_v$| is one of the baseline covariates included in |$L_{vi}$|.
To derive the large-sample properties of the inverse probability-weighted and Hájek-type estimators, assume that the |$m$| groups are a random sample from an infinite superpopulation of groups such that the observable random variables |$(\tilde Y_v,\tilde Z_v,\tilde L_v)$| |$(v=1,\ldots,m)$| are independent and identically distributed. Let |$F$| denote the distribution function of |$(\tilde Y_v,\tilde Z_v,\tilde L_v)$|.
Let |$Y_{vi}(z,s)$| denote the potential outcome for individual |$i$| in group |$v$|, where |$z$| denotes treatment received by individual |$i$| and |$s$| denotes the vector of treatment indicators for all other individuals in group |$v$|. Unlike in |$\S\S$| 2 and 3, here the potential outcomes are considered random variables because of the assumed random sampling of the |$m$| groups from a superpopulation. Denote the observed outcome for individual |$i$| by |$Y_{vi} =Y_{vi}(Z_{vi}, S_{vi})$|, where |$S_{vi}$| is the subvector of |$\tilde Z_v$| with |$Z_{vi}$| removed. Note that |$S_{vi}$| is a function of |$\tilde Z_v$|, which for notational simplicity is left implicit. Assume conditional exchangeability, i.e., |$Y_{vi}(z,s) \perp\!\!\!\!\perp \tilde Z_v \mid \tilde L_v$|, where |$X_1 \perp\!\!\!\!\perp X_2 \mid X_3$| means that |$X_1$| and |$X_2$| are independent conditional on |$X_3$|.
Let |$\mu_{z\alpha}$| be the solution to |$\int G^0_{z\alpha}(\tilde y_v,\tilde z_v,\tilde l_v;\mu_{z\alpha}) \,{\rm d} F(\tilde y_v,\tilde z_v,\tilde l_v) = 0$|. It is straightforward to show that |$\mu_{z\alpha}=k^{-1} E\{ \sum_{i=1}^{N_v} \bar Y_{vi}(z,\alpha)\}$|, where |$ k= E(N_v) $| is the mean group size in the superpopulation and |$\bar Y_{vi}(z,\alpha) = \sum_s Y_{vi}(z,s) \pi(s;\alpha)$|, with the summation being taken over all vectors |$s \in \{0,1\}^{N_v-1}$|. If |$Y_{v}^{\ast}(z,\alpha)\perp N_v$| where |$Y_{v}^{\ast}(z,\alpha)=\sum_{i=1}^{N_v} \bar Y_{vi}(z,\alpha)/N_v$|, i.e., if the average potential outcome within a group is independent of the number of individuals within the group, then |$\mu_{z\alpha}=E\{Y_{v}^{\ast}(z,\alpha)\}$|. In other words, |$\mu_{z\alpha}$| is the mean group average potential outcome in the superpopulation, analogous to |$\bar y(z,\alpha)$| defined in |$\S$| 2. Define the direct effect in the superpopulation by |$\overline{\mathrm{DE}}(\alpha) = g(\mu_{1\alpha},\mu_{0\alpha})$|; the indirect, total and overall effects in the superpopulation can be defined analogously.
It is straightforward to show that |$\mu_{z\alpha}$| also satisfies |$\int G^h_{z\alpha}(\tilde y_v,\tilde z_v,\tilde l_v;\mu_{z\alpha}) \,{\rm d} F(\tilde y_v,\tilde z_v,\tilde l_v) = 0$| |$(h=1,2)$|.
The asymptotic distributions of the inverse probability-weighted and Hájek-type estimators can be derived from standard estimating equation theory (Stefanski & Boos, 2002; Perez-Heydrich et al., 2014). For example, the proposition below establishes that the three direct effect estimators are asymptotically normal and gives closed-form expressions for the asymptotic variances when the propensity scores are known. The proposition entails the vector estimating equation |$G_{\alpha}^{h}(\tilde Y_v,\tilde Z_v,\tilde L_v;\theta)=\{G_{0\alpha}^{h}(\tilde Y_v,\tilde Z_v,\tilde L_v;\theta_1),G_{1\alpha}^{h}(\tilde Y_v,\tilde Z_v,\tilde L_v;\theta_2)\}^{\mathrm{\scriptscriptstyle T}}$|, where |$\theta=(\theta_1,\theta_2)$|.
A comparison between |$\Sigma_h^{\rm D}$| |$(h=1,2)$| and |$\Sigma_0^{\rm D}$| explains why the Hájek-type estimators can vary less than the inverse probability-weighted estimator. For example, suppose that the contrast |$g$| is the difference function. Denote |$G^0_{z\alpha}(\tilde Y_v,\tilde Z_v,\tilde L_v;\theta_0)$| by |$G^0_{z\alpha}$| and note that |$\Sigma_0^{\rm D}=E(G_{1\alpha}^0-G_{0\alpha}^0)^2/k^2$| and |$\Sigma_h^{\rm D}=E(G_{1\alpha}^0-G_{0\alpha}^0-W_h)^2/k^2$| |$(h=1,2)$|, where |$W_h=\mu_{1\alpha}\{\hat N_{hv}(1,\alpha)-N_v\}-\mu_{0\alpha}\{\hat N_{hv}(0,\alpha)-N_v\}$| with |$\hat N_{1v}(z,\alpha)=\sum_{i=1}^{N_v}1(Z_{vi}=z)/f(Z_{vi}\mid L_{vi})$| and |$\hat N_{2v}(z,\alpha)=\sum_{i=1}^{N_v}1(Z_{vi}=z)\pi(S_{vi};\alpha)/f(\tilde Z_{v}\mid \tilde L_v)$|. Thus, the Hájek estimators will have smaller asymptotic variance if and only if |${\rm{var}}(W_h)<2E\{(G_{1\alpha}^0-G_{0\alpha}^0)W_h\}$|, and so are expected to be less variable when |$G_{1\alpha}^0-G_{0\alpha}^0$| and |$W_h$| are strongly correlated. In the extreme scenario of |$Y_{vi}(z,s) = c_z$| |$(v=1,\ldots,m; \: i=1,\ldots,N_v)$|, we have |$W_2=G_{1\alpha}^0-G_{0\alpha}^0$| and |$\Sigma_2^{\rm D}=0$| but |$\Sigma_0^{\rm D}>0$| in general.
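The if-and-only-if condition follows by expanding the square in |$\Sigma_h^{\rm D}$|. The worked step below is a sketch that additionally uses |$E(W_h)=0$|, which holds because |$E\{\hat N_{hv}(z,\alpha)\mid \tilde L_v\}=N_v$| when the propensity scores are the true ones, so that |$E(W_h^2)={\rm var}(W_h)$|.

```latex
\begin{align*}
k^2\,\Sigma_h^{\rm D}
  &= E\{(G^0_{1\alpha}-G^0_{0\alpha}) - W_h\}^2 \\
  &= E(G^0_{1\alpha}-G^0_{0\alpha})^2
     - 2E\{(G^0_{1\alpha}-G^0_{0\alpha})W_h\} + E(W_h^2) \\
  &= k^2\,\Sigma_0^{\rm D}
     - 2E\{(G^0_{1\alpha}-G^0_{0\alpha})W_h\} + \mathrm{var}(W_h).
\end{align*}
```

Hence |$\Sigma_h^{\rm D}<\Sigma_0^{\rm D}$| exactly when |${\rm var}(W_h)<2E\{(G_{1\alpha}^0-G_{0\alpha}^0)W_h\}$|.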
In observational studies, the mechanism by which individuals select treatment is in general not known, so that |$f(\tilde Z_v\mid \tilde L_v)$| and |$f( Z_{vi}\mid L_{vi})$| must be estimated in order to construct inverse probability-weighted estimators. In practice, due to the curse of dimensionality, one might assume a parametric model for the propensity scores (Tchetgen Tchetgen & VanderWeele, 2012). Let |$G(\tilde Z_v,\tilde L_v;\gamma)$| denote the score function for the likelihood under the assumed propensity score model indexed by a finite-dimensional parameter vector |$\gamma$|, and let |$\gamma_0$| denote the true parameter value, which is the solution to |$\int G(\tilde z_v,\tilde l_v;\gamma)\,{\rm d} F(\tilde z_v,\tilde l_v;\gamma)=0$|. Now consider the vector estimating equation |$G^{\ast h}_{\alpha}(\tilde Y_v,\tilde Z_v,\tilde L_v;\theta)=\{G^{h}_{0\alpha}(\tilde Y_v,\tilde Z_v,\tilde L_v;\theta_1),G^{h}_{1\alpha}(\tilde Y_v,\tilde Z_v,\tilde L_v;\theta_2),G(\tilde Z_v,\tilde L_v;\theta_3)\}^{\mathrm{\scriptscriptstyle T}}$| |$(h=0,1,2)$|, where |$\theta=(\theta_1,\theta_2,\theta_3)$|.
Proposition 3 establishes the asymptotic normality of |$\hat{\mathrm{DE}}^{\text{ipw}}(\alpha)$|, |$\hat{\mathrm{DE}}^{\text{haj}}_1(\alpha)$| and |$\hat{\mathrm{DE}}^{\text{haj}}_2(\alpha)$| when the propensity score is correctly modelled. The asymptotic variance can be estimated consistently using empirical sandwich estimators, i.e., by replacing |$U_h^{\ast}$| and |$V_h^{\ast}$| with their empirical counterparts (Stefanski & Boos, 2002). In the Appendix the asymptotic variance of |$\hat{\mathrm{DE}}^{\text{ipw}}(\alpha)$| when the propensity score is estimated is shown to be no greater than when the propensity score is known. This is analogous to the well-known result about weighted estimators in the absence of interference; that is, even if the propensity scores are known, it is more efficient to use estimates of the propensity scores when computing inverse probability-weighted estimators. This relationship between the asymptotic variances when the propensity scores are known and when they are unknown but correctly modelled also holds for the Hájek-type estimators. Asymptotic normality of the indirect, total and overall effect estimators can be derived similarly.
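To illustrate what an empirical sandwich estimator looks like, the Python sketch below treats a deliberately simplified scalar case: a ratio-type weighted mean with known weights and the assumed group-level estimating function |$\psi_v(\mu)=\sum_i w_{vi}(Y_{vi}-\mu)$|. This is not the stacked estimating equation of the propositions, which also involves the second mean and the propensity score parameters; the function name and the toy data are hypothetical.

```python
from math import sqrt


def hajek_with_sandwich_se(groups):
    """Ratio-type weighted mean and its empirical sandwich standard error.

    groups: list of groups, each a list of (weight, outcome) pairs.
    Groups are treated as the independent units.
    """
    m = len(groups)
    sw = [sum(w for w, _ in g) for g in groups]        # sum of weights per group
    swy = [sum(w * y for w, y in g) for g in groups]   # weighted outcome sum per group
    mu = sum(swy) / sum(sw)                            # root of sum_v psi_v(mu) = 0

    psi = [a - mu * b for a, b in zip(swy, sw)]        # psi_v evaluated at mu
    bread = sum(sw) / m                                # minus the mean derivative of psi_v
    meat = sum(p * p for p in psi) / m                 # mean of psi_v squared
    var = meat / (bread ** 2 * m)                      # bread^-1 * meat * bread^-1 / m
    return mu, sqrt(var)


# Made-up data: three groups with (weight, outcome) pairs.
toy_groups = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(1.0, 6.0)],
    [(2.0, 3.0), (0.0, 9.0)],
]
mu_hat, se_hat = hajek_with_sandwich_se(toy_groups)
```

In the vector case the same recipe applies with the bread and meat replaced by matrices, followed by the delta method for the contrast |$g$|.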
5. Simulation study
A simulation study was conducted to investigate the bias, empirical standard error and average estimated standard error of the different estimators discussed in |$\S$| 4. In the simulations the inverse probability-weighted and Hájek-type effect estimators were computed using the true propensity score, an estimated propensity score based on a correct model, and an estimated propensity score based on a misspecified model. Simulations were conducted under partial interference, i.e., Assumption 1, for both continuous and binary outcomes. The simulation study for a continuous outcome was carried out in the steps described below.
Step 1. A random sample of |$m=500$| groups was created as follows. First, the group size |$N_v$| was randomly sampled from |$\{2,3,4,5,6\}$| with corresponding probabilities |$1/8, 1/8, 1/2, 3/16,1/16$|. For each individual in each group, |$\varepsilon_{vi}$| was randomly sampled from |${N}(0,1)$| |$(v=1,\ldots,m;\: i=1,\ldots,N_v)$|. Then the potential outcomes for individual |$i$| in group |$v$| were set to |$Y_{vi}(z_{vi},s_{vi})=5+3z_{vi}+2\sum s_{vi}+\varepsilon_{vi}$|.
Step 2. The covariate vectors |$L_{vi}=(L_{vi1},\ldots,L_{vi4})$| were randomly sampled from |$N(0,I_{4})$| |$(v=1,\ldots,m;\:i=1,\ldots,N_v)$|, where |$I_{4}$| denotes the |$4 \times 4$| identity matrix.
Step 3. Treatment variables |$Z_{vi}$| were simulated from a Bernoulli distribution with mean |$\mathrm{logit}^{-1}(\gamma_0+\gamma_1L_{vi1}+\gamma_2L_{vi2}+\gamma_3L_{vi3}+\gamma_4L_{vi4}+b_{v})$|, where the random effects |$b_v$| were randomly sampled from |$N(0,1)$| |$(v=1,\ldots,m)$| and |$(\gamma_0,\gamma_1,\gamma_2,\gamma_3,\gamma_4)=(0{\cdot}5,-1,0{\cdot}5,-0{\cdot}25,-0{\cdot}1)$|.
Step 4. A correctly specified logistic regression model |$\mathrm{logit}\{f(1\mid b,L)\}=\gamma_0+\gamma_1L_{vi1}+\gamma_2L_{vi2}+\gamma_3L_{vi3}+\gamma_4L_{vi4}+b_{v}$| and a misspecified logistic regression model |$\mathrm{logit}\{f(1\mid b,L)\}=\gamma_0+\gamma_1X_{vi1}+\gamma_2X_{vi2}+\gamma_3X_{vi3}+\gamma_4X_{vi4}+b_{v}$|, where |$X_{vi1}=\exp(L_{vi1}/2)$|, |$X_{vi2}=L_{vi2}/\{1+\exp(L_{vi1})\}+1$|, |$X_{vi3}=(L_{vi1}L_{vi3}/1{\cdot}5+0{\cdot}6)^3$| and |$X_{vi4}=(L_{vi1}+L_{vi4}+2)^2$|, were fitted to the simulated data.
Step 5. The causal effect estimators and their corresponding variance estimators were calculated for |$\alpha_1=0{\cdot}1,0{\cdot}5,0{\cdot}9$| and |$\alpha_0=0{\cdot}1$| using the known propensity score, the estimated propensity score from the correctly specified mixed-effects model and the estimated propensity score from the misspecified mixed-effects model.
Steps 1–5 were repeated |$10\,000$| times, and the empirical bias, empirical standard error and average estimated standard error were calculated for the estimators in Step 5.
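The data-generating part of this procedure (group sizes, errors, covariates, treatments and observed outcomes) can be sketched in Python using only the standard library. This is an illustrative reading of the description above, not the authors' code: the function name `simulate_groups`, the dictionary layout and the seeds are assumptions, and the model fitting and estimation steps are not attempted.

```python
import math
import random

GAMMA = (0.5, -1.0, 0.5, -0.25, -0.1)          # (gamma_0, ..., gamma_4)
SIZES = (2, 3, 4, 5, 6)                        # possible group sizes N_v
SIZE_PROBS = (1 / 8, 1 / 8, 1 / 2, 3 / 16, 1 / 16)


def simulate_groups(m=500, seed=0):
    """Simulate m groups: sizes, errors, covariates, treatments and outcomes."""
    rng = random.Random(seed)
    groups = []
    for _ in range(m):
        n_v = rng.choices(SIZES, weights=SIZE_PROBS)[0]
        eps = [rng.gauss(0.0, 1.0) for _ in range(n_v)]                    # epsilon_vi
        cov = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(n_v)]  # L_vi ~ N(0, I_4)
        b_v = rng.gauss(0.0, 1.0)                                          # group random effect

        z = []
        for l_vi in cov:
            lin = GAMMA[0] + sum(g * x for g, x in zip(GAMMA[1:], l_vi)) + b_v
            p = 1.0 / (1.0 + math.exp(-lin))                               # inverse logit
            z.append(1 if rng.random() < p else 0)

        # Observed outcome Y_vi = 5 + 3 Z_vi + 2 * (sum of the others' Z) + epsilon_vi.
        total = sum(z)
        y = [5 + 3 * z[i] + 2 * (total - z[i]) + eps[i] for i in range(n_v)]
        groups.append({"L": cov, "Z": z, "Y": y, "eps": eps, "b": b_v})
    return groups


sample = simulate_groups(m=50, seed=1)
```

Repeating the call with different seeds gives the replicate datasets on which the estimators would then be evaluated.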
From the potential outcome model specified in Step 1 it follows that |$\mu_{z \alpha} =5+3z+2(\eta-1)\alpha$| and |$\mu_{\alpha} = 5+(2\eta+1)\alpha$| where |$\eta = E(N_v^2)/E(N_v)$|. Hence |$\overline{\mathrm{DE}}(\alpha)=3$| for any |$\alpha \in (0,1)$|, |$\overline{\mathrm{IE}} (\alpha_1,\alpha_0)= 2(\eta-1)(\alpha_1-\alpha_0)$|, |$\overline{\mathrm{TE}}(\alpha_1,\alpha_0)=3+2(\eta-1)(\alpha_1-\alpha_0)$| and |$\overline{\mathrm{OE}}(\alpha_1,\alpha_0)=(2\eta+1)(\alpha_1-\alpha_0)$|. Simulation results for the direct effect estimators are given in Table 1. All three estimators are approximately unbiased when the propensity scores are known or correctly modelled, but are biased if the propensity scores are incorrectly modelled. For all three estimators the average estimated standard error is also relatively close to the empirical standard error when the propensity scores are known or correctly modelled. Note that |$\hat{\mathrm{DE}}^{\text{haj}}_2(\alpha)$| has substantially smaller empirical standard error than |$\hat{\mathrm{DE}}^{\text{ipw}}(\alpha)$| and |$\hat{\mathrm{DE}}^{\text{haj}}_1(\alpha)$|. For example, when |$\alpha=0{\cdot}1$| and the propensity scores are known, the empirical standard errors of |$\hat{\mathrm{DE}}^{\text{ipw}}(\alpha)$| and |$\hat{\mathrm{DE}}^{\text{haj}}_1(\alpha)$| are 1·4 and 1·5, whereas the empirical standard error of |$\hat{\mathrm{DE}}^{\text{haj}}_2(\alpha)$| is only 0·3. Similar results hold when the propensity scores are treated as unknown and either correctly or incorrectly modelled. The results in Table 1 demonstrate that, as well as having smaller empirical standard error, |$\hat{\mathrm{DE}}^{\text{haj}}_2(\alpha)$| may be more robust than |$\hat{\mathrm{DE}}^{\text{ipw}}(\alpha)$| and |$\hat{\mathrm{DE}}^{\text{haj}}_1(\alpha)$| with respect to misspecification of the propensity score model.
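The true values implied by these formulas can be evaluated exactly for the group-size distribution of Step 1. The Python sketch below uses exact rational arithmetic; the variable and function names are hypothetical, and the pair |$(\alpha_1,\alpha_0)=(0{\cdot}5,0{\cdot}1)$| is just one of the combinations considered in the simulation.

```python
from fractions import Fraction as F

SIZES = (2, 3, 4, 5, 6)
PROBS = (F(1, 8), F(1, 8), F(1, 2), F(3, 16), F(1, 16))

e_n = sum(n * p for n, p in zip(SIZES, PROBS))        # E(N_v)
e_n2 = sum(n * n * p for n, p in zip(SIZES, PROBS))   # E(N_v^2)
eta = e_n2 / e_n                                      # eta = E(N_v^2) / E(N_v)


def mu_z(z, alpha):
    """mu_{z alpha} = 5 + 3 z + 2 (eta - 1) alpha."""
    return 5 + 3 * z + 2 * (eta - 1) * alpha


def mu(alpha):
    """mu_alpha = 5 + (2 eta + 1) alpha."""
    return 5 + (2 * eta + 1) * alpha


a1, a0 = F(1, 2), F(1, 10)
de = mu_z(1, a1) - mu_z(0, a1)   # direct effect (difference contrast)
ie = mu_z(0, a1) - mu_z(0, a0)   # indirect effect
te = mu_z(1, a1) - mu_z(0, a0)   # total effect
oe = mu(a1) - mu(a0)             # overall effect
```

The marginal mean is consistent with the conditional ones, since |$\mu_\alpha=\alpha\mu_{1\alpha}+(1-\alpha)\mu_{0\alpha}$| gives |$5+3\alpha+2(\eta-1)\alpha=5+(2\eta+1)\alpha$|.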
Table 1. Simulation results for the direct effect estimators
Propensity score | Estimator | Bias, |$\alpha=0{\cdot}1$| | ESE, |$\alpha=0{\cdot}1$| | ASE, |$\alpha=0{\cdot}1$| | Bias, |$\alpha=0{\cdot}5$| | ESE, |$\alpha=0{\cdot}5$| | ASE, |$\alpha=0{\cdot}5$| | Bias, |$\alpha=0{\cdot}9$| | ESE, |$\alpha=0{\cdot}9$| | ASE, |$\alpha=0{\cdot}9$|
---|---|---|---|---|---|---|---|---|---|---
Known |$f$| | |$\hat{\mathrm{DE}}^{\text{ipw}}(\alpha)$| | 0·4 | 1·4 | 1·4 | 0·1 | 0·7 | 0·7 | 0·2 | 1·7 | 1·7
Known |$f$| | |$\hat{\mathrm{DE}}_1^{\text{haj}}(\alpha)$| | 0·6 | 1·5 | 1·5 | 0·1 | 0·6 | 0·6 | 0·5 | 1·6 | 1·6
Known |$f$| | |$\hat{\mathrm{DE}}_2^{\text{haj}}(\alpha)$| | 0·1 | 0·3 | 0·3 | 0·0 | 0·2 | 0·2 | 0·0 | 0·3 | 0·3
Correct |$f$| | |$\hat{\mathrm{DE}}^{\text{ipw}}(\alpha)$| | 4·1 | 1·5 | 1·4 | 0·7 | 0·6 | 0·6 | 7·3 | 1·3 | 1·3
Correct |$f$| | |$\hat{\mathrm{DE}}_1^{\text{haj}}(\alpha)$| | 3·8 | 1·5 | 1·5 | 0·5 | 0·6 | 0·8 | 7·6 | 1·3 | 1·5
Correct |$f$| | |$\hat{\mathrm{DE}}_2^{\text{haj}}(\alpha)$| | 0·2 | 0·3 | 0·3 | 0·8 | 0·2 | 0·2 | 0·5 | 0·3 | 0·3
Mis |$f$| | |$\hat{\mathrm{DE}}^{\text{ipw}}(\alpha)$| | 0·4 | 1e1 | 1e1 | 20 | 1e3 | 1e3 | 10 | 2e2 | 1e2
Mis |$f$| | |$\hat{\mathrm{DE}}_1^{\text{haj}}(\alpha)$| | 5·2 | 2e0 | 1e1 | 10 | 1e3 | 1e3 | 30 | 3e2 | 2e2
Mis |$f$| | |$\hat{\mathrm{DE}}_2^{\text{haj}}(\alpha)$| | 0·3 | 0·3 | 0·3 | 0·8 | 0·3 | 0·3 | 0·3 | 0·5 | 0·5
ESE, empirical standard error; ASE, average estimated standard error; Known |$f$|, true propensity score known; Correct |$f$|, propensity score unknown but correctly modelled; Mis |$f$|, propensity score incorrectly modelled.
The simulation study described above was repeated for a binary outcome. Specifically, Step 1 was replaced with the following, while all other steps remained the same.
A random sample of |$m=500$| groups was created as follows. First, the group size |$N_v$| was randomly sampled from |$\{2,3,4,5\}$| with corresponding probabilities |$1/8,1/8,1/2,1/4$|. Then the potential outcomes |$Y_{vi}(z_{vi},s_{vi})$| |$(v=1,\ldots,m;\: i=1,\ldots,N_v)$| were set to 0 with probability 0|$\cdot$|2, to 1 with probability 0|$\cdot$|2, and to |$1(Z_{vi}=1,\sum S_{vi}=|s_{vi}|)$| with probability 0|$\cdot$|6, the last being the indicator that individual |$i$| and all other members of group |$v$| are treated.
For this potential outcome model, |$\mu_{1 \alpha} =0{\cdot}6\lambda+0{\cdot}2$|, |$\mu_{0 \alpha}=0{\cdot}2$| and |$\mu_{\alpha} = 0{\cdot}6\alpha\lambda+0{\cdot}2$| with |$\lambda=E(\alpha^{N_v-1}N_v)/E(N_v)$|. Simulation results for this scenario are given in Table 2. Similar to the continuous outcome simulations, the empirical standard error for |$\hat{\mathrm{DE}}^{\text{haj}}_2(\alpha)$| is smaller than that for |$\hat{\mathrm{DE}}^{\text{ipw}}(\alpha)$| and |$\hat{\mathrm{DE}}^{\text{haj}}_1(\alpha)$| in all three scenarios, and |$\hat{\mathrm{DE}}^{\text{haj}}_2(\alpha)$| also tends to be more robust with respect to misspecification of the propensity score model than the other two estimators. Similar results, not shown here, were observed for the other causal effect estimators.
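The closed-form values above can be evaluated numerically for the three allocation levels used in the simulations. The following is a minimal sketch, not the authors' code: the function names `lam` and `true_means` are hypothetical, and the direct effect is taken to be |$\mu_{1\alpha}-\mu_{0\alpha}$|, which assumes the difference contrast |$g(x_1,x_0)=x_1-x_0$|.

```python
# Group-size distribution from the simulation design: P(N_v = n).
GROUP_SIZE_PROBS = {2: 1 / 8, 3: 1 / 8, 4: 1 / 2, 5: 1 / 4}


def lam(alpha):
    """lambda = E(alpha^(N_v - 1) * N_v) / E(N_v) for the design above."""
    numerator = sum(p * n * alpha ** (n - 1) for n, p in GROUP_SIZE_PROBS.items())
    denominator = sum(p * n for n, p in GROUP_SIZE_PROBS.items())
    return numerator / denominator


def true_means(alpha):
    """Return (mu_1alpha, mu_0alpha, mu_alpha) from the stated closed forms."""
    l = lam(alpha)
    mu1 = 0.6 * l + 0.2
    mu0 = 0.2
    mu = 0.6 * alpha * l + 0.2
    return mu1, mu0, mu


for alpha in (0.1, 0.5, 0.9):
    mu1, mu0, mu = true_means(alpha)
    # Direct effect under the (assumed) difference contrast g(x1, x0) = x1 - x0.
    de = mu1 - mu0
    print(f"alpha={alpha}: lambda={lam(alpha):.4f}, mu1={mu1:.4f}, "
          f"mu0={mu0:.4f}, mu={mu:.4f}, DE={de:.4f}")
```

As a sanity check, the three formulas satisfy |$\mu_{\alpha}=\alpha\mu_{1\alpha}+(1-\alpha)\mu_{0\alpha}$|, so the overall mean is the allocation-weighted average of the treated and untreated means.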
. | |$\alpha=0{\cdot}1$| . | . | |$\alpha=0{\cdot}5$| . | . | |$\alpha=0{\cdot}9$| . | ||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Known |$f$| | Bias | ESE | ASE | Bias | ESE | ASE | Bias | ESE | ASE | ||
|$\hat{\mathrm{DE}}^{\text{ipw}}(\alpha)$| | 0|$\cdot$|2 | 9|$\cdot$|7 | 9|$\cdot$|7 | 0 | 4|$\cdot$|7 | 4|$\cdot$|8 | 0|$\cdot$|1 | 9|$\cdot$|3 | 9|$\cdot$|2 | ||
|$\hat{\mathrm{DE}}_1^{\text{haj}}(\alpha)$| | 0|$\cdot$|3 | 9|$\cdot$|6 | 9|$\cdot$|7 | 0 | 4|$\cdot$|6 | 4|$\cdot$|5 | 0|$\cdot$|1 | 8|$\cdot$|4 | 8|$\cdot$|4 | ||
|$\hat{\mathrm{DE}}_2^{\text{haj}}(\alpha)$| | 0|$\cdot$|1 | 7|$\cdot$|0 | 6|$\cdot$|9 | 0 | 3|$\cdot$|9 | 3|$\cdot$|9 | 0 | 5|$\cdot$|4 | 5|$\cdot$|3 | ||
Correct |$f$| | Bias | ESE | ASE | Bias | ESE | ASE | Bias | ESE | ASE | ||
|$\hat{\mathrm{DE}}^{\text{ipw}}(\alpha)$| | 1|$\cdot$|4 | 10|$\cdot$|2 | 9|$\cdot$|8 | 0|$\cdot$|1 | 4|$\cdot$|5 | 4|$\cdot$|3 | 3|$\cdot$|5 | 7|$\cdot$|4 | 7|$\cdot$|2 | ||
|$\hat{\mathrm{DE}}_1^{\text{haj}}(\alpha)$| | 1|$\cdot$|3 | 10|$\cdot$|1 | 9|$\cdot$|8 | 0|$\cdot$|2 | 4|$\cdot$|5 | 4|$\cdot$|5 | 3|$\cdot$|5 | 7|$\cdot$|3 | 7|$\cdot$|5 | ||
|$\hat{\mathrm{DE}}_2^{\text{haj}}(\alpha)$| | 0|$\cdot$|1 | 7|$\cdot$|3 | 7|$\cdot$|1 | 0|$\cdot$|5 | 3|$\cdot$|8 | 3|$\cdot$|6 | 0|$\cdot$|9 | 5|$\cdot$|2 | 5|$\cdot$|0 | ||
Mis |$f$| | Bias | ESE | ASE | Bias | ESE | ASE | Bias | ESE | ASE | ||
|$\hat{\mathrm{DE}}^{\text{ipw}}(\alpha)$| | 1 | 1e2 | 1e2 | 1 | 5e2 | 3e2 | 1e1 | 3e4 | 2e4 | ||
|$\hat{\mathrm{DE}}_1^{\text{haj}}(\alpha)$| | 1 | 4e1 | 1e2 | 10 | 5e2 | 3e2 | 2e1 | 3e4 | 2e4 | ||
|$\hat{\mathrm{DE}}_2^{\text{haj}}(\alpha)$| | 0|$\cdot$|1 | 7|$\cdot$|4 | 7|$\cdot$|3 | 1|$\cdot$|2 | 6|$\cdot$|8 | 6|$\cdot$|4 | 1 | 9|$\cdot$|7 | 9|$\cdot$|5 |
6. Rotavirus vaccine study in Nicaragua
Rotavirus diarrhoea is a major health problem in Nicaragua (Espinoza et al., 1997). The pentavalent rotavirus vaccine was introduced in 2006. Nicaraguan infants are offered the vaccine at two, four and six months of age as part of the country’s Expanded Program on Immunization. In 2010, a study to assess the impact of the immunization programme was carried out in León, Nicaragua’s second largest city, with an estimated population in 2010 of close to 200 000. The Health and Demographic Surveillance Site-León was employed to obtain a simple random sample of households from 50 out of 208 randomly selected geographical clusters of equal size in León (Becker-Dreps et al., 2013). For simplicity, in the following analysis the cluster sampling used to obtain these data is ignored. There were 530 households in the study, and any child in a selected household under the age of five was eligible to participate. Information was collected about each household, including water source, sanitation system, maternal education level, and the dates of birth of study participants. Each individual in the study was visited fortnightly by a fieldworker for approximately one year. At each visit information about diarrhoea episodes in the past 14 days was recorded. The primary outcome |$Y$| was whether a child had at least one diarrhoea episode during the study.
For each child we assumed their interference set to be other children in the same household. A mixed-effects logistic regression model of the probability of having received all three scheduled doses was fitted conditional on the following baseline covariates: child’s age, categorized as 0–11 months, 12–23 months, or 24–59 months; mother’s education level, categorized as primary education only or at least some secondary education; dirt household floor or not; dry or wet season; household indoor toilet, latrine, or none; indoor municipal water supply or not; and breastfeeding or not. Likelihood ratio tests from the fitted logistic model indicated that the odds of having all three doses of vaccine were higher among children whose mothers were more educated, with |$p=0{\cdot}01$|.
Effect estimates and estimated standard errors are reported in Table 3 for the inverse probability-weighted and the two Hájek estimators for contrast function |$g(x_1,x_0) = x_1-x_0$|. The Hájek 2 estimates are closer to the null value of zero and, as expected, have 15–20% smaller estimated standard errors than the inverse probability-weighted and Hájek 1 estimates. The direct effect estimates indicate the expected difference in the proportions of children who will acquire rotavirus diarrhoea among vaccinated versus unvaccinated children for a fixed level of vaccine coverage |$\alpha$|. The estimated direct effects become closer to the null as |$\alpha$| increases, suggesting that the direct protective effect of vaccination decreases as additional children in the household are vaccinated. The indirect effect estimates approximate the expected difference in the proportions of unvaccinated children who will acquire diarrhoea when vaccine coverage is |$100\alpha$|% versus 10%. The total effect estimates indicate the expected difference in the proportions of vaccinated children who will acquire diarrhoea when vaccine coverage is |$100\alpha$|% compared with unvaccinated children when vaccine coverage is 10%. The overall effect estimates provide simple summary comparisons between any two allocation strategies; for example, according to the Hájek 2 estimates, 5|$\cdot$|1 fewer cases of diarrhoea per 100 individuals per year would be expected if on average 80% of children in a household were vaccinated than if on average only 10% of children were vaccinated.
. | |$\alpha=0{\cdot}2$| . | |$\alpha=0{\cdot}4$| . | |$\alpha=0{\cdot}6$| . | |$\alpha=0{\cdot}8$| . | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Est | SE | Est | SE | Est | SE | Est | SE | |||||
|$\hat{\mathrm{DE}}^{\text{ipw}}(\alpha)$| | |$-0{\cdot}79$| | |$1{\cdot}30$| | |$-0{\cdot}62$| | |$0{\cdot}97$| | |$-0{\cdot}42$| | |$0{\cdot}67$| | |$-0{\cdot}18$| | |$0{\cdot}44$| | ||||
|$\hat{\mathrm{DE}}_1^{\text{haj}}(\alpha)$| | |$-0{\cdot}81$| | |$1{\cdot}31$| | |$-0{\cdot}63$| | |$0{\cdot}98$| | |$-0{\cdot}43$| | |$0{\cdot}68$| | |$-0{\cdot}20$| | |$0{\cdot}45$| | ||||
|$\hat{\mathrm{DE}}_2^{\text{haj}}(\alpha)$| | |$-0{\cdot}52$| | |$1{\cdot}11$| | |$-0{\cdot}44$| | |$0{\cdot}84$| | |$-0{\cdot}32$| | |$0{\cdot}59$| | |$-0{\cdot}14$| | |$0{\cdot}41$| | ||||
|$\hat{\mathrm{IE}}^{\text{ipw}}(\alpha,0{\cdot}1)$| | |$ -0{\cdot}13$| | |$0{\cdot}17$| | |$-0{\cdot}40$| | |$0{\cdot}50$| | |$-0{\cdot}69$| | |$0{\cdot}84$| | |$-0{\cdot}98$| | |$1{\cdot}18$| | ||||
|$\hat{\mathrm{IE}}_1^{\text{haj}}(\alpha,0{\cdot}1)$| | |$-0{\cdot}13$| | |$0{\cdot}17$| | |$-0{\cdot}41$| | |$0{\cdot}50$| | |$-0{\cdot}69$| | |$0{\cdot}84$| | |$ -0{\cdot}99$| | |$1{\cdot}18$| | ||||
|$\hat{\mathrm{IE}}_2^{\text{haj}}(\alpha,0{\cdot}1)$| | |$ -0{\cdot}04$| | |$0{\cdot}16$| | |$-0{\cdot}15$| | |$0{\cdot}45$| | |$-0{\cdot}28$| | |$0{\cdot}73$| | |$-0{\cdot}45$| | |$1{\cdot}01$| | ||||
|$\hat{\mathrm{TE}}^{\text{ipw}}(\alpha,0{\cdot}1)$| | |$-0{\cdot}93$| | |$1{\cdot}46$| | |$-1{\cdot}02$| | |$1{\cdot}45$| | |$ -1{\cdot}11$| | |$1{\cdot}45$| | |$-1{\cdot}17$| | |$ 1{\cdot}45$| | ||||
|$\hat{\mathrm{TE}}_1^{\text{haj}}(\alpha,0{\cdot}1)$| | |$-0{\cdot}94$| | |$1{\cdot}47$| | |$-1{\cdot}04$| | |$1{\cdot}46$| | |$-1{\cdot}12$| | |$1{\cdot}46$| | |$-1{\cdot}18$| | |$1{\cdot}46$| | ||||
|$\hat{\mathrm{TE}}_2^{\text{haj}}(\alpha,0{\cdot}1)$| | |$-0{\cdot}56$| | |$1{\cdot}25$| | |$-0{\cdot}59$| | |$1{\cdot}24$| | |$-0{\cdot}61$| | |$1{\cdot}25$| | |$-0{\cdot}59$| | |$1{\cdot}25$| | ||||
|$\hat{\mathrm{OE}}^{\text{ipw}}(\alpha,0{\cdot}1)$| | |$ -0{\cdot}20$| | |$0{\cdot}28$| | |$-0{\cdot}56$| | |$0{\cdot}73$| | |$-0{\cdot}85$| | |$1{\cdot}05$| | |$-1{\cdot}04$| | |$1{\cdot}25$| | ||||
|$\hat{\mathrm{OE}}_1^{\text{haj}}(\alpha,0{\cdot}1)$| | |$-0{\cdot}20$| | |$0{\cdot}28$| | |$-0{\cdot}57$| | |$0{\cdot}73$| | |$-0{\cdot}86$| | |$1{\cdot}06$| | |$-1{\cdot}05$| | |$1{\cdot}25$| | ||||
|$\hat{\mathrm{OE}}_2^{\text{haj}}(\alpha,0{\cdot}1)$| | |$ -0{\cdot}09$| | |$0{\cdot}24$| | |$-0{\cdot}27$| | |$0{\cdot}63$| | |$-0{\cdot}42$| | |$0{\cdot}91$| | |$-0{\cdot}51 $| | |$1{\cdot}08$| |
Est, point estimate; SE, estimated standard error.
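Under the difference contrast, the verbal definitions in this section imply that the total effect decomposes as |$\mathrm{TE}(\alpha,0{\cdot}1)=\mathrm{DE}(\alpha)+\mathrm{IE}(\alpha,0{\cdot}1)$|, since both sides equal the treated mean at coverage |$\alpha$| minus the untreated mean at coverage 0|$\cdot$|1. The sketch below checks this against the point estimates transcribed from Table 3. It is only an illustrative consistency check: the dictionary layout and the name `decomposition_gaps` are assumptions, and agreement is expected only up to the two-decimal rounding of the table.

```python
# Point estimates transcribed from Table 3; columns are alpha = 0.2, 0.4, 0.6, 0.8.
TABLE3 = {
    "ipw": {
        "DE": [-0.79, -0.62, -0.42, -0.18],
        "IE": [-0.13, -0.40, -0.69, -0.98],
        "TE": [-0.93, -1.02, -1.11, -1.17],
    },
    "haj1": {
        "DE": [-0.81, -0.63, -0.43, -0.20],
        "IE": [-0.13, -0.41, -0.69, -0.99],
        "TE": [-0.94, -1.04, -1.12, -1.18],
    },
    "haj2": {
        "DE": [-0.52, -0.44, -0.32, -0.14],
        "IE": [-0.04, -0.15, -0.28, -0.45],
        "TE": [-0.56, -0.59, -0.61, -0.59],
    },
}

# Three table entries enter each comparison, each rounded to two decimals,
# so the identity can be off by at most 3 * 0.005.
ROUNDING_TOL = 0.015


def decomposition_gaps(estimates):
    """Absolute gap |TE - (DE + IE)| at each tabulated value of alpha."""
    return [
        abs(te - (de + ie))
        for de, ie, te in zip(estimates["DE"], estimates["IE"], estimates["TE"])
    ]


for name, estimates in TABLE3.items():
    gaps = decomposition_gaps(estimates)
    ok = all(gap <= ROUNDING_TOL + 1e-9 for gap in gaps)
    print(name, [round(gap, 2) for gap in gaps], "consistent" if ok else "inconsistent")
```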
7. Discussion
The inverse probability-weighted estimator and two Hájek-type estimators in |$\S$| 3 allow for any form of interference between individuals, with the former being unbiased in a finite-population model with known propensity scores. Assuming partial interference and random sampling of groups from a superpopulation, all three estimators are consistent and asymptotically normal when the propensity scores are known or correctly modelled. Empirical results demonstrate that the second Hájek estimator can have substantially smaller finite-sample variance than the other two estimators. One avenue of future research entails deriving the estimators’ large-sample properties without assuming partial interference. Another future direction might involve developing estimators which are robust with respect to misspecification of the propensity score model. Throughout this work conditional exchangeability is assumed, i.e., treatment is assumed to be independent of potential outcomes conditional on an observable set of covariates. In future work one could investigate relaxing this assumption, perhaps via sensitivity analysis or instrumental variable methods. Finally, the target parameters in this paper utilize the Bernoulli allocation strategy proposed by Tchetgen Tchetgen & VanderWeele (2012). These estimands consider the counterfactual scenario where individuals independently select treatment with equal probability. In scenarios where interference is present, it is unlikely that individual treatment selections would be independent. Therefore further interference-related research might target alternative parameters.
Acknowledgement
The authors were partially supported by the U.S. National Institutes of Health. The fieldwork was supported by the Thrasher Research Fund. The content of this paper is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The authors thank M. Elizabeth Halloran, Joseph Rigdon, an associate editor, and a reviewer for helpful comments.
Appendix
Proof of Proposition 1
That |$\hat{Y}^{\text{ipw}}(\alpha)$| is unbiased can be proved similarly.
Proof of Propositions 2 and 3
To prove Proposition 2, assume that there exist constants |$c_1,c_2,c_3<\infty$| and |$\delta>0$| such that |$-c_1 < Y_{vi}<c_2$|, |$N_v<c_3$|, |$\delta< f(\tilde Z_v\mid \tilde L_v)$| and |$\delta< f(Z_{vi}\mid L_{vi})$| with probability 1. Let |$\hat\theta^0=\{\hat Y^{\text{ipw}}(0,\alpha),\hat Y^{\text{ipw}}(1,\alpha)\}$| and |$\hat\theta^h=\{\hat Y^{\text{haj}}_h(0,\alpha),\hat Y^{\text{haj}}_h(1,\alpha)\}$| (|$h=1,2$|). Let |$G^h(\theta)=\{G^h_0(\theta),G^h_1(\theta)\}^{\mathrm{\scriptscriptstyle T}} $| denote the vector estimating equation |$G_{\alpha}^{h}(\tilde Y_v, \tilde Z_v, \tilde L_v; \theta)$|. Let |$\dot G^h(\theta_0)=\partial G^h(\theta_0)/\partial \theta^{\mathrm{\scriptscriptstyle T}} _0$| and write |$\|v\|^2=v_1^2+ \cdots+v_p^2$| for any vector |$v$| of length |$p$|.
From the boundedness assumptions on |$Y_{vi}$|, |$N_v$| and |$f(\tilde Z_v\mid \tilde L_v)$|, it follows that |$E\|G^{0}(\theta_0)\|^2<\infty$|. Similar results can be established for |$h=1,2$|.
Next, note that |$G^0_{z \alpha}(\tilde Y_v, \tilde Z_v, \tilde L_v; \mu_{z\alpha})$| is a linear function of |$\mu_{z\alpha}$| with slope |$-N_v$|. For |$h=1,2$|, |$G^h_{z \alpha}(\tilde Y_v, \tilde Z_v, \tilde L_v; \mu_{z\alpha})$| is also a linear function of |$\mu_{z\alpha}$| with finite, nonzero slope, because by assumption |$f(\tilde Z_v \mid \tilde L_v)>0$| and there exists at least one |$i$| such that |$Z_{vi}=z$|. Hence, the solution for |$\theta$| to |$\sum_{v=1}^m G^{h}(\theta)=0$| is unique for |$h=0,1,2$|. Therefore, because (i)–(iv) hold, by Theorem 5.4.2 of van der Vaart (1998), |$\hat \theta^h$| converges in probability to |$\theta_0$|. Proposition 2 then follows from Theorem 5.4.1 of van der Vaart (1998) and the delta method.
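The uniqueness step for |$h=0$| can be made explicit. In the display below, |$a_v$| is a hypothetical shorthand, not notation from the paper, for the part of |$G^0_{z \alpha}$| that does not involve |$\mu_{z\alpha}$|; only the stated slope |$-N_v$| is used.

```latex
\[
G^0_{z\alpha}(\tilde Y_v,\tilde Z_v,\tilde L_v;\mu_{z\alpha}) = a_v - N_v\,\mu_{z\alpha}
\quad\Longrightarrow\quad
\sum_{v=1}^m G^0_{z\alpha} = \sum_{v=1}^m a_v - \mu_{z\alpha}\sum_{v=1}^m N_v = 0
\iff
\mu_{z\alpha} = \frac{\sum_{v=1}^m a_v}{\sum_{v=1}^m N_v}.
\]
```

Because |$\sum_{v=1}^m N_v>0$|, this root exists and is unique. The same argument applies for |$h=1,2$| with the corresponding nonzero slope in place of |$-N_v$|.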
Similar reasoning can be used to prove Proposition 3 under the following additional assumptions about the parametric propensity score model: |$\gamma_0$| is in an open subset of Euclidean space; |$E\{\dot G(\gamma_0)\}$| exists and is nonsingular, where |$\dot G(\gamma_0) = \partial G(\tilde Z_v,\tilde L_v;\gamma_0)/\partial \gamma_0^{\mathrm{\scriptscriptstyle T}} $|; |$\:G(\tilde z_v,\tilde l_v;\gamma)$| is twice continuously differentiable with respect to |$\gamma$| and |$|\partial^2 G(\tilde z_v,\tilde l_v;\gamma)/(\partial \gamma_i \partial \gamma_j)|\leq \psi $| for some integrable measurable function |$\psi$| for every |$(\tilde z_v,\tilde l_v)$|; and |$E\|G(\tilde Z_v,\tilde L_v;\gamma_0)\|^2<\infty$|.
Proof of reduction in variance with a correctly specified propensity score model
Since |$V_{\gamma} = E\{G(\tilde Z_v,\tilde L_v;\gamma_0)^{\otimes 2}\}$| is positive semidefinite, so is |$V_{\gamma}^{-1}$|. Therefore |$\Sigma_0^{\rm D} \geq \Sigma_0^{\ast {\rm D}}$|. The same approach can be used to show that |$\Sigma_h^{\rm D} \geq \Sigma_h^{\ast {\rm D}}$| for |$h=1,2$|.
References