-
PDF
- Split View
-
Views
-
Cite
Cite
Sara Sauer, Bethany Hedt-Gauthier, Claudia Rivera-Rodriguez, Sebastien Haneuse, Small-Sample Inference for Cluster-Based Outcome-Dependent Sampling Schemes in Resource-Limited Settings: Investigating Low Birthweight in Rwanda, Biometrics, Volume 78, Issue 2, June 2022, Pages 701–715, https://doi.org/10.1111/biom.13423
- Share Icon Share
Abstract
The neonatal mortality rate in Rwanda remains above the United Nations Sustainable Development Goal 3 target of 12 deaths per 1000 live births. As part of a larger effort to reduce preventable neonatal deaths in the country, we conducted a study to examine risk factors for low birthweight. The data were collected via a cost-efficient cluster-based outcome-dependent sampling (ODS) scheme wherein clusters of individuals (health centers) were selected on the basis of, in part, the outcome rate of the individuals. For a given data set collected via a cluster-based ODS scheme, estimation for a marginal model may proceed via inverse-probability-weighted generalized estimating equations, where the cluster-specific weights are the inverse probability of the health center's inclusion in the sample. In this paper, we provide a detailed treatment of the asymptotic properties of this estimator, together with an explicit expression for the asymptotic variance and a corresponding estimator. Furthermore, motivated by the study we conducted in Rwanda, we propose a number of small-sample bias corrections to both the point estimates and the standard error estimates. Through simulation, we show that applying these corrections when the number of clusters is small generally reduces the bias in the point estimates, and results in closer to nominal coverage. The proposed methods are applied to data from 18 health centers and 1 district hospital in Rwanda.
1 Introduction
Neonatal mortality is defined as the probability of dying within the first 28 days of life. While the majority of neonatal deaths are preventable, due to slower progress in neonatal mortality reduction since 1990 compared to the advances made in child mortality reduction, neonatal deaths represented 47% of all under-five deaths in 2018 (WHO, 2019). The United Nations Sustainable Development Goal 3 aims to reduce neonatal mortality to 12 deaths per 1000 live births by 2030 (Hug et al., 2019). In Rwanda, however, the rate of neonatal deaths remains high at 20 deaths per 1,000 live deaths (Rwanda DHS, 2015).
Low birthweight is associated with a higher risk of neonatal death (Lubchenco et al., 1972), strategies that target reduction of low birthweight births may therefore help to reduce neonatal mortality. This motivated a study we recently conducted in Rwanda, with the goal to investigate risk factors for low birthweight in two northern districts of the country. Due to time and resource constraints, it was feasible to visit only 18 of the 44 health centers for collection of the individual-level data. The 18 health centers were sampled based on an outcome-dependent sampling (ODS) design, and in the summer of 2017, the paper's first author traveled to the sampled health centers and collected data on all live births recorded in the maternity registers between April and June 2017.
ODS is an indispensable tool for conducting research in resource-limited settings. Examples of ODS designs include the case-control, nested case-control, case-cohort, and two-phase designs (Prentice and Pyke, 1979; Breslow and Day, 1980; White, 1982; Langholz and Thomas, 1990; Wacholder, 1991). For the most part, the literature on ODS has focused on the setting where individuals are treated as independent, although methods have been recently proposed for longitudinal and cluster-correlated data settings (Neuhaus et al., 2002, 2006, 2014; Schildcrout and Rathouz, 2010; Schildcrout et al., 2013; Haneuse and Rivera-Rodriguez, 2018; Rivera-Rodriguez et al., 2019). However, in both the independent and correlated data settings, the majority of designs proposed involve sampling at the level of the individual (i.e., the study units within a cluster). In some settings, researchers may opt to use readily-available outcome information (either aggregated or at the level of the individual) to perform cluster-based sampling; that is, to sample clusters rather than individuals (Cai et al., 2001). Such sampling may be preferred when, for example, the costs associated with travel to a clinic are high compared to the cost of collecting information on individuals once at the clinic.
For data collected through a cluster-based ODS design, Cai et al. (2001) proposed to carry out estimation for a marginally specified regression model via inverse-probability-weighted generalized estimating equations (IPW-GEE), with the weights taken to be the inverse of the cluster-specific probabilities of being selected by the scheme. Furthermore, they proposed that inference be based on a sandwich estimator for the asymptotic variance. They did not, however, formally establish the asymptotic properties of the estimator for the regression coefficients nor did they provide explicit expressions for the variance that acknowledge the inherent negative correlation among the cluster-specific sampling indicators. In this paper, we resolve these gaps by using results by Xie and Yang (2003) for GEE in the complete data setting to establish the asymptotic properties of the IPW-GEE estimator for cluster-based ODS designs. Through this, we derive an expression for the asymptotic variance and propose a corresponding consistent estimator.
In contrast to Cai et al. (2001) where the focus was on settings with a large number of small clusters (specifically pairs of eyes within the Baltimore Eye Study), our study of low birthweight risk factors in Rwanda consists of a small number of relatively large clusters. That the number of clusters is small may be of concern, specifically in regard to small-sample bias in point estimates (Paul and Zhang, 2014) and undercoverage of confidence intervals based on the usual sandwich estimator (Fay and Graubard, 2001; Kauermann and Carroll, 2001; Mancl and DeRouen, 2001; Pan and Wall, 2002; Morel et al., 2003). To the best of our knowledge, however, no attempts have been made to investigate the extent to which these issues manifest in the cluster-based ODS designs that are the focus of this paper. Furthermore, while small-sample corrections have been proposed in the complete data setting, these have not been adapted to the ODS setting. Therefore, a final contribution of this paper is that we provide expressions for a bias correction to the point estimates, as well as several bias corrections for the variance estimator.
2 Risk Factors for Low Birthweight in Rwanda
The motivating study for this paper is one that we recently conducted on risk factors for low birthweight (<2500g) among facility-based births in two districts in northern Rwanda (Gakenke and Rulindo) between April and June in 2017. Within these districts, pregnant women may receive care at one of K = 44 health centers. While patient records, that include information on the mother, the pregnancy and the infant, are maintained locally at the health centers, aggregated information on a range of health indicators are tallied on a monthly basis by each of the health centers and entered into the Rwanda Health Management Information System (HMIS), a centralized database maintained by the Rwandan Ministry of Health. Individual-level data, however, is not readily-available through the HMIS and must, therefore, be obtained by traveling to a given health center and abstracting the relevant information.
2.1 Sampling Design and Data Collection
At the design phase of this study, for a number of logistical and financial reasons, an early decision was made that individual-level patient data would only be collected from 18 health centers. Toward selecting which health centers to include, health center-specific information on the total number of births (i.e., ,
) as well as the (unadjusted) prevalence of low birthweight (i.e.,
) during the 3-month study period was made available by HMIS. Figure 1 provides a scatterplot; across the 44 health centers
ranged from 24 to 194 while
ranged from 0 to 0.13.

Scatterplot of health center-specific number of live births and prevalence of low birthweight births between April and June 2017, at each of K = 44 health centers in two districts in northern Rwanda. Relative sizes of the circles are proportional to the probability of selection into the outcome-dependent scheme, . Black shading indicates which of the 44 health centers were ultimately selected
The information available on and
was then used to form the following selection strategy: (i) the six health centers with the highest prevalence were sampled with probability
; (ii) the remaining 12 centers were sampled via Poisson sampling on the basis of selection probabilities determined by the model:
, where
is the maximum of clinic
s standardized rank with respect to outcome prevalence and its rank with respect to size (in this case, 44 is the highest prevalence/largest size, 1 is the lowest prevalence/smallest size). The value of θ0 was set to 15/38 to partially control the number of clusters sampled, and the value of θ1 was set to 0.1.
Note, the relative sizes of the circles in Figure 1 are proportional to the value of , while the black shading indicates which of the 44 health centers were ultimately selected. Intuitively, this strategy was adopted to balance inflating the prevalence of low birthweight in the sample (i.e., to artificially increase the prevalence of the outcome, a key feature of ODS schemes) with maximizing the total sample size in the subsample used in the analysis. The particular design we used for sampling the health centers was selected based on the results of a simulation study conducted beforehand to compare the efficiency gains of various designs that seek to achieve these two objectives.
After visiting several health centers, we observed that the high-risk deliveries are referred to the district hospitals. In order to have representation of these births in the sample, yet with the constraint of only being able to visit one more location for data collection, we randomly sampled one of the four district hospitals for data collection. Following data abstraction, patient-level information was available on 1635 live facility-based births. The data were restricted to singleton births with complete data on the variables of interest, yielding 1572 observations. Table 1 describes the maternal and newborn characteristics of these births. The low birthweight prevalence is higher among mothers younger than 20 years of age, among mothers who weigh less than 56 kg, and among women in their first pregnancy. The low birthweight prevalence is also higher among mothers with a history of abortion, c-section, or stillbirth. The low birthweight prevalence is slightly higher among female newborns, and the majority of preterm births are also low birthweight births.
. | LBW . | Not LBW . | Total . | % LBW . |
---|---|---|---|---|
Mother's age | ||||
<20 | 18 | 135 | 153 | 11.8 |
20–35 | 74 | 1060 | 1134 | 6.5 |
36–49 | 17 | 268 | 285 | 6.0 |
Mother's weight | ||||
<56 kg | 44 | 311 | 355 | 12.4 |
56–59 kg | 33 | 344 | 377 | 8.8 |
60–64 kg | 18 | 427 | 445 | 4.0 |
≥65 kg | 14 | 381 | 395 | 3.5 |
Birth order | ||||
1 | 55 | 382 | 437 | 12.6 |
2–3 | 39 | 653 | 692 | 5.6 |
4+ | 15 | 428 | 443 | 3.4 |
HIV status at admission | ||||
Positive | 1 | 24 | 25 | 4.0 |
Negative | 107 | 1433 | 1540 | 6.9 |
Unknown | 1 | 6 | 7 | 14.3 |
Previous abortion | ||||
Yes | 10 | 98 | 108 | 9.3 |
No | 99 | 1365 | 1464 | 6.8 |
Previous C-section | ||||
Yes | 14 | 20 | 34 | 41.2 |
No | 95 | 1443 | 1538 | 6.2 |
Previous stillbirth | ||||
Yes | 7 | 73 | 80 | 8.8 |
No | 102 | 1390 | 1492 | 6.8 |
District | ||||
Gakenke District | 73 | 719 | 792 | 9.2 |
Rulindo District | 36 | 744 | 780 | 4.6 |
Sex of newborn | ||||
Female | 62 | 745 | 807 | 7.7 |
Male | 47 | 718 | 765 | 6.1 |
Preterm birth | ||||
Preterm | 25 | 3 | 28 | 89.3 |
Not Preterm | 84 | 1460 | 1544 | 5.4 |
Total | 109 | 1463 | 1572 | 6.9 |
. | LBW . | Not LBW . | Total . | % LBW . |
---|---|---|---|---|
Mother's age | ||||
<20 | 18 | 135 | 153 | 11.8 |
20–35 | 74 | 1060 | 1134 | 6.5 |
36–49 | 17 | 268 | 285 | 6.0 |
Mother's weight | ||||
<56 kg | 44 | 311 | 355 | 12.4 |
56–59 kg | 33 | 344 | 377 | 8.8 |
60–64 kg | 18 | 427 | 445 | 4.0 |
≥65 kg | 14 | 381 | 395 | 3.5 |
Birth order | ||||
1 | 55 | 382 | 437 | 12.6 |
2–3 | 39 | 653 | 692 | 5.6 |
4+ | 15 | 428 | 443 | 3.4 |
HIV status at admission | ||||
Positive | 1 | 24 | 25 | 4.0 |
Negative | 107 | 1433 | 1540 | 6.9 |
Unknown | 1 | 6 | 7 | 14.3 |
Previous abortion | ||||
Yes | 10 | 98 | 108 | 9.3 |
No | 99 | 1365 | 1464 | 6.8 |
Previous C-section | ||||
Yes | 14 | 20 | 34 | 41.2 |
No | 95 | 1443 | 1538 | 6.2 |
Previous stillbirth | ||||
Yes | 7 | 73 | 80 | 8.8 |
No | 102 | 1390 | 1492 | 6.8 |
District | ||||
Gakenke District | 73 | 719 | 792 | 9.2 |
Rulindo District | 36 | 744 | 780 | 4.6 |
Sex of newborn | ||||
Female | 62 | 745 | 807 | 7.7 |
Male | 47 | 718 | 765 | 6.1 |
Preterm birth | ||||
Preterm | 25 | 3 | 28 | 89.3 |
Not Preterm | 84 | 1460 | 1544 | 5.4 |
Total | 109 | 1463 | 1572 | 6.9 |
. | LBW . | Not LBW . | Total . | % LBW . |
---|---|---|---|---|
Mother's age | ||||
<20 | 18 | 135 | 153 | 11.8 |
20–35 | 74 | 1060 | 1134 | 6.5 |
36–49 | 17 | 268 | 285 | 6.0 |
Mother's weight | ||||
<56 kg | 44 | 311 | 355 | 12.4 |
56–59 kg | 33 | 344 | 377 | 8.8 |
60–64 kg | 18 | 427 | 445 | 4.0 |
≥65 kg | 14 | 381 | 395 | 3.5 |
Birth order | ||||
1 | 55 | 382 | 437 | 12.6 |
2–3 | 39 | 653 | 692 | 5.6 |
4+ | 15 | 428 | 443 | 3.4 |
HIV status at admission | ||||
Positive | 1 | 24 | 25 | 4.0 |
Negative | 107 | 1433 | 1540 | 6.9 |
Unknown | 1 | 6 | 7 | 14.3 |
Previous abortion | ||||
Yes | 10 | 98 | 108 | 9.3 |
No | 99 | 1365 | 1464 | 6.8 |
Previous C-section | ||||
Yes | 14 | 20 | 34 | 41.2 |
No | 95 | 1443 | 1538 | 6.2 |
Previous stillbirth | ||||
Yes | 7 | 73 | 80 | 8.8 |
No | 102 | 1390 | 1492 | 6.8 |
District | ||||
Gakenke District | 73 | 719 | 792 | 9.2 |
Rulindo District | 36 | 744 | 780 | 4.6 |
Sex of newborn | ||||
Female | 62 | 745 | 807 | 7.7 |
Male | 47 | 718 | 765 | 6.1 |
Preterm birth | ||||
Preterm | 25 | 3 | 28 | 89.3 |
Not Preterm | 84 | 1460 | 1544 | 5.4 |
Total | 109 | 1463 | 1572 | 6.9 |
. | LBW . | Not LBW . | Total . | % LBW . |
---|---|---|---|---|
Mother's age | ||||
<20 | 18 | 135 | 153 | 11.8 |
20–35 | 74 | 1060 | 1134 | 6.5 |
36–49 | 17 | 268 | 285 | 6.0 |
Mother's weight | ||||
<56 kg | 44 | 311 | 355 | 12.4 |
56–59 kg | 33 | 344 | 377 | 8.8 |
60–64 kg | 18 | 427 | 445 | 4.0 |
≥65 kg | 14 | 381 | 395 | 3.5 |
Birth order | ||||
1 | 55 | 382 | 437 | 12.6 |
2–3 | 39 | 653 | 692 | 5.6 |
4+ | 15 | 428 | 443 | 3.4 |
HIV status at admission | ||||
Positive | 1 | 24 | 25 | 4.0 |
Negative | 107 | 1433 | 1540 | 6.9 |
Unknown | 1 | 6 | 7 | 14.3 |
Previous abortion | ||||
Yes | 10 | 98 | 108 | 9.3 |
No | 99 | 1365 | 1464 | 6.8 |
Previous C-section | ||||
Yes | 14 | 20 | 34 | 41.2 |
No | 95 | 1443 | 1538 | 6.2 |
Previous stillbirth | ||||
Yes | 7 | 73 | 80 | 8.8 |
No | 102 | 1390 | 1492 | 6.8 |
District | ||||
Gakenke District | 73 | 719 | 792 | 9.2 |
Rulindo District | 36 | 744 | 780 | 4.6 |
Sex of newborn | ||||
Female | 62 | 745 | 807 | 7.7 |
Male | 47 | 718 | 765 | 6.1 |
Preterm birth | ||||
Preterm | 25 | 3 | 28 | 89.3 |
Not Preterm | 84 | 1460 | 1544 | 5.4 |
Total | 109 | 1463 | 1572 | 6.9 |
3 Analysis Based on Complete Data
3.1 Marginal Model Specification
Consider the setting in which the scientific question of interest concerns learning about the relationship between some outcome Y and p-vector of covariates, (which may include a 1.0 for the intercept). Furthermore, suppose the population of interest is naturally clustered, such as is the case where patients are clustered within health centers. Let K denote the number of clusters and
the number of individuals in the kth cluster. In this paper, we assume that the scientific question at hand corresponds to an analysis where estimation and inference will be performed with respect to the following marginal mean model for the outcome of the ith individual in the kth cluster as a function of their covariates,
:

where is a user-chosen link function and
a p-vector of regression parameters.
3.2 Complete Data Analysis
Given complete data on () for all
individuals in the K clusters, Liang and Zeger (1986) proposed that estimation of
can be carried out by solving the following generalized estimating equations:

where =
, with
=
and
=
, and with
=
denoting the
matrix of partial derivatives. Finally,
, indexed by the unknown
, is an
working specification for
. As will become clear below, it will be useful to write
=
where
=
is an
matrix, where
=
,
=
,
is an
block-diagonal matrix, with the
on the diagonal, and
is the
matrix obtained by stacking the K
matrices.
4 Cluster-Based ODS
In some settings, analysts may not have access to complete data on all elements of () for all N individuals in the K clusters. They may, however, have access to resources that permit ascertainment of this information in a subsample of, say,
individuals. Furthermore, they may have access to select components of (
), as well as other variables/information that are not of direct relevance to the scientific question, denoted here by
, that can, in principle, be used to make decisions regarding the subsampling. Moving forward we refer to this information as being available at the design stage. How researchers choose to make use of the information available at the design stage will depend, in part, on the precise nature of the information as well as on practical/logistical/financial considerations regarding how the otherwise unavailable data elements will be ascertained.
Motivated by the study we conducted in Rwanda, suppose the readily-available information is of the form where
is a cluster-level summary of the outcomes (i.e., across the
individuals in the kth cluster),
is a cluster-level feature or a cluster-level summary of elements of
that are readily-available at the outset, and
is a cluster-level summary of
. For example, if Y is binary, as is low birthweight in the study reported on in Section 2, then
may be the proportion of babies born at the health center in the last 6 months with low birthweight. Furthermore,
may be a feature of the cluster, such as whether the health center is in a rural or urban setting, and/or it may be an aggregated summary of individual-level data, such as the percentage of mothers who are less than 18 years of age. Finally,
may be the prevalence of some other outcome or comorbidity that is routinely collected. Note, this data scenario is common in resource-limited settings where detailed information that is recorded and stored locally at health centers is only reported to a centralized agency/ministry after having been aggregated or otherwise summarized (Maokola et al., 2011; Nisingizwe et al., 2014; Haneuse et al., 2015).
The information represented by can, in principle, be used to inform a cluster-based ODS design (Cai et al., 2001). For this type of design, rather than selecting individuals directly, some subsample of
clusters is initially selected. Then, the otherwise unavailable elements of
are ascertained on all individuals within the sampled clusters. For the contexts we consider, and presented in Section 2, this would correspond to selecting a certain number of health centers, and then having a member of the research team travel to the health center to extract from the local records all relevant information on patients satisfying the study inclusion/exclusion criteria.
Let be a binary indicator of whether the kth cluster is selected and
the corresponding (known) probability of being selected, where
is the totality of the information available at the design stage. Toward operationalizing the sampling scheme, researchers may opt to use stratified sampling or Poisson sampling. For the first of these, the clusters are cross-classified on the basis of some variable S that is defined on the basis of one or more of the variables contained in
and is assumed to take on one of J levels. From this, suppose
clusters are classified as belonging to the jth stratum. We assume that the stratification scheme is specified such that
for all
. Then
clusters are randomly selected from those in the jth stratum, such that
=
. Note, for each of the clusters in the jth stratum, we have that
=
. For those that are selected in this way we set
= 1. Under the second option of Poisson sampling, one first prespecifies each of the
as a function of elements of
. For example, one could specify a logistic regression model for
as a function of the clusters' outcome prevalence. Whether a cluster is selected is then determined by an independent Bernoulli trial with probability
.
We conclude this section with a number of comments. First, we note that a key difference between the two approaches to selecting the subsample of clusters is that is predetermined under stratified sampling (and, therefore, fixed), but is random under Poisson sampling. Second, as we elaborate upon below, when the subsampling is based on stratification the
indicators for clusters within a given stratum are negatively correlated (although they are independent across strata). This, in turn, has implications for the variance of the estimator of
. Under Poisson sampling, however, since the trials are taken to be independent, the
are also independent (i.e., both within and between strata). Finally, the framework described above is sufficiently general that
need not necessarily be used to inform the stratification scheme or the prespecified model for
. Moving forward we assume, however, that
will be used at the design stage.
5 Analysis Based on Data from a Cluster-Based ODS Design
When the sampling of the clusters is based, in part, on information on the outcome, the usual generalized estimating equations given by expression (2) are no longer guaranteed to be unbiased (Qaqish et al., 1997). To resolve this, Cai et al. (2001) proposed that be estimated as the solution to the following weighted generalized estimating equations:

Note, following the development of Section 3.2, one can write , where
is an N× N diagonal matrix with diagonal entries equal to the vector
, with
a vector of length
with each element equal to
. Letting
denote the
× 1 vector with all entries equal to
,
is the
vector obtained by concatenating the
.
Let denote the solution to (3). In the remainder of this section, we extend Cai et al. (2001) by: (i) formally establishing the asymptotic properties of
; (ii) proposing an explicit formula for its asymptotic variance; and, (iii) proposing a suite of methods to correct for small-sample (i.e., when
is “small”) bias in the point and standard error estimates.
5.1 Asymptotic Properties
To establish the asymptotic properties of , we consider the setting where
while max
is bounded above, and assume missingness at random (MAR), in other words that
, where
are the individual-level covariate values for which only a summary is available at the design stage.
Detailed arguments in Web Appendix B, which build on those by Xie and Yang (2003), show that is consistent for
, the true value of
, and that

where =
,
=
,
=
, and
. Furthermore, from this we have that the asymptotic variance of
is

Finally, can be reexpressed as
, with
, where
represents the variance of the complete data estimating equations, and
the additional variance that arises due to having only complete data on the individuals in the sampled clusters.
5.2 Inference
Inference can be carried out in practice through use of the consistent plug-in estimator for the asymptotic variance of :

where , and
, and
.
In the expression for ,
is an
block-diagonal matrix where the entries of the kth
block are all equal to
, the inverse of the pairwise selection probability for any pair of individuals i and
in cluster k:
. In the expression for
,
is an
matrix with all entries in the kth block along the diagonal equal to
and all entries in the off-diagonal block corresponding to clusters k and
equal to
. Given data from a cluster-stratified design, if clusters k and
are both in stratum j, then
=
, while if they belong to different strata j and
, it follows that
=
. Under a Poisson sampling design,
=
for individuals from different clusters. In this case,
is a block-diagonal matrix, where the kth block is an
matrix with all values equal to
.
In addition, we propose a second estimator of the variance motivated (in part) by the form proposed by Cai et al. (2001). Toward this, let =
), where
is an
-vector with elements
. We propose that the asymptotic variance of
be estimated by:

where , with
=
. After some algebra, it can be shown that
, with

where is an
matrix with all entries
. Thus, expression (6) differs from expression (5) in that the former ignores the impact of negative correlation between the selection indicators. Note that under Poisson sampling,
.
5.3 Estimation and Inference in Small Samples
As in the complete data setting, estimation and inference for based on data from an outcome-dependent cluster-based design may be subject to small-sample bias. Indeed, it may be the norm for studies based on such a design to have a small number of clusters in the data set available for analysis. Motivated by this, we propose bias corrections to both the point estimates,
, and to
. For the former, the bias in the point estimates can be expanded to give
and, building on the work of Paul and Zhang (2014) and Lunardon and Scharfstein (2017), we derive the form of
, which can be estimated by
. Since the resulting expressions are quite involved, we leave the details to Web Appendix D. The bias-corrected point estimate is then
.
We consider four corrections to for small-sample settings, adapted from corrections proposed in the complete data setting (see Web Appendices E–G for details). The first is based on a simple “degrees-of-freedom” adjustment, to give
. The second follows the approach taken by Mancl and DeRouen (2001), and is given by

where and

where and
. The third is a Kauermann and Carroll (2001)-type correction,
, which is the same as (7), except with
. The fourth correction adapts the approach taken by Fay and Graubard (2001), and is given by

where and

where is a
diagonal matrix with the
th element equal to
, and
.
Finally, we make two additional comments. First, if the data arise from a cluster-stratified design then one could operationalize these corrections for the estimator of the variance that ignores the negative correlation in the sampling indicators (i.e., for given in Section 5.2), by dropping the
term in the expressions presented. Second, while the bias-corrected variance estimators
,
, and
are proposed for
, the same form could be adopted in practice using the bias-corrected point estimates
instead.
6 Simulation Study
To evaluate the operating characteristics of the methods proposed in Section 5, we performed two sets of simulation studies. The first set was designed using characteristics of the birth data set from Rwanda. The second was designed to evaluate the operating characteristics of the methods in a more general setting.
6.1 Data Generation
For the first simulation study, we suppose that interest lies in the following marginal logistic regression model for a binary response for the ith individual in the kth cluster:

where is a binary individual-level variable with a cluster-specific prevalence,
,
is a continuous cluster-level variable drawn from a Uniform[−2,2] distribution, and
= (−1.6, 0.5, 0.3). The value of β0 was chosen so that the prevalence of the outcome is around 21%. Complete data sets with K = 44 clusters were generated with the same size distribution as the 44 health centers in the Rwanda study.
For the second set of simulations, we suppose that interest lies in the model:

with a binary individual-level variable with the same distribution as in Simulation 1,
a continuous cluster-specific variable drawn from a N(0.2, 0.052) distribution, and
a binary cluster-level variable with prevalence 0.20. Finally,
is an individual-level variable generated from a Normal distribution with cluster-specific mean
and a variance of 1.0. We set
= (−3.1, 0.5, 0.5, 0.5, 1), where β0 was chosen so that the overall prevalence was about 13%. Complete data sets were generated with a total of K = 200 clusters with equal cluster sizes of
= 40.
In all simulations, the true correlation structure was exchangeable; we induced correlation among the observations using the GenBinaryY() function in the MMLB package for R, which implements a method for marginally specified logistic-Normal models (Heagerty, 1999). Briefly, the latter permits analysts to specify a marginal mean for the response while inducing within-cluster correlation via cluster-specific random intercepts that are taken to arise from a distribution. For Simulation 1, we varied
, and for Simulation 2, we set
. The correlation parameter, α, was estimated for each of these scenarios by fitting the model of interest to the generated complete data sets using a working exchangeable correlation structure; this resulted in an estimated correlated parameter of
for Simulation 1 and
for Simulation 2.
6.2 Sampling Designs
For each of the simulation scenarios we generated 10,000 “complete” data sets. For each of these, we considered three cluster-based designs for selecting clusters: (#1) simple random sampling of the clusters; (#2) an outcome-dependent cluster-stratified design, where clusters were stratified on
, the qth quantile of the distribution of the outcome prevalence across the clusters (with
in Simulation 1 and
in Simulation 2); and, (#3) an outcome-dependent cluster-stratified design where in Simulation 1 clusters were stratified on the basis of
and
, where
= I(
), while in Simulation 2 clusters were stratified on the basis of
and X4. Both designs (#2) and (#3) were “balanced” with respect to the stratification; an equal number of clusters was sampled from each stratum, with the exception of the cases in which
could not be evenly divided by the number of strata. In these cases, more clusters were sampled from the strata with higher outcome prevalence (
= 1). In Simulation 1, we considered designs with
, and in Simulation 2 we considered
.
6.3 Analyses
For each data set generated through the processes described in Sections 6.1 and 6.2, we computed three estimates of :
, based on (naïvely) solving the unweighted estimating equations given in Section 3.2 for the
sampled clusters;
, based on solving the weighted estimating equations given at the start of Section 5; and, the bias-corrected estimator,
, defined in Section 5.3. Note, under simple random sampling of the clusters,
and
will be numerically the same. Furthermore, we note that the working independence correlation structure was adopted in the specification of
when computing each of these estimators. The reason for this is that if
includes at least one covariate that varies across units within a cluster (as will almost always be the case), the complete data estimating equations, given by expression (2), are not guaranteed to be unbiased unless working independence is adopted (Pepe and Anderson, 1994). We ran one scenario (Simulation 1,
) with working exchangeable as well, in order to investigate the degree to which using a working correlation structure other than working independence affects the results.
To evaluate the proposed approaches to inference we focus attention on as a point estimate for
, as it generally exhibits little-to-no bias even when
is as small as 12 or 15. We then calculated six estimates of
, and thus the standard errors, using the methods described in Sections 5.2 and 5.3. The first three were:
, the unadjusted estimator; the degrees-of-freedom adjusted estimator,
; and the Mancl and DeRouen-type estimator,
. Another three were as these but ignoring the negative correlation in the sampling indicators:
,
, and
. In Simulation 1, we also computed the Kauermann and Carroll-type estimator,
/
and the Fay and Graubard-type estimator,
/
, as well as all of the above variance estimates using the uncorrected point estimates,
, in place of
. The final approach toward carrying out inference that we considered was a smoothed bootstrap approach adapted from the work of Li and Wang (2008), with details given in Web Appendix H. Using each of the standard error estimates, we constructed 95% confidence intervals using both the normal distribution and the
distribution, estimating coverage as the proportion of iterations in which the constructed confidence intervals contained the true parameter values.
6.4 Results: Point Estimation
Figures 2 and 3 summarize the mean absolute bias in the point estimates (across the 10,000 such estimates) for each of the three sampling designs, under the scenarios considered in Simulation 1 and Simulation 2. Based on these results, we make the following observations:

The absolute bias in the mean of the unweighted point estimates , weighted uncorrected point estimates
, and the bias-corrected weighted point estimates
in Simulation 1, as a function of
, under designs #1, #2, and #3. The degree of correlation was determined by
. The true parameter values are given by
) = (–1.6, 0.5, 0.3). Under design #1,
and
are equivalent

The absolute bias in the mean of the unweighted point estimates , weighted uncorrected point estimates
, and the bias-corrected weighted point estimates
in Simulation 2, as a function of
, under designs #1, #2, and #3. The degree of correlation was determined by
. The true parameter values are given by
) = (−3.1, 0.5, 0.5, 0.5, 1). Under design #1,
and
are equivalent
First, in general, exhibits bias except, as is to be expected, under simple random sampling when
is relatively large. Second, for the two outcome-dependent schemes, the weighted estimator
generally exhibits less bias than
. While there appear to be some exceptions to this (see Figure 2, and Figures 3f, h, and i), the extent of bias is very small and it is unclear if any meaning can be attributed to the relative ordering.
Third, the bias in is generally lower than the bias in
; the additional use of the bias-correction method proposed in Section 5.3 generally reduces bias, albeit to a small degree. An exception to this arises in Figure 2(b), although the difference is very small. Additional exceptions are in Figures 3(j), (k), (m), and (o), each of which correspond to one of the cluster-level covariates, X3 and X4. This may speak to a fundamental difficulty of estimating associations for cluster-level covariates when
is small, one that may only be resolved through the collection of additional data (rather than through analytic means). From Figure 4, we also see that the extent of bias may depend on the prevalence of a cluster-level covariate that is used to define S in a cluster-stratified design.

The absolute bias in the mean of the unweighted point estimate , weighted uncorrected point estimate
, and the bias-corrected weighted point estimate
in Simulation 2 as a function of the prevalence of
, with the degree of correlation determined by
.
Finally, the bias-corrected weighted point estimates, , generally exhibit little-to-no bias across all parameters, even when
is a low as 12 or 15.
6.5 Results: Coverage
Tables 2 and 3 report select results regarding coverage of Wald-based 95% confidence intervals in Simulations 1 and 2, respectively. Comprehensive results, specifically for coverage under a broader range of values for and
, as well as coverage using the t distribution, use of the uncorrected point estimates in standard error estimation, and use of working exchangeable correlation matrix, are given in Web Appendix J.
Estimated coverage of Wald-based 95% confidence intervals in Simulation 1 using methods proposed in Sections 5.2 and 5.3, using the bias-corrected weighted estimator, , as the point estimate
. | ![]() | . | . | . | . | . |
---|---|---|---|---|---|---|
Parameter . | Design . | ![]() | ![]() | ![]() | ![]() | ![]() |
β0 | 12/ ![]() | 0.89 | 0.93 | 0.94 | 0.91 | 0.90 |
![]() | 0.89 | 0.92 | 0.95 | 0.92 | 0.91 | |
![]() | 0.85 | 0.89 | 0.93 | 0.89 | 0.88 | |
24/ ![]() | 0.92 | 0.94 | 0.95 | 0.93 | 0.93 | |
![]() | 0.92 | 0.94 | 0.95 | 0.94 | 0.93 | |
![]() | 0.91 | 0.92 | 0.94 | 0.92 | 0.92 | |
β1 | 12/ ![]() | 0.90 | 0.94 | 0.94 | 0.92 | 0.92 |
![]() | 0.89 | 0.93 | 0.94 | 0.92 | 0.92 | |
![]() | 0.87 | 0.91 | 0.93 | 0.90 | 0.90 | |
24/ ![]() | 0.92 | 0.94 | 0.95 | 0.94 | 0.94 | |
![]() | 0.92 | 0.94 | 0.94 | 0.93 | 0.93 | |
![]() | 0.92 | 0.94 | 0.95 | 0.93 | 0.93 | |
β2 | 12/ ![]() | 0.87 | 0.91 | 0.94 | 0.91 | 0.90 |
![]() | 0.86 | 0.90 | 0.93 | 0.90 | 0.88 | |
![]() | 0.85 | 0.89 | 0.94 | 0.90 | 0.88 | |
24/ ![]() | 0.91 | 0.93 | 0.94 | 0.93 | 0.92 | |
![]() | 0.91 | 0.93 | 0.95 | 0.93 | 0.92 | |
![]() | 0.91 | 0.93 | 0.95 | 0.93 | 0.92 |
. | ![]() | . | . | . | . | . |
---|---|---|---|---|---|---|
Parameter . | Design . | ![]() | ![]() | ![]() | ![]() | ![]() |
β0 | 12/ ![]() | 0.89 | 0.93 | 0.94 | 0.91 | 0.90 |
![]() | 0.89 | 0.92 | 0.95 | 0.92 | 0.91 | |
![]() | 0.85 | 0.89 | 0.93 | 0.89 | 0.88 | |
24/ ![]() | 0.92 | 0.94 | 0.95 | 0.93 | 0.93 | |
![]() | 0.92 | 0.94 | 0.95 | 0.94 | 0.93 | |
![]() | 0.91 | 0.92 | 0.94 | 0.92 | 0.92 | |
β1 | 12/ ![]() | 0.90 | 0.94 | 0.94 | 0.92 | 0.92 |
![]() | 0.89 | 0.93 | 0.94 | 0.92 | 0.92 | |
![]() | 0.87 | 0.91 | 0.93 | 0.90 | 0.90 | |
24/ ![]() | 0.92 | 0.94 | 0.95 | 0.94 | 0.94 | |
![]() | 0.92 | 0.94 | 0.94 | 0.93 | 0.93 | |
![]() | 0.92 | 0.94 | 0.95 | 0.93 | 0.93 | |
β2 | 12/ ![]() | 0.87 | 0.91 | 0.94 | 0.91 | 0.90 |
![]() | 0.86 | 0.90 | 0.93 | 0.90 | 0.88 | |
![]() | 0.85 | 0.89 | 0.94 | 0.90 | 0.88 | |
24/ ![]() | 0.91 | 0.93 | 0.94 | 0.93 | 0.92 | |
![]() | 0.91 | 0.93 | 0.95 | 0.93 | 0.92 | |
![]() | 0.91 | 0.93 | 0.95 | 0.93 | 0.92 |
Note. Shown are results for estimators that acknowledge negative correlation in the selection indicators. See Section 6 for details.
Estimated coverage of Wald-based 95% confidence intervals in Simulation 1 using methods proposed in Sections 5.2 and 5.3, using the bias-corrected weighted estimator, , as the point estimate
. | ![]() | . | . | . | . | . |
---|---|---|---|---|---|---|
Parameter . | Design . | ![]() | ![]() | ![]() | ![]() | ![]() |
β0 | 12/ ![]() | 0.89 | 0.93 | 0.94 | 0.91 | 0.90 |
![]() | 0.89 | 0.92 | 0.95 | 0.92 | 0.91 | |
![]() | 0.85 | 0.89 | 0.93 | 0.89 | 0.88 | |
24/ ![]() | 0.92 | 0.94 | 0.95 | 0.93 | 0.93 | |
![]() | 0.92 | 0.94 | 0.95 | 0.94 | 0.93 | |
![]() | 0.91 | 0.92 | 0.94 | 0.92 | 0.92 | |
β1 | 12/ ![]() | 0.90 | 0.94 | 0.94 | 0.92 | 0.92 |
![]() | 0.89 | 0.93 | 0.94 | 0.92 | 0.92 | |
![]() | 0.87 | 0.91 | 0.93 | 0.90 | 0.90 | |
24/ ![]() | 0.92 | 0.94 | 0.95 | 0.94 | 0.94 | |
![]() | 0.92 | 0.94 | 0.94 | 0.93 | 0.93 | |
![]() | 0.92 | 0.94 | 0.95 | 0.93 | 0.93 | |
β2 | 12/ ![]() | 0.87 | 0.91 | 0.94 | 0.91 | 0.90 |
![]() | 0.86 | 0.90 | 0.93 | 0.90 | 0.88 | |
![]() | 0.85 | 0.89 | 0.94 | 0.90 | 0.88 | |
24/ ![]() | 0.91 | 0.93 | 0.94 | 0.93 | 0.92 | |
![]() | 0.91 | 0.93 | 0.95 | 0.93 | 0.92 | |
![]() | 0.91 | 0.93 | 0.95 | 0.93 | 0.92 |
. | ![]() | . | . | . | . | . |
---|---|---|---|---|---|---|
Parameter . | Design . | ![]() | ![]() | ![]() | ![]() | ![]() |
β0 | 12/ ![]() | 0.89 | 0.93 | 0.94 | 0.91 | 0.90 |
![]() | 0.89 | 0.92 | 0.95 | 0.92 | 0.91 | |
![]() | 0.85 | 0.89 | 0.93 | 0.89 | 0.88 | |
24/ ![]() | 0.92 | 0.94 | 0.95 | 0.93 | 0.93 | |
![]() | 0.92 | 0.94 | 0.95 | 0.94 | 0.93 | |
![]() | 0.91 | 0.92 | 0.94 | 0.92 | 0.92 | |
β1 | 12/ ![]() | 0.90 | 0.94 | 0.94 | 0.92 | 0.92 |
![]() | 0.89 | 0.93 | 0.94 | 0.92 | 0.92 | |
![]() | 0.87 | 0.91 | 0.93 | 0.90 | 0.90 | |
24/ ![]() | 0.92 | 0.94 | 0.95 | 0.94 | 0.94 | |
![]() | 0.92 | 0.94 | 0.94 | 0.93 | 0.93 | |
![]() | 0.92 | 0.94 | 0.95 | 0.93 | 0.93 | |
β2 | 12/ ![]() | 0.87 | 0.91 | 0.94 | 0.91 | 0.90 |
![]() | 0.86 | 0.90 | 0.93 | 0.90 | 0.88 | |
![]() | 0.85 | 0.89 | 0.94 | 0.90 | 0.88 | |
24/ ![]() | 0.91 | 0.93 | 0.94 | 0.93 | 0.92 | |
![]() | 0.91 | 0.93 | 0.95 | 0.93 | 0.92 | |
![]() | 0.91 | 0.93 | 0.95 | 0.93 | 0.92 |
Note. Shown are results for estimators that acknowledge negative correlation in the selection indicators. See Section 6 for details.
Estimated coverage of Wald-based 95% confidence intervals in Simulation 2 using methods proposed in Sections 5.2 and 5.3, using the bias-corrected weighted estimator, , as the point estimate
. | . | ![]() | ![]() | ||||
---|---|---|---|---|---|---|---|
Parameter . | Design . | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
β0 | #1 | 0.86 | 0.92 | 0.94 | 0.92 | 0.93 | 0.94 |
#2 | 0.86 | 0.92 | 0.94 | 0.92 | 0.94 | 0.95 | |
#3 | 0.80 | 0.87 | 0.94 | 0.91 | 0.93 | 0.94 | |
β1 | #1 | 0.91 | 0.96 | 0.94 | 0.94 | 0.95 | 0.95 |
#2 | 0.91 | 0.96 | 0.94 | 0.94 | 0.95 | 0.95 | |
#3 | 0.84 | 0.90 | 0.91 | 0.93 | 0.94 | 0.94 | |
β2 | #1 | 0.91 | 0.96 | 0.94 | 0.94 | 0.95 | 0.95 |
#2 | 0.91 | 0.96 | 0.94 | 0.94 | 0.95 | 0.95 | |
#3 | 0.84 | 0.90 | 0.91 | 0.93 | 0.94 | 0.94 | |
β3 | #1 | 0.85 | 0.91 | 0.94 | 0.92 | 0.93 | 0.95 |
#2 | 0.85 | 0.91 | 0.94 | 0.92 | 0.94 | 0.95 | |
#3 | 0.79 | 0.86 | 0.94 | 0.91 | 0.92 | 0.94 | |
β4 | #1 | 0.78 | 0.85 | 0.88 | 0.92 | 0.93 | 0.95 |
#2 | 0.87 | 0.93 | 0.95 | 0.91 | 0.92 | 0.93 | |
#3 | 0.84 | 0.90 | 0.95 | 0.93 | 0.94 | 0.95 |
. | . | ![]() | ![]() | ||||
---|---|---|---|---|---|---|---|
Parameter . | Design . | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
β0 | #1 | 0.86 | 0.92 | 0.94 | 0.92 | 0.93 | 0.94 |
#2 | 0.86 | 0.92 | 0.94 | 0.92 | 0.94 | 0.95 | |
#3 | 0.80 | 0.87 | 0.94 | 0.91 | 0.93 | 0.94 | |
β1 | #1 | 0.91 | 0.96 | 0.94 | 0.94 | 0.95 | 0.95 |
#2 | 0.91 | 0.96 | 0.94 | 0.94 | 0.95 | 0.95 | |
#3 | 0.84 | 0.90 | 0.91 | 0.93 | 0.94 | 0.94 | |
β2 | #1 | 0.91 | 0.96 | 0.94 | 0.94 | 0.95 | 0.95 |
#2 | 0.91 | 0.96 | 0.94 | 0.94 | 0.95 | 0.95 | |
#3 | 0.84 | 0.90 | 0.91 | 0.93 | 0.94 | 0.94 | |
β3 | #1 | 0.85 | 0.91 | 0.94 | 0.92 | 0.93 | 0.95 |
#2 | 0.85 | 0.91 | 0.94 | 0.92 | 0.94 | 0.95 | |
#3 | 0.79 | 0.86 | 0.94 | 0.91 | 0.92 | 0.94 | |
β4 | #1 | 0.78 | 0.85 | 0.88 | 0.92 | 0.93 | 0.95 |
#2 | 0.87 | 0.93 | 0.95 | 0.91 | 0.92 | 0.93 | |
#3 | 0.84 | 0.90 | 0.95 | 0.93 | 0.94 | 0.95 |
Note. Shown are results for estimators that acknowledge negative correlation in the selection indicators. See Section 6 for details.
Estimated coverage of Wald-based 95% confidence intervals in Simulation 2 using methods proposed in Sections 5.2 and 5.3, using the bias-corrected weighted estimator, , as the point estimate
. | . | ![]() | ![]() | ||||
---|---|---|---|---|---|---|---|
Parameter . | Design . | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
β0 | #1 | 0.86 | 0.92 | 0.94 | 0.92 | 0.93 | 0.94 |
#2 | 0.86 | 0.92 | 0.94 | 0.92 | 0.94 | 0.95 | |
#3 | 0.80 | 0.87 | 0.94 | 0.91 | 0.93 | 0.94 | |
β1 | #1 | 0.91 | 0.96 | 0.94 | 0.94 | 0.95 | 0.95 |
#2 | 0.91 | 0.96 | 0.94 | 0.94 | 0.95 | 0.95 | |
#3 | 0.84 | 0.90 | 0.91 | 0.93 | 0.94 | 0.94 | |
β2 | #1 | 0.91 | 0.96 | 0.94 | 0.94 | 0.95 | 0.95 |
#2 | 0.91 | 0.96 | 0.94 | 0.94 | 0.95 | 0.95 | |
#3 | 0.84 | 0.90 | 0.91 | 0.93 | 0.94 | 0.94 | |
β3 | #1 | 0.85 | 0.91 | 0.94 | 0.92 | 0.93 | 0.95 |
#2 | 0.85 | 0.91 | 0.94 | 0.92 | 0.94 | 0.95 | |
#3 | 0.79 | 0.86 | 0.94 | 0.91 | 0.92 | 0.94 | |
β4 | #1 | 0.78 | 0.85 | 0.88 | 0.92 | 0.93 | 0.95 |
#2 | 0.87 | 0.93 | 0.95 | 0.91 | 0.92 | 0.93 | |
#3 | 0.84 | 0.90 | 0.95 | 0.93 | 0.94 | 0.95 |
. | . | ![]() | ![]() | ||||
---|---|---|---|---|---|---|---|
Parameter . | Design . | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
β0 | #1 | 0.86 | 0.92 | 0.94 | 0.92 | 0.93 | 0.94 |
#2 | 0.86 | 0.92 | 0.94 | 0.92 | 0.94 | 0.95 | |
#3 | 0.80 | 0.87 | 0.94 | 0.91 | 0.93 | 0.94 | |
β1 | #1 | 0.91 | 0.96 | 0.94 | 0.94 | 0.95 | 0.95 |
#2 | 0.91 | 0.96 | 0.94 | 0.94 | 0.95 | 0.95 | |
#3 | 0.84 | 0.90 | 0.91 | 0.93 | 0.94 | 0.94 | |
β2 | #1 | 0.91 | 0.96 | 0.94 | 0.94 | 0.95 | 0.95 |
#2 | 0.91 | 0.96 | 0.94 | 0.94 | 0.95 | 0.95 | |
#3 | 0.84 | 0.90 | 0.91 | 0.93 | 0.94 | 0.94 | |
β3 | #1 | 0.85 | 0.91 | 0.94 | 0.92 | 0.93 | 0.95 |
#2 | 0.85 | 0.91 | 0.94 | 0.92 | 0.94 | 0.95 | |
#3 | 0.79 | 0.86 | 0.94 | 0.91 | 0.92 | 0.94 | |
β4 | #1 | 0.78 | 0.85 | 0.88 | 0.92 | 0.93 | 0.95 |
#2 | 0.87 | 0.93 | 0.95 | 0.91 | 0.92 | 0.93 | |
#3 | 0.84 | 0.90 | 0.95 | 0.93 | 0.94 | 0.95 |
Note. Shown are results for estimators that acknowledge negative correlation in the selection indicators. See Section 6 for details.
From the first column of results in Tables 2 and 3, use of the plug-in estimator, , generally leads to undercoverage when
or 15, respectively. This is especially the case for design #3 where the coverage ranges from 0.85 to 0.87 in Table 2, and is even more pronounced in the more complex setting of Simulation 2 with coverage ranging from 0.79 to 0.84. When
is increased to 24 or 50 the performance of confidence intervals based on
is improved, though there is still undercoverage for all parameters.
Use of any of the proposed small-sample corrections yields improved coverage compared to the unadjusted approach. Among these, the Mancl and DeRouen-type estimator, , generally exhibits the closest to nominal coverage when the standard normal distribution is used for confidence interval construction. When the
distribution is used for confidence interval construction (see Web Appendix J), the Mancl and DeRouen-type estimator tends to be conservative, particularly when working exchangeable is used or
, while the other bias-corrected variance estimators see improved coverage, though generally remain anticonservative. Across all variance estimators, regardless of whether the normal or t distribution is used for confidence interval construction, the coverage resulting from using estimators with
is generally very similar to the coverage when using
. Furthermore, the difference in coverage when using the variance estimators ignoring the negative correlation in the selection indicators generally decreases as
increases, with the directionality of the difference for smaller
depending on the parameter in question, design used, and the degree of within-cluster correlation. The findings also hold when working exchangeable correlation structure is used as opposed to working independence.
7 Data Application
7.1 Analysis
For the purpose of applying the methods proposed in Section 5, we focus attention on a logistic regression model for the binary outcome of low birthweight as a function of seven covariates: maternal age, in years (categorized as 1: <20; 2: 20–35; 3: 36–49); maternal weight, in kg (categorized as 1: <56; 2: 56–59; 3: 60–64; 4: ≥65); birth order (categorized as 1: first; 2: second or third; 3: fourth or higher); sex of the newborn (1: female; 0: male); whether the mother had a previous abortion (1: yes; 0: no), whether the woman had a previous stillbirth (1: yes; 0: no); and, the district the health center/district hospital is located in (1: Rulindo; 0: Gakenke). For this model, we estimated and report adjusted odds ratios (OR) based on: , the solution to the unweighted estimating equations given in Section 3.2;
, based on solving the weighted estimating equations given at the start of Section 5; and, the bias-corrected estimator,
, defined in Section 5.3. In addition, we calculated Wald-based 95% confidence intervals (CI) for the bias-corrected estimator of each OR based on
, the plug-in estimator given by expression (5); the degrees-of-freedom adjusted estimator,
; and the Mancl and DeRouen-type estimator,
. Finally, as in the simulation studies, working independence was adopted throughout (see Section 6.3).
7.2 Results
Table 4 provides a summary of the results. We generally see a difference in the unweighted, weighted, and bias-corrected point estimates, though the story (direction) remains the same: among women weighing less than 56 kg, giving birth to a male as their first child in Gakenke district, with no history of abortion or previous stillbirth, younger women (<20) and older women (ages 36–49) have a higher odds of having a low birthweight baby than women ages 25–34. After adjusting for the other covariates in the model, the higher the mother's weight the lower the odds of having a low birthweight baby. Similarly, after adjusting for the other covariates in the model, as birth order increases women have lower odds of having a low birthweight baby than women having their first child, female newborns have a higher odds of being low birthweight compared to male newborns, and women who have had a previous abortion or have had a previous stillbirth have a higher odds of having a low birthweight birth compared to women who have no history of previous abortions or no history of previous stillbirths, respectively.
. | Point estimate . | 95% CI for ![]() | ||||
---|---|---|---|---|---|---|
![]() | ![]() | ![]() | ![]() | ![]() | ![]() | |
Maternal age | ||||||
<20 years | 1.15 | 1.11 | 1.12 | (0.78, 1.60) | (0.62, 2.02) | (0.78, 1.61) |
36–49 years | 1.90 | 2.08 | 2.01 | (1.40, 2.88) | (1.11, 3.63) | (0.97, 4.18) |
Maternal weight | ||||||
56–59 kg | 0.70 | 0.60 | 0.59 | (0.36, 0.97) | (0.26, 1.34) | (0.26, 1.34) |
60–64 kg | 0.32 | 0.26 | 0.26 | (0.14, 0.49) | (0.09, 0.74) | (0.10, 0.71) |
≥65 kg | 0.29 | 0.23 | 0.24 | (0.11, 0.51) | (0.07, 0.84) | (0.08, 0.74) |
Birth order | ||||||
2–3 | 0.42 | 0.44 | 0.49 | (0.32, 0.75) | (0.25, 0.99) | (0.33, 0.73) |
4+ | 0.15 | 0.11 | 0.13 | (0.08, 0.22) | (0.05, 0.31) | (0.05, 0.36) |
Female newborn | 1.40 | 1.68 | 1.72 | (1.07, 2.75) | (0.73, 3.74) | (0.93, 3.19) |
Previous abortion | 1.95 | 2.25 | 2.17 | (1.32, 3.57) | (0.96, 4.93) | (0.97, 4.87) |
Previous stillbirth | 1.98 | 2.51 | 2.31 | (1.18, 4.52) | (0.76, 6.98) | (1.05, 5.09) |
Rulindo District | 0.51 | 0.28 | 0.26 | (0.08, 0.92) | (0.03, 2.08) | (0.05, 1.38) |
. | Point estimate . | 95% CI for ![]() | ||||
---|---|---|---|---|---|---|
![]() | ![]() | ![]() | ![]() | ![]() | ![]() | |
Maternal age | ||||||
<20 years | 1.15 | 1.11 | 1.12 | (0.78, 1.60) | (0.62, 2.02) | (0.78, 1.61) |
36–49 years | 1.90 | 2.08 | 2.01 | (1.40, 2.88) | (1.11, 3.63) | (0.97, 4.18) |
Maternal weight | ||||||
56–59 kg | 0.70 | 0.60 | 0.59 | (0.36, 0.97) | (0.26, 1.34) | (0.26, 1.34) |
60–64 kg | 0.32 | 0.26 | 0.26 | (0.14, 0.49) | (0.09, 0.74) | (0.10, 0.71) |
≥65 kg | 0.29 | 0.23 | 0.24 | (0.11, 0.51) | (0.07, 0.84) | (0.08, 0.74) |
Birth order | ||||||
2–3 | 0.42 | 0.44 | 0.49 | (0.32, 0.75) | (0.25, 0.99) | (0.33, 0.73) |
4+ | 0.15 | 0.11 | 0.13 | (0.08, 0.22) | (0.05, 0.31) | (0.05, 0.36) |
Female newborn | 1.40 | 1.68 | 1.72 | (1.07, 2.75) | (0.73, 3.74) | (0.93, 3.19) |
Previous abortion | 1.95 | 2.25 | 2.17 | (1.32, 3.57) | (0.96, 4.93) | (0.97, 4.87) |
Previous stillbirth | 1.98 | 2.51 | 2.31 | (1.18, 4.52) | (0.76, 6.98) | (1.05, 5.09) |
Rulindo District | 0.51 | 0.28 | 0.26 | (0.08, 0.92) | (0.03, 2.08) | (0.05, 1.38) |
Note. See Section 7.1 for details.
. | Point estimate . | 95% CI for ![]() | ||||
---|---|---|---|---|---|---|
![]() | ![]() | ![]() | ![]() | ![]() | ![]() | |
Maternal age | ||||||
<20 years | 1.15 | 1.11 | 1.12 | (0.78, 1.60) | (0.62, 2.02) | (0.78, 1.61) |
36–49 years | 1.90 | 2.08 | 2.01 | (1.40, 2.88) | (1.11, 3.63) | (0.97, 4.18) |
Maternal weight | ||||||
56–59 kg | 0.70 | 0.60 | 0.59 | (0.36, 0.97) | (0.26, 1.34) | (0.26, 1.34) |
60–64 kg | 0.32 | 0.26 | 0.26 | (0.14, 0.49) | (0.09, 0.74) | (0.10, 0.71) |
≥65 kg | 0.29 | 0.23 | 0.24 | (0.11, 0.51) | (0.07, 0.84) | (0.08, 0.74) |
Birth order | ||||||
2–3 | 0.42 | 0.44 | 0.49 | (0.32, 0.75) | (0.25, 0.99) | (0.33, 0.73) |
4+ | 0.15 | 0.11 | 0.13 | (0.08, 0.22) | (0.05, 0.31) | (0.05, 0.36) |
Female newborn | 1.40 | 1.68 | 1.72 | (1.07, 2.75) | (0.73, 3.74) | (0.93, 3.19) |
Previous abortion | 1.95 | 2.25 | 2.17 | (1.32, 3.57) | (0.96, 4.93) | (0.97, 4.87) |
Previous stillbirth | 1.98 | 2.51 | 2.31 | (1.18, 4.52) | (0.76, 6.98) | (1.05, 5.09) |
Rulindo District | 0.51 | 0.28 | 0.26 | (0.08, 0.92) | (0.03, 2.08) | (0.05, 1.38) |
. | Point estimate . | 95% CI for ![]() | ||||
---|---|---|---|---|---|---|
![]() | ![]() | ![]() | ![]() | ![]() | ![]() | |
Maternal age | ||||||
<20 years | 1.15 | 1.11 | 1.12 | (0.78, 1.60) | (0.62, 2.02) | (0.78, 1.61) |
36–49 years | 1.90 | 2.08 | 2.01 | (1.40, 2.88) | (1.11, 3.63) | (0.97, 4.18) |
Maternal weight | ||||||
56–59 kg | 0.70 | 0.60 | 0.59 | (0.36, 0.97) | (0.26, 1.34) | (0.26, 1.34) |
60–64 kg | 0.32 | 0.26 | 0.26 | (0.14, 0.49) | (0.09, 0.74) | (0.10, 0.71) |
≥65 kg | 0.29 | 0.23 | 0.24 | (0.11, 0.51) | (0.07, 0.84) | (0.08, 0.74) |
Birth order | ||||||
2–3 | 0.42 | 0.44 | 0.49 | (0.32, 0.75) | (0.25, 0.99) | (0.33, 0.73) |
4+ | 0.15 | 0.11 | 0.13 | (0.08, 0.22) | (0.05, 0.31) | (0.05, 0.36) |
Female newborn | 1.40 | 1.68 | 1.72 | (1.07, 2.75) | (0.73, 3.74) | (0.93, 3.19) |
Previous abortion | 1.95 | 2.25 | 2.17 | (1.32, 3.57) | (0.96, 4.93) | (0.97, 4.87) |
Previous stillbirth | 1.98 | 2.51 | 2.31 | (1.18, 4.52) | (0.76, 6.98) | (1.05, 5.09) |
Rulindo District | 0.51 | 0.28 | 0.26 | (0.08, 0.92) | (0.03, 2.08) | (0.05, 1.38) |
Note. See Section 7.1 for details.
The confidence intervals using both the degrees-of-freedom adjusted standard errors and the Mancl and DeRouen-type adjusted standard errors generally result in wider confidence intervals than those constructed using the unadjusted standard errors. In some instances, one correction maintains significance while the other does not: for the effect of stillbirth, the confidence interval using the degrees-of-freedom adjusted standard errors contains 1, suggesting that the effect is not statistically significant, whereas the confidence interval using the Mancl and DeRouen-type adjustment does not contain 1, suggesting a statistically significant effect; conversely, for the effect of maternal age (36–39), the confidence interval using the degrees-of-freedom adjusted standard errors suggests a statistically significant effect, while that constructed using the Mancl and DeRouen-type adjustment does not. Given the conflicting results, we recommend the confidence intervals using the Mancl and DeRouen-type correction to the variance estimator based on the simulation study results. The simulations, in particular Simulation 2/=15, suggest that sometimes
while at other times
, but that in general,
yields the closest to nominal coverage.
8 Discussion
In this paper, we consider cluster-based ODS in resource-limited settings. Within this context, we extend the work of Cai et al. (2001) by formally establishing the asymptotic properties of the IPW-GEE estimator, and provide an explicit expression for the asymptotic variance. In addition, motivated by our own study in Rwanda, we propose several small-sample bias corrections to both the point estimates and estimates of the asymptotic variance. Our simulation results suggest that there is no clear overarching story in terms of one approach consistently outperforming the others.
With regard to the point estimates, our simulations indicate that the bias-corrected point estimates generally reduce the bias in the point estimates, though the improvement is small in most cases. The impact of the bias-corrected standard errors, on the other hand, is much more substantial. The unadjusted standard errors result in severe undercoverage when is small, and all of the bias-corrected standard errors yield closer to nominal coverage. Among the bias corrections to the variance estimator that we considered, the Mancl and DeRouen-type correction generally yielded the closest to nominal coverage when the normal distribution was used for confidence interval construction. When the
distribution was used for confidence interval construction, the Mancl and DeRouen-type correction could be conservative, particularly when the degree of correlation was increased. While the use of the
distribution improved the coverage of the other bias corrections, there was no method that consistently performed better than the others, and these methods still suffered from some undercoverage and overcoverage, depending on the design and parameter. Other degrees of freedom have been proposed when using the t distribution for confidence interval construction, and may yield improved coverage in this setting as well. Furthermore, once
was at least 50, the standard errors taking into account the covariance of the selection indicators and the standard errors ignoring the covariance of the selection indicators, generally resulted in similar coverage, indicating that asympototically, even the naïve method does not perform so poorly. This result held for both the unadjusted and the adjusted standard errors. When
was small, however, the narrative was mixed.
For these reasons, while acknowledging that there is not one superior method, and that the decision regarding which approach to take is in the hands of the researcher, we offer our perspective on how we would proceed: bias-corrected point estimates, and confidence intervals constructed using the normal distribution together with the variance taking into account the covariance of the selection indicators, adjusted with the Mancl and DeRouen-type correction. We stress the importance of using a bias correction to the variance estimator, while using uncorrected (but weighted) point estimates will, based on our results, generally not have a substantial impact on the analysis. The observation that one approach does not consistently outperform all of the others is consistent with results from papers comparing different small-sample bias corrections to the robust sandwich estimator in the complete data setting—no single small-sample correction consistently outperforms other small-sample adjustments. Which correction performs better depends on a variety of factors, including the number of clusters, the size of the clusters, the degree of variability in the cluster sizes, the degree of correlation among the clusters, and the type (cluster-level vs. individual-level) and distribution of the covariates in the model of interest. The theory that is presented relies on asymptotics, though in small samples, we are not aware of theory establishing small-sample behavior. Researchers must make decisions on what to do based on the characteristics of their data, and the objectives of their analysis. Finally, the simulations in this paper solely consider binary outcomes; more research is needed to determine the performance of these methods for continuous or count outcomes.
The focus of this paper is how to carry out estimation and inference for a data set collected through a cluster-based ODS design, and on considerations that a researcher must make when the number of clusters in the subsample is small. An open question, however, is how to identify an optimal sampling design for selecting the clusters at the design stage. As explained in Section 2.1, the specific design we adopted in Rwanda arose through efforts to balance financial/logistic considerations with statistical power/efficiency, with the latter assumed to be driven, in part at least, by the prevalence of the outcome and the sample size. While this informal strategy may intuitively be reasonable, how to make optimal (or at the very least wise) choices regarding a stratification scheme or a choice of model for in this context is a topic of our ongoing research. Another approach to increasing efficiency is to use pieces of information known at the design stage to calibrate the weights used in the weighted estimating equations (Breidt et al., 2017; Rivera-Rodriguez et al., 2019); this, too, is an area for future research.
Data Availability Statement
The data that support the findings of this paper are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.
Acknowledgments
Ms. Sauer was funded by the Harvard T.H. Chan School of Public Health NIEHS-sponsored Environmental Training Grant T32ES007142, and the Rose Traveling Fellowship Program in Chronic Disease Epidemiology and Biostatistics. Dr. Haneuse was supported by NIH grant R01 HL094786. Dr. Rivera-Rodriguez was supported by University of Auckland-Science/FRDF New Staff - 3716994. The Rwanda example included data collected as part of the All Babies Count program funded by Grand Challenges Canada Saving Lives at Birth, implemented by Partners In Health/Inshuti Mu Buzima and the Ministry of Health. The authors would like to thank Alphonse Nshimyiryo, Catherine Kirk, Ibrahim Hakizimana, and Robert Mutsinzi for their support in coordination and implementation of data collection in Rwanda.