Uniformly minimum variance conditionally unbiased estimation in multi-arm multi-stage clinical trials

SUMMARY
Multi-arm multi-stage clinical trials compare several experimental treatments with a control treatment, with poorly performing treatments dropped at interim analyses. This leads to inferential challenges, including the construction of unbiased treatment effect estimators. A number of estimators which are unbiased conditional on treatment selection have been proposed, but these are specific to certain selection rules, may ignore the comparison to the control and are not all minimum variance. We obtain estimators for treatment effects compared to the control that are uniformly minimum variance unbiased conditional on selection with any specified rule or stopping for futility.


INTRODUCTION
Multi-arm multi-stage clinical trials compare several treatments to a common control treatment in a single trial, with treatments dropped at interim analyses if, based on observed data, they are not sufficiently promising. Such designs have been used by, for example, MacArthur et al. (2013) and Barker et al. (2014). The approach can yield sample size reduction and administrative savings relative to running several two-arm trials, but presents challenges in terms of statistical analysis similar to those of post-model selection inference (Efron, 2014).
Proposed analysis methods have mainly focused on frequentist hypothesis tests (Thall et al., 1988, 1989; Stallard & Todd, 2003; Stallard & Friede, 2008; Magirr et al., 2012; Wason et al., 2017), with less work on estimation. Cohen & Sackrowitz (1989) consider two-stage designs with the treatment with the highest observed stage 1 average continuing to stage 2, with equal variances for the averages for each treatment in stage 1 and the stage 2 treatment. They derive an estimator of the stage 2 treatment mean that is uniformly minimum variance conditionally unbiased given the observed ordering of stage 1 treatment means. Bowden & Glimm (2008) extend the method to allow different variances for the treatment means and continuation to stage 2 of the s treatments with the largest observed stage 1 means. They provide expressions for uniformly minimum variance conditionally unbiased estimators for the means for these s treatment arms, again conditioning on the ordering of the stage 1 averages. Like Cohen & Sackrowitz (1989), they do not consider estimation relative to a control group. Kimani et al. (2013), Bowden & Glimm (2014) and Robertson et al. (2016) derive conditionally unbiased estimators for the difference between selected treatments and a control in multi-arm multi-stage trials. Kimani et al. (2013) allow for stopping for futility at stage 1 of a two-stage trial assuming common variances for averages in different arms in stage 1, while Bowden & Glimm (2014) and Robertson et al. (2016) allow for different variances in different arms and correlation between these respectively. Bowden & Glimm (2014) also allow for more than two stages. As discussed below, the Kimani et al. (2013) and Robertson et al. (2016) estimators are conditionally unbiased but not generally minimum variance. We obtain minimum variance conditionally unbiased estimators in these settings.

© 2018 Biometrika Trust. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
These methods present estimators that are unbiased conditional on selection of the treatment arms with the largest observed stage 1 averages, possibly with the additional condition that these means are sufficiently larger than that for the control. In practice, other selection rules may be used, such as in § 3 below. We show how uniformly minimum variance conditionally unbiased estimators may be obtained for comparisons with the control conditioning on a treatment not being dropped from the trial using any specified rule for selection or stopping for futility.

2•1. A uniformly minimum variance conditionally unbiased estimator
Consider a multi-arm multi-stage clinical trial with up to $r$ stages where, in stage $j$, patients are randomized to a control treatment, treatment 0, or experimental treatments, labelled $i$ ($i \in \mathcal{S}_j$), with $\mathcal{S}_1 \supseteq \cdots \supseteq \mathcal{S}_r$. Without loss of generality, label treatments such that $\mathcal{S}_j = \{1, \ldots, s_j\}$ for some $s_j$ ($j = 1, \ldots, r$). Denoting the total number of experimental treatments, $s_1$, by $k$, let $r_i = \max\{j : s_j \geq i\}$ ($i = 1, \ldots, k$), so that treatment $i$ is included in stages $1, \ldots, r_i$, and set $r_0 = r$.
Let $X_{ij}$ denote the stagewise average for treatment $i$ in stage $j$ ($i = 0, \ldots, k$; $j = 1, \ldots, r_i$), with […]. Assume that the $X_{ij}$ are jointly sufficient for $\mu_0, \ldots, \mu_k$, with $X_{ij} \sim N(\mu_i, \tau_{ij}^{-1})$ independent with $\tau_{ij}$ known. Other cases, such as in § 3 below, may use normal approximations or estimated variances. Experimental treatments are selected to continue, along with the control, to stage $j + 1$ depending on $X_j$ according to some prespecified rule. Some possible rules are discussed below.
Define $\theta_i = \mu_i - \mu_0$ ($i = 1, \ldots, k$). We wish to estimate $\theta_i$ ($i = 1, \ldots, s_r$). In particular, we will obtain uniformly minimum variance conditionally unbiased estimators of $\theta_1, \ldots, \theta_{s_r}$ conditional on the event, $Q$, that treatments $1, \ldots, s_r$ are selected to continue to the end of the trial according to the prespecified selection rule. Let […]. The Appendix shows that $\theta_{k+1}$ is orthogonal to $\theta_i$, in that the information matrix term $i_{\theta_i \theta_{k+1}} = 0$ (Cox & Reid, 1987) ($i = 1, \ldots, k$), and $Z$ is complete and sufficient for $\theta = (\theta_1, \ldots, \theta_{k+1})^T$. Let $Y_i = X_{ir} - X_{0r}$ ($i = 1, \ldots, s_r$); then $Y_i$ is unbiased for $\theta_i$ and, since $Y_i$ is independent of $X_{r-1}$ and hence of $Q$, is also conditionally unbiased for $\theta_i$ given $Q$. Thus, by the Rao-Blackwell theorem, $E(Y_i \mid Z, Q)$ is a uniformly minimum variance conditionally unbiased estimator for $\theta_i$ given $Q$ ($i = 1, \ldots, s_r$).
Since, given the specified selection rule, the event $Q$ depends only on $X_{r-1}$, […] (1), with the integrals taken over the region $Q$ corresponding to those $X_{r-1}$ for which treatments $1, \ldots, s_r$ will be selected to continue to stage $r$. The denominator is the probability of $Q$ given $Z$, which will be denoted by pr($Q \mid Z$). […] and, for $i \neq i'$ and any $j, j'$, $X_{ij}$ is independent of $X_{i'j'}$ and $Z_{i'}$, we have that […] (2), where $\phi_n(X; \mu, \Sigma)$ denotes the $n$-dimensional multivariate normal density with mean $\mu$ and variance matrix $\Sigma$, evaluated at $X$, with $\tilde{\mu}_i = (Z_i, \ldots, Z_i)^T$ and […]. Thus (1) gives […] (3).

An important special case arises when the selection rule does not depend on $X_{0\cdot}$, the observed averages for the control arm. In the integral over the event $Q$, the range of integration with respect to $X_{ij}$ ($i = 1, \ldots, k$; $j = 1, \ldots, r_i$) then does not depend on $X_{0\cdot}$ and the integration with respect to the elements of $X_{0\cdot}$, i.e., $X_{01}, \ldots, X_{0,r-1}$, is over the whole real line. Thus […] (4). In this case the uniformly minimum variance conditionally unbiased estimator for $\theta_i$ is the difference between the uniformly minimum variance conditionally unbiased estimator of $\mu_i$, which can be calculated ignoring the observed value of $X_{0\cdot}$, and $Z_0$, the usual uniformly minimum variance unbiased estimator of $\mu_0$, which does not depend on the selection.

The integrals in (3) generally cannot be evaluated using standard functions. A numerical approach is Monte Carlo integration with rejection sampling: simulate $X_{r-1}$ from (2), its conditional distribution given $Z$, and accept those draws that lie in $Q$ for the specified selection rule. This approach can be used for any selection rule in which treatments proceed to stage $j + 1$ depending on $X_j$.
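As a worked illustration of the quantities involved, the two-stage case ($r = 2$) can be sketched as follows. The display assumes that $Z_i$ is the precision-weighted average of the stagewise averages; this is consistent with $Z_i = X_{i1}$ for arms observed in stage 1 only and with $Z_0$ being the usual estimator of $\mu_0$, but the definition is not reproduced in the text here, so it should be read as an assumption rather than as the authors' display.

```latex
% Assumption: Z_i is the precision-weighted average of X_{i1} and X_{i2}.
\[
Z_i = \frac{\tau_{i1} X_{i1} + \tau_{i2} X_{i2}}{\tau_{i1} + \tau_{i2}},
\qquad
X_{i1} \mid Z_i \sim N\!\left( Z_i,\ \frac{1}{\tau_{i1}} - \frac{1}{\tau_{i1} + \tau_{i2}} \right).
\]
% Given Z_i, the stage 2 average is a deterministic function of X_{i1}:
\[
X_{i2} = \frac{(\tau_{i1} + \tau_{i2}) Z_i - \tau_{i1} X_{i1}}{\tau_{i2}} .
\]
% The estimator is then a ratio of integrals over the selection region Q,
% with f(X_1 | Z) the conditional density of the stage 1 averages:
\[
E(Y_i \mid Z, Q)
  = \frac{\int_Q (X_{i2} - X_{02})\, f(X_1 \mid Z)\, dX_1}
         {\int_Q f(X_1 \mid Z)\, dX_1} .
\]
```

The conditional mean and variance follow from the joint normality of $X_{i1}$ and $Z_i$, since $\mathrm{cov}(X_{i1}, Z_i) = \mathrm{var}(Z_i) = (\tau_{i1} + \tau_{i2})^{-1}$. The conditional distribution does not involve $\mu_i$, which is what makes simulation given $Z$ possible.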

2•2. Selection of the best-performing treatment in a two-stage trial
Much previous work on conditionally unbiased estimation in clinical trials with treatment selection has focused on the special case of $r = 2$ and $s_2 = 1$, with the experimental treatment having the largest observed stage 1 mean continuing along with the control to stage 2. Thus $Q$ is the event $X_{11} > m$ with $m = \max(X_{21}, \ldots, X_{k1})$. As $Q$ is independent of $X_{0\cdot}$, the uniformly minimum variance conditionally unbiased estimator for $\theta_1$ is (4). Since treatments $2, \ldots, k$ are observed in stage 1 only, we have $Z_i = X_{i1}$ ($i = 2, \ldots, k$), so that the numerator and denominator in the fractional term in (4) are respectively […] $\phi(\tilde{Z}_1)$ (Todd et al., 1996) and $1 - \Phi(\tilde{Z}_1)$, where […], confirming this to be the difference between the uniformly minimum variance conditionally unbiased estimator for $\mu_1$ ignoring the control treatment given by Bowden & Glimm (2008) […] early.

In this case, $Q$ is the event $X_{11} > \max(m, X_{01} + c)$. Since this depends on $X_{01}$, the form (3) must be used rather than (4) to give the uniformly minimum variance conditionally unbiased estimator of $\theta_1$. The integral in (3) is taken over $X_{11}$ and $X_{01}$, but may be rewritten in terms of $X_{11}$ and $(X_{11} - X_{01})$, noting that […] with $v_1$ as above and […] is a truncated bivariate normal expectation and can be evaluated using Rosenbaum (1961) to obtain the uniformly minimum variance conditionally unbiased estimator […], where […] standard bivariate normal with correlation $\rho$. This is not the estimator proposed by Kimani et al. (2013), which is conditionally unbiased but not minimum variance.

Robertson et al. (2016) also consider two-stage trials with early stopping for futility, though they do not assume equal variances and consider ranking by standardized observed stage 1 treatment effect estimates, […]. In constructing their estimators, they condition on statistics based on the observed treatment effects, $X_{ij} - X_{0j}$ ($i = 1, \ldots, k$; $j = 1, 2$). Since these may not be sufficient, for example in the case of normal data with a common control, their estimator is also conditionally unbiased but not minimum variance; indeed they show that in the common variance case it has larger variance than the estimator proposed by Kimani et al. (2013).
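For the select-the-best rule without futility stopping, the estimator reduces to a truncated normal mean, and a minimal sketch of that calculation is given below. It assumes that $Z_i$ is the precision-weighted average of the two stagewise averages, so that $X_{11}$ given $Z_1$ is normal with mean $Z_1$ and variance $\tau_{12}/\{\tau_{11}(\tau_{11} + \tau_{12})\}$. The function names are hypothetical, and the expression is derived under that assumption rather than copied from (4).

```python
import math


def std_normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_sf(x):
    """Standard normal upper tail probability, 1 - Phi(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def umvcue_select_best(z1, z0, m, tau11, tau12):
    """Estimate theta_1 when arm 1 continues because X_11 > m.

    z1, z0       : precision-weighted two-stage averages (assumed form of Z_i)
    m            : largest stage 1 average among the dropped arms
    tau11, tau12 : known precisions of the arm 1 stage 1 and stage 2 averages
    """
    # Conditional standard deviation of X_11 given Z_1 (assumption, see lead-in).
    sd = math.sqrt(tau12 / (tau11 * (tau11 + tau12)))
    a = (m - z1) / sd
    # The mean of N(z1, sd^2) truncated to (m, inf) is z1 + sd * hazard.
    hazard = std_normal_pdf(a) / std_normal_sf(a)
    # X_12 is linear in X_11 given Z_1, so E(X_12 | Z_1, Q) follows directly;
    # subtracting Z_0 gives the estimate of theta_1.
    return z1 - (tau11 / tau12) * sd * hazard - z0
```

The correction term is always positive, so the estimate lies below $Z_1 - Z_0$, and it vanishes as $m$ moves far below $Z_1$, where selection carries almost no information.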

NUMERICAL EXAMPLE
The ADVENT trial (MacArthur et al., 2013) was a two-stage study comparing 125 mg, 250 mg and 500 mg doses of crofelemer with a placebo in noninfectious chronic diarrhoea in HIV-seropositive patients. The primary endpoint was clinical response, defined as at most two watery stools per week during at least two of four weeks of treatment. At the end of stage 1, based on data from 200 patients randomized equally between the four treatment arms, a single dose of crofelemer would continue with the control to stage 2, with a further 150 patients randomized equally between these two groups. In the absence of safety concerns, the dose selected would be the lowest dose with an observed clinical response rate within two percentage points of the best-performing dose. Although not explicitly stated by MacArthur et al. (2013), we assume that the trial would have stopped at the first stage if the best-performing dose did not have observed clinical response rate at least two percentage points above the placebo. The trial was analysed using the method of Posch et al. (2005) to control the familywise Type I error rate, but apparently no attempt was made to obtain unbiased estimators of the treatment effect for the selected dose.

The U.S. Food and Drug Administration (2012) report gives results of the two stages of the study. In stage 1, 50, 44, 54 and 46 patients received the placebo and the three doses respectively, with 1, 9, 5 and 9 patients in these groups showing clinical response. In stage 2, 88 and 92 further patients received the placebo and 125 mg crofelemer respectively, with 10 and 15 demonstrating clinical response. A naive estimate of the effect of the 125 mg dose relative to placebo is thus 24/136 − 11/138 = 0•097.

Table 1. Simulated probability of selection and bias (with root mean squared error in parentheses), with all values multiplied by 100, for the estimator given by (3) and the naive estimator for $\theta_i$ conditional on selection of treatment $i$ ($i = 1, \ldots, 3$) for a range of $\mu$ values ($\mu_0$ = 0•02 in all cases)

For illustrative purposes, we treat estimated event rates as asymptotically normally distributed with variances based on the observed responses in each group and stage and set $(X_{01}, \ldots, X_{31})$ = (0•02, 0•2045, 0•0926, 0•1956), $(\tau_{01}, \ldots, \tau_{31})$ = (2551, 270•4, 642•1, 292•3), $(X_{02}, X_{12})$ = (0•1136, 0•1630) and $(\tau_{02}, \tau_{12})$ = (873•7, 674•2). Based on these results, we obtained an estimate from (3), calculating the integrals using 100 000 simulations conditional on 125 mg being the lowest dose with observed clinical response rate within two percentage points of the best-performing dose, giving an estimate of 0•114.
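The rejection-sampling calculation described above can be sketched as follows, using only the stagewise averages and precisions quoted in the text. Several elements are assumptions rather than the authors' code: $Z_i$ is taken to be the precision-weighted average of the two stagewise averages, the stage 1 averages of the dropped doses are held at their observed values because $Z_i = X_{i1}$ for those arms, and the selection event is encoded from the verbal rule, including the assumed futility condition. The names `in_Q` and `estimate_theta1` are hypothetical.

```python
import random

# Stagewise averages and precisions quoted in the text
# (index 0 = placebo, 1 = 125 mg, 2 = 250 mg, 3 = 500 mg).
X1 = [0.02, 0.2045, 0.0926, 0.1956]
TAU1 = [2551.0, 270.4, 642.1, 292.3]
X2 = {0: 0.1136, 1: 0.1630}
TAU2 = {0: 873.7, 1: 674.2}
MARGIN = 0.02  # two percentage points


def in_Q(stage1):
    """Assumed encoding of the selection event for the 125 mg dose."""
    best = max(stage1[1:])
    # 125 mg is the lowest dose, so it is selected if it is within MARGIN of
    # the best dose; the trial continues only if the best dose beats placebo
    # by at least MARGIN.
    return stage1[1] >= best - MARGIN and best >= stage1[0] + MARGIN


def estimate_theta1(n_sims=100_000, seed=1):
    rng = random.Random(seed)
    arms = (0, 1)  # arms observed in both stages
    # Assumed form of Z_i: precision-weighted average over the two stages.
    z = {i: (TAU1[i] * X1[i] + TAU2[i] * X2[i]) / (TAU1[i] + TAU2[i]) for i in arms}
    # Conditional sd of X_i1 given Z_i under that assumption.
    sd = {i: (TAU2[i] / (TAU1[i] * (TAU1[i] + TAU2[i]))) ** 0.5 for i in arms}
    total, accepted = 0.0, 0
    for _ in range(n_sims):
        stage1 = list(X1)  # dropped doses stay at their observed values
        for i in arms:
            stage1[i] = rng.gauss(z[i], sd[i])
        if not in_Q(stage1):
            continue  # reject draws outside the selection region
        # Stage 2 averages implied by Z_i and the simulated stage 1 averages.
        s2 = {i: ((TAU1[i] + TAU2[i]) * z[i] - TAU1[i] * stage1[i]) / TAU2[i] for i in arms}
        total += s2[1] - s2[0]
        accepted += 1
    return total / accepted, accepted / n_sims
```

With these inputs the sketch should return an estimate close to the 0•114 reported above, with roughly half of the simulated draws accepted. The futility condition is almost never binding here, because the observed 500 mg response rate is far above plausible placebo values.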
The naive estimator is conditionally biased, overestimating the true effect. The bias is relatively small, but in some cases it is, like the difference between the naive and new estimates for the ADVENT data reported above, close to the difference in clinical response rates of two percentage points considered important in the trial design. Only settings with $\mu$ such that the highest dose is nearly always selected have bias near zero. The estimator (3) has bias near zero, though given that the derivation is based on assumed normality with known variance, it may not be exactly unbiased; the simulated bias is negative in all cases, suggesting that the true treatment effect is slightly underestimated. The root mean squared errors of the two estimators are similar, though with some suggestion that the naive estimator has slightly smaller mean squared error in situations in which it has larger bias, consistent with theoretical results in settings in which the single most promising treatment is selected (Stallard et al., 2008; Bauer et al., 2010). Given the simulation results, it is interesting that the estimate from (3) using the observed ADVENT trial data is larger than the naive estimate. This appears to be due to the large difference between $X_{01}$ and $X_{02}$, the placebo arm means in the two stages. A full analysis might involve investigation of possible causes for this difference.
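The direction of the naive bias can be illustrated with a small simulation. This is a toy sketch and not the simulation study summarized in Table 1: it assumes a two-stage design in which the arm with the largest stage 1 average continues, normal averages with known precisions, and equal true means, so that by symmetry the bias is the same whichever arm is selected. The function name `simulate_bias` and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_bias(mu, tau1=1.0, tau2=1.0, n_trials=50_000, seed=2):
    """Return simulated bias of the naive and the stage-2-only estimators.

    mu : true means, with mu[0] the control and mu[1:] the experimental arms
    """
    rng = random.Random(seed)
    sd1, sd2 = tau1 ** -0.5, tau2 ** -0.5
    naive_err, stage2_err = [], []
    for _ in range(n_trials):
        stage1 = [rng.gauss(m, sd1) for m in mu]
        # Toy rule: the experimental arm with the largest stage 1 average continues.
        best = max(range(1, len(mu)), key=lambda i: stage1[i])
        s2_best = rng.gauss(mu[best], sd2)
        s2_ctrl = rng.gauss(mu[0], sd2)
        theta = mu[best] - mu[0]
        # Naive estimator: pool both stages as if no selection had happened.
        z_best = (tau1 * stage1[best] + tau2 * s2_best) / (tau1 + tau2)
        z_ctrl = (tau1 * stage1[0] + tau2 * s2_ctrl) / (tau1 + tau2)
        naive_err.append(z_best - z_ctrl - theta)
        # Stage-2-only estimator: independent of the selection, hence unbiased.
        stage2_err.append(s2_best - s2_ctrl - theta)
    return statistics.fmean(naive_err), statistics.fmean(stage2_err)


naive_bias, stage2_bias = simulate_bias([0.0, 0.0, 0.0, 0.0])
```

In this toy setting the naive estimator overestimates the effect because the selected arm's stage 1 average is the maximum of several draws, whereas the stage-2-only contrast is unbiased at the cost of discarding the stage 1 data.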
With no selection, i.e., $s_r = k$, $Z_i$ and $Z_i - Z_0$ are the usual uniformly minimum variance unbiased estimators for $\mu_i$ and $\theta_i$ ($i = 1, \ldots, k$). By the factorization theorem, sufficient statistics are the same for the conditional and unconditional likelihoods, so $Z$ is also sufficient in this case.