Adaptive clinical trial designs with blinded selection of binary composite endpoints and sample size reassessment

Summary For randomized clinical trials where a single primary binary endpoint would require unfeasibly large sample sizes, composite endpoints (CEs) are widely chosen as the primary endpoint. Despite being commonly used, CEs entail challenges in designing trials and interpreting results. Given that the components may be of different relevance and have different effect sizes, the choice of components must be made carefully. In particular, sample size calculations for composite binary endpoints depend not only on the anticipated effect sizes and event probabilities of the composite components but also on the correlation between them. However, information on the correlation between endpoints is usually not reported in the literature, which can be an obstacle to designing sound future trials. We consider two-arm randomized controlled trials with a primary composite binary endpoint and an endpoint that consists only of the clinically more important component of the CE. We propose a trial design that allows an adaptive modification of the primary endpoint based on blinded information obtained at an interim analysis. In particular, we consider a decision rule to select between a CE and its most relevant component as primary endpoint. The decision rule chooses the endpoint with the lower estimated required sample size. Additionally, the sample size is reassessed using the estimated event probabilities and correlation, and the expected effect sizes of the composite components. We investigate the statistical power and significance level under the proposed design through simulations. We show that the adaptive design is equally or more powerful than designs without adaptive modification of the primary endpoint. Moreover, the targeted power is achieved even if the correlation is misspecified at the planning stage, while the type 1 error is maintained. All the computations are implemented in R and illustrated by means of a peritoneal dialysis trial.


Introduction
Composite endpoints are frequently used in randomized controlled trials (RCTs) to provide a more comprehensive characterization of patients' response than a single endpoint. For example, Major Adverse Cardiovascular Events, a composite endpoint in cardiovascular disease that includes death, stroke, myocardial infarction, or revascularization, is commonly used for time-to-event endpoints [1] and binary endpoints [2]. The use of composite endpoints can also improve the power in situations where the incidence rates of the individual components are too low to achieve adequate power with feasible sample sizes and trial durations. Combining several components into a composite endpoint then provides a solution by increasing the incidence rate of the primary endpoint. However, using composite endpoints comes at a cost. The interpretation becomes more complex, especially when components have different effect sizes and different event probabilities. Moreover, if the treatment has an effect on only some components, the effect size of the composite will be diluted. When a composite endpoint is used as primary endpoint, regulatory agencies additionally require all components to be analysed separately as secondary endpoints [3,4,5]. In particular, it is necessary to assess the effects on the most relevant component under study. When designing a trial with a composite endpoint, sample size calculation is especially challenging since it requires anticipating the event probabilities and effect sizes of the components of the composite endpoint as well as the correlation between them. While the marginal effect sizes of each component are usually known, the correlation is often not reported.
In the context of peritoneal dialysis, the binary composite endpoint major adverse peritoneal events (MAPE) has recently been proposed [6]. This endpoint combines three individual components: (1) peritonitis, (2) peritoneal membrane deterioration, and (3) technical failure, where the peritonitis and peritoneal membrane deterioration endpoints are considered clinically more relevant. Given that this composite endpoint is relatively new, only limited data are available as a basis for sample size calculations. So under which circumstances is it best to consider the composite endpoint MAPE in terms of the power of the trial? And how could we design the trial to be robust to possible deviations from the anticipated correlation? In this work, we aim to address both questions. We propose a design in which the decision of whether it is better to consider the composite endpoint or its most relevant component as the primary endpoint is re-evaluated by choosing the endpoint with the smaller required sample size. Based on this choice, the sample size is recalculated, incorporating correlation information estimated at an interim analysis if necessary. Adaptations to endpoint selection and, in particular, designs that allow adaptive modification of the primary endpoint based on interim results are discussed in the Food and Drug Administration guidance on adaptive designs [3,7]. Regulatory agencies require the adaptation rule to be planned before the data become available and the use of appropriate statistical methods to ensure that the type 1 error is controlled.
In trials with multiple endpoints of interest, the testing strategy can be based on a single endpoint (considering the rest as secondary endpoints), on a composite endpoint combining all the endpoints, or on a multiple test using all the endpoints. The choice of the primary composite endpoint based on the trial's efficiency has been addressed by several authors. Lefkopoulou and Ryan [8] compared the use of multiple primary endpoints to a composite endpoint by means of the Asymptotic Relative Efficiency (ARE) between the corresponding hypothesis tests. Gómez and Lagakos [9] and Bofill Roig and Gómez Melis [10] proposed the ARE as a method to choose between a composite endpoint or one of its components as primary endpoint for comparing the efficacy of a treatment against a control in trials with survival data and binary data, respectively. Sozu et al. [11] evaluated the efficiency of the trial depending on the number of endpoints considered.
Several authors have proposed approaches to size trials with several primary endpoints. Sozu et al. [12] discussed sample size formulae for multiple binary endpoints. A major difficulty in sample size calculation is that the required information sometimes depends on nuisance parameters or highly variable parameters. In trials with multiple endpoints, the required sample size depends on the correlation among the considered endpoints, which needs to be taken into account in sample size calculations [3,4]. However, the correlation between endpoints is usually unknown and often not reported in the literature, which can be an obstacle to sound trial design. Several authors showed that the correlation has a large impact on the required sample size when using multiple co-primary and composite binary endpoints [12,13]. One way to address this problem is to consider an interim analysis to estimate unknown parameters, in particular the correlation. Existing work in this context has mainly focused on trials with multiple endpoints. The authors of [14] approached the sample size calculation of trials with multiple, correlated endpoints. They proposed estimators for the covariance and the correlation based on blinded data obtained at an interim analysis. The authors of [15] considered trials in which the composite endpoint and its most relevant component are two primary endpoints. They proposed an internal pilot study design where the correlation between the statistics for the composite endpoint and the most relevant component is estimated in a blinded way at an interim stage and where the sample size is then revised accordingly. Surprisingly, less attention has been given to the estimation of the correlation between the components of composite endpoints per se and to sample size reassessment in trials with primary composite endpoints.
In this paper, we propose a trial design that allows an adaptive modification of the primary endpoint based on blinded information obtained at an interim analysis and recalculates the sample size accordingly. If the composite endpoint is selected as the primary endpoint, the sample size reassessment incorporates the information of the estimated correlation. We focus on a two-arm RCT with a primary composite binary endpoint defined by two components, of which one is considered clinically more relevant. In Section 2, we present the problem setting and our main objectives. In Section 3, we propose the adaptive design with endpoint modification. We first introduce the decision rule used to adaptively select the primary endpoint. Then we discuss how this decision rule is computed based on blinded data, and the subsequent sample size recalculation. In Section 4, we extend the proposed design to trials with composite endpoints of more than two components and to trials with more than two arms. In Section 5, we apply our methods to Peritoneal Dialysis trials. Furthermore, in the supplementary material, we present an R package in which the methodology has been implemented and include an additional example in the context of cardiology trials, for which the R code is provided as a tutorial. There, we performed a blinded selection of the primary endpoint using the observed data from a conducted trial. In Section 6, we evaluate the operating characteristics of the adaptive design. We finish with a short discussion.
The R code to implement the proposed methods and reproduce the results of this article is available at https://github.com/MartaBofillRoig/eselect.

Notation, hypotheses and trial designs
Consider an RCT designed to compare two treatment groups, a control group (i = 0) and an intervention group (i = 1), each composed of n^{(i)} individuals, and denote by n = n^{(0)} + n^{(1)} the total sample size and by π = n^{(0)}/n the allocation proportion to the control group. Assume two events of interest, say ε_1 and ε_2, and assume that one event (say ε_1) is more relevant to the scientific question than the other. Let X_{ijk} denote the response of the k-th binary endpoint for the j-th patient in the i-th group (i = 0, 1; j = 1, ..., n^{(i)}; k = 1, 2). The response X_{ijk} is 1 if the event ε_k has occurred during follow-up and 0 otherwise. Let p_k^{(i)} represent the probability that ε_k occurs for a patient belonging to the i-th group. Denote by OR_k = (p_k^{(1)}/q_k^{(1)}) / (p_k^{(0)}/q_k^{(0)}) the odds ratio for the k-th endpoint, where q_k^{(i)} = 1 − p_k^{(i)}. Define the composite binary endpoint as the event that occurs whenever at least one of the endpoints ε_1 and ε_2 is observed, that is, ε_* = ε_1 ∪ ε_2, and denote by X_{ij*} the composite response, X_{ij*} = max(X_{ij1}, X_{ij2}). Let p_*^{(i)} be the event probability of the composite endpoint, p_*^{(i)} = P(X_{ij*} = 1), and let OR_* be the odds ratio for the composite endpoint ε_*. We denote by p̂_k^{(i)} the estimated probability of response for the k-th binary endpoint in group i, that is, p̂_k^{(i)} = (1/n^{(i)}) Σ_j X_{ijk}.

Trial design using the composite endpoint
Assume that initially the trial is planned with the composite endpoint ε_* = ε_1 ∪ ε_2 as the primary endpoint. The hypothesis to be tested is the null hypothesis of no treatment difference in the composite endpoint, H_*: OR_* = 1, against the alternative hypothesis of a risk reduction in the treatment group, K_*: OR_* < 1.
We test H_* using the test statistic T_{*,n}, given by

T_{*,n} = (p̂_*^{(1)} − p̂_*^{(0)}) / √( p̂_* q̂_* (1/n^{(0)} + 1/n^{(1)}) ),    (1)

where p̂_* is the pooled estimate of the composite event probability and q̂_* = 1 − p̂_*. This statistic is asymptotically N(0, 1) under H_*, and we reject the null hypothesis if T_{*,n} < z_α, where z_x denotes the x-quantile of the standard normal distribution [16]. The sample size needed to achieve a power of 1 − β at a one-sided significance level α is then

N_*(p_*^{(0)}, OR_*) = (z_{1−α} + z_{1−β})^2 ( p_*^{(0)} q_*^{(0)}/π + p_*^{(1)} q_*^{(1)}/(1 − π) ) / ( p_*^{(1)} − p_*^{(0)} )^2,    (2)

where p_*^{(1)} is obtained from p_*^{(0)} and OR_*. If information on the joint distribution of the components is available, the distribution of the composite endpoint can be derived [13]. Specifically, the event probability of the composite endpoint in the i-th group, p_*^{(i)}, is determined by the probabilities of the components, p_1^{(i)} and p_2^{(i)}, and Pearson's correlation coefficient ρ between the components, as follows:

p_*^{(i)} = p_1^{(i)} + p_2^{(i)} − p_1^{(i)} p_2^{(i)} − ρ √( p_1^{(i)} q_1^{(i)} p_2^{(i)} q_2^{(i)} ).    (3)

The odds ratio for the composite endpoint, OR_* = OR_*(p_1^{(0)}, p_2^{(0)}, OR_1, OR_2, ρ), can be expressed as a function of the odds ratios OR_1, OR_2, the event probabilities in the control group, p_1^{(0)}, p_2^{(0)}, and the correlation ρ (see the supplementary material). Note, however, that in both cases, to compute p_*^{(i)} and OR_*(p_1^{(0)}, p_2^{(0)}, OR_1, OR_2, ρ), we make the underlying assumption that the correlation between the components is the same in the treatment and control groups. Although we focus on the correlation in this work, other association measures can be used instead. In the supplementary material, we present different association measures, such as the relative overlap and the conditional probability, and establish their relationship with the correlation, so that one can move from one to the other depending on which is easier to anticipate. Further details regarding the assumption of equal correlations across arms can be found in the supplementary material.
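These relationships can be sketched in code. The following Python snippet is a sketch only (the paper's own implementation is the R package eselect, and all function names here are ours): it computes the treatment-arm probability implied by an odds ratio, the composite event probability of expression (3), and the resulting composite odds ratio under the equal-correlation assumption.

```python
import math


def prob_from_or(p_ctrl, odds_ratio):
    """Treatment-group probability implied by a control-group probability
    and an odds ratio: odds_trt = OR * odds_ctrl."""
    odds = odds_ratio * p_ctrl / (1 - p_ctrl)
    return odds / (1 + odds)


def composite_prob(p1, p2, rho):
    """Event probability of the composite endpoint eps_* = eps_1 U eps_2,
    as in expression (3)."""
    return p1 + p2 - p1 * p2 - rho * math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))


def composite_or(p1_ctrl, p2_ctrl, or1, or2, rho):
    """Odds ratio of the composite endpoint, derived from the components'
    control-group probabilities, odds ratios, and (shared) correlation."""
    p_ctrl = composite_prob(p1_ctrl, p2_ctrl, rho)
    p_trt = composite_prob(prob_from_or(p1_ctrl, or1),
                           prob_from_or(p2_ctrl, or2), rho)
    return (p_trt / (1 - p_trt)) / (p_ctrl / (1 - p_ctrl))
```

For instance, when OR_1 = OR_2 = 1, the composite odds ratio equals 1 for any valid correlation, and effect sizes below 1 on both components yield a composite odds ratio below 1.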
As a consequence, the required sample size N_*(p_*^{(0)}, OR_*) can be computed based on p_*^{(0)}, given in (3), and OR_*, given in equation (1) in the supplementary material. With a slight abuse of notation, we refer to the sample size computed by means of the components' parameters as N_*(p_1^{(0)}, p_2^{(0)}, OR_1, OR_2, ρ).

Trial design using the most relevant endpoint only
The null and alternative hypotheses related to the most relevant component, ε_1, are H_1: OR_1 = 1 and K_1: OR_1 < 1. Similar to the composite design, let T_{1,n} be the statistic to test H_1, defined as in (1) with the composite responses replaced by the responses on ε_1. As above, T_{1,n} is asymptotically N(0, 1) under H_1, and the null hypothesis is rejected if T_{1,n} < z_α. The sample size N_1(p_1^{(0)}, OR_1) required to achieve a power of 1 − β at a one-sided significance level of α is given by (2), replacing p_*^{(0)} and OR_* by p_1^{(0)} and OR_1, respectively.
3 Adaptive design with endpoint modification

Decision rule based on the ratio of sample sizes
We propose a trial design that allows adaptively modifying the primary endpoint based on blinded information obtained at an interim analysis or at the end of the trial. The decision rule selects as primary endpoint the endpoint with the lower estimated required sample size. Let d(·) denote the ratio of the required sample sizes of the two designs, given by

d(·) = N_1(·) / N_*(·),    (5)

where N_1(·) and N_*(·) are the sample sizes for the relevant and composite endpoints introduced in Subsections 2.1 and 2.2, respectively. Note that this ratio also depends on α and β. The decision rule to select the primary endpoint is then as follows: if d(·) < 1, use the most relevant endpoint as the primary endpoint; if d(·) ≥ 1, choose the composite endpoint.
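As an illustration, the decision rule can be coded as follows. This Python sketch (names and defaults are ours) uses a generic unpooled-variance normal-approximation sample size for a one-sided two-proportion test as a stand-in for N_1(·) and N_*(·) of equation (2).

```python
from statistics import NormalDist


def n_two_proportions(p_ctrl, p_trt, alpha=0.05, beta=0.20, pi=0.5):
    """Total sample size for a one-sided two-proportion z-test (normal
    approximation with unpooled variances); stands in for N_1 and N_*."""
    z = NormalDist().inv_cdf
    variance = p_ctrl * (1 - p_ctrl) / pi + p_trt * (1 - p_trt) / (1 - pi)
    return (z(1 - alpha) + z(1 - beta)) ** 2 * variance / (p_trt - p_ctrl) ** 2


def select_endpoint(n_relevant, n_composite):
    """Decision rule (5): d = N_1 / N_*; the relevant endpoint is chosen
    iff it requires the smaller sample size (d < 1)."""
    return "relevant" if n_relevant / n_composite < 1 else "composite"
```

The rule is deliberately deterministic: ties (d = 1) default to the composite endpoint, matching the convention d(·) ≥ 1 above.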

Estimation of the sample size ratio based on blinded data
In order to estimate the sample size ratio between the designs with the most relevant endpoint and with the composite endpoint, we use blinded data obtained either at the interim analysis or at the end of the trial. Specifically, we derive estimates of the event probabilities of the components in the control group and of their correlation. In addition to the blinded (interim) data, the estimates rely on the a priori assumptions on the effect sizes.
Suppose that the blinded analysis, using the pooled sample, is based on a sample of size ñ, where ñ could be the total sample size initially planned (ñ = n) or a proportion of it used at an interim stage (ñ = ω·n, with 0 < ω < 1). Also, suppose that the proportion of patients assigned to the control group in this sample is the same as the one expected at the end of the trial, that is, π = n^{(0)}/n = ñ^{(0)}/ñ, where ñ^{(0)} is the sample size of the control group in the blinded data. Based on the observed responses in the pooled sample, we estimate the pooled probabilities p̄_1, p̄_2, and p̄_*, where p̄_k = π p_k^{(0)} + (1 − π) p_k^{(1)} for k = 1, 2, *. Assuming that the expected effects for the components (OR_1 and OR_2) have been pre-specified in advance, we obtain estimates of the probabilities of each composite component under the control group, p̂_1^{(0)} and p̂_2^{(0)}, and subsequently the estimates of the probabilities under the treatment group, p̂_1^{(1)} and p̂_2^{(1)}. Taking into account expression (3) and using the estimated probabilities of each composite component in each group together with the estimated pooled event probability of the composite endpoint, p̂_*, the correlation is estimated by

ρ̂ = ( Σ_{i=0,1} ñ^{(i)} ( p̂_1^{(i)} + p̂_2^{(i)} − p̂_1^{(i)} p̂_2^{(i)} ) − ñ p̂_* ) / ( Σ_{i=0,1} ñ^{(i)} √( p̂_1^{(i)} q̂_1^{(i)} p̂_2^{(i)} q̂_2^{(i)} ) ),

where q̂_k^{(i)} = 1 − p̂_k^{(i)} and ñ^{(i)} is the sample size in group i in the blinded data. Based on these estimates, we then compute the sample size ratio d(p̂_1^{(0)}, p̂_2^{(0)}, OR_1, OR_2, ρ̂) to select the endpoint. The diagram in Figure 1 illustrates the adaptive design when the composite endpoint is initially chosen as the primary endpoint. Note that, in order to calculate the initial sample size for the composite endpoint, assumptions regarding the parameter values determining the sample size have to be made.

Sample size reassessment
After the endpoint has been selected based on the estimates p̂_1^{(0)}, p̂_2^{(0)}, ρ̂ evaluated from the blinded data, the sample size can additionally be recalculated. When the composite endpoint is selected, the target sample size, computed from the above estimates and based on the pre-specified effect sizes OR_1, OR_2, is given by N_*(p̂_1^{(0)}, p̂_2^{(0)}, OR_1, OR_2, ρ̂). Because the overall sample size cannot be smaller than the number of already recruited patients, the sample size reassessment rule is given by n_a = max{ñ, N_*(p̂_1^{(0)}, p̂_2^{(0)}, OR_1, OR_2, ρ̂)}, where ñ denotes the number of patients recruited so far.
If, in contrast, the most relevant component is chosen as primary endpoint, the sample size can be reassessed to target a power of 1 − β for this endpoint. The sample size calculation is based on the pre-specified effect size OR_1 and the estimated event probability p̂_1^{(0)}. Thus, in this case the sample size reassessment rule is given by n_a = max{ñ, N_1(p̂_1^{(0)}, OR_1)}. If the selection is made at the interim analysis, ñ < n, and the recalculation could therefore result in a reduction of the initially planned sample size. In contrast, if the selection is made at the planned end of the trial, ñ = n, and the sample size can either remain unchanged or be increased if required.
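Both branches of the reassessment rule can be combined into a single adaptation step, sketched below (Python, hypothetical names; the inputs are the recruited count ñ and the two target sample sizes computed from the blinded estimates):

```python
def reassess(n_recruited, n_relevant_target, n_composite_target):
    """One adaptation step: pick the endpoint with the smaller estimated
    required sample size (d < 1 favours the relevant endpoint), then apply
    n_a = max(n_recruited, target) so the trial never shrinks below the
    number of patients already recruited."""
    if n_relevant_target / n_composite_target < 1:
        return "relevant", max(n_recruited, n_relevant_target)
    return "composite", max(n_recruited, n_composite_target)
```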

Considerations for choosing the timing of the interim analysis
As usual in adaptive trials, the timing of the interim analysis has to be fixed independently of the observed data and described in the trial protocol. For the proposed design, a reasonable strategy is to take as initial sample size the minimum of the sample size for the relevant endpoint and that for the composite endpoint assuming a correlation of 0, that is, ñ = min{N_1(p_1^{(0)}, OR_1), N_*(p_1^{(0)}, p_2^{(0)}, OR_1, OR_2, 0)}. For a correlation of zero, the required sample size for the composite endpoint is the smallest (assuming that only non-negatively correlated components are possible) [13]. The design would then be fixed as follows. First, conduct the selection of the endpoint based on blinded data after ñ subjects. Then, reassess the sample size according to the rule defined in Section 3.3. If the reassessed sample size is smaller than ñ, stop the trial and conduct the final (unblinded) analysis of the data. Otherwise, expand the trial with further subjects as needed and conduct the final (unblinded) analysis of the selected endpoint once n_a subjects have been observed. The maximum sample size is bounded by the larger of the sample sizes calculated for the relevant endpoint and for the composite endpoint assuming the largest plausible correlation.
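The timing strategy reduces to two planning bounds, sketched here (Python; the inputs are sample sizes computed under the stated correlation assumptions, e.g. with the formulas of Section 2):

```python
def planning_bounds(n_relevant, n_composite_rho0, n_composite_rhomax):
    """Interim timing per this subsection: recruit up to the smaller of the
    two planned sizes (the composite sized under rho = 0, its most
    favourable case), while the overall trial is bounded above by the larger
    size under the largest plausible correlation."""
    n_interim = min(n_relevant, n_composite_rho0)
    n_max = max(n_relevant, n_composite_rhomax)
    return n_interim, n_max
```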
4 Extension to more than two components and more than two arms

In this section, we address the recursive selection of the primary endpoint for more than two components and discuss the extension to more than two arms.

Composite endpoints with more than two components
Consider now a trial with K potential endpoints of interest. We assume that they differ in importance and can be ordered accordingly. Let ε_1, ..., ε_K denote the endpoints ordered by decreasing importance. Let p_k^{(0)} and OR_k denote the event probability in the control group and the effect size for endpoint ε_k (k = 1, ..., K). In the planning phase of the RCT, assumptions on the event probabilities, effect sizes, and correlation values are made to obtain an initial sample size.
The procedure to select the primary endpoint and recalculate the sample size accordingly for K components is based on the following algorithm:
Step 1: Compare the required sample sizes for the endpoint ε_1 and for the composite of the first and second endpoints, ε_{*,2} = ε_1 ∪ ε_2, by computing the sample size ratio based on the estimated probabilities and assumed effect sizes, d(p̂_1^{(0)}, p̂_2^{(0)}, OR_1, OR_2, ρ̂). If d(·) ≥ 1, compute the parameters of the composite endpoint ε_{*,2} and go to Step 2; otherwise, select ε_1 and go to Step K.
Steps i = 2, ..., K − 1: Compare the efficiency of using ε_{*,i} over ε_{*,i+1} = ε_{*,i} ∪ ε_{i+1}. If d(·) ≥ 1, compute the parameters of the composite endpoint ε_{*,i+1} and go to Step i + 1. Otherwise, select ε_{*,i} and go to Step K.
Step K: Reassess the sample size based on the selected endpoint.
Using this recursive method, we only need the anticipated values of the event probabilities in the control group and the effect sizes of the components ε_1, ..., ε_K. If a composite endpoint is selected in step i, this endpoint is treated as a component of the composite considered in the next step. For this reason, its parameters are recalculated and used as the anticipated component values in the next iteration.
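The recursion can be sketched compactly in Python (our naming; the same stand-in two-proportion sample-size formula as before, and rhos[i] denotes the assumed correlation between the current composite and the next component, a simplification introduced for this sketch):

```python
import math
from statistics import NormalDist


def _trt(p, orr):
    """Treatment-arm probability implied by a control probability and an OR."""
    odds = orr * p / (1 - p)
    return odds / (1 + odds)


def _n(p0, p1, alpha=0.05, beta=0.20, pi=0.5):
    """Stand-in sample size: one-sided two-proportion test, unpooled variances."""
    z = NormalDist().inv_cdf
    variance = p0 * (1 - p0) / pi + p1 * (1 - p1) / (1 - pi)
    return (z(1 - alpha) + z(1 - beta)) ** 2 * variance / (p1 - p0) ** 2


def _union(p_a0, or_a, p_b0, or_b, rho):
    """Control probability and odds ratio of the union of two binary endpoints."""
    p0 = p_a0 + p_b0 - p_a0 * p_b0 - rho * math.sqrt(
        p_a0 * (1 - p_a0) * p_b0 * (1 - p_b0))
    pa1, pb1 = _trt(p_a0, or_a), _trt(p_b0, or_b)
    p1 = pa1 + pb1 - pa1 * pb1 - rho * math.sqrt(
        pa1 * (1 - pa1) * pb1 * (1 - pb1))
    return p0, (p1 / (1 - p1)) / (p0 / (1 - p0))


def recursive_selection(components, rhos):
    """Step-wise selection over components ordered by decreasing importance.
    components: list of (control probability, odds ratio) per endpoint."""
    cur_p, cur_or = components[0]
    for i, (p_next, or_next) in enumerate(components[1:]):
        cand_p, cand_or = _union(cur_p, cur_or, p_next, or_next, rhos[i])
        d = _n(cur_p, _trt(cur_p, cur_or)) / _n(cand_p, _trt(cand_p, cand_or))
        if d < 1:                        # current endpoint needs fewer patients
            break
        cur_p, cur_or = cand_p, cand_or  # adopt the larger composite, continue
    return cur_p, cur_or
```

Adding a second component with a comparable effect typically favours the composite (higher event rate), while adding a rare component with no effect dilutes the odds ratio and stops the recursion.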

Trials with more than two arms
Consider a multi-armed RCT comparing the efficacy of M treatments to a shared control using the binary composite endpoint ε_* = ε_1 ∪ ε_2. We test the M individual null hypotheses H_*^{(m)}: OR_*^{(m)} = 1 against K_*^{(m)}: OR_*^{(m)} < 1 (m = 1, ..., M). Denoting the test statistic (1) for comparing treatment m against control by T_{*,n}^{(m)}, as before we have that asymptotically T_{*,n}^{(m)} ∼ N(0, 1). We reject the null hypothesis if T_{*,n}^{(m)} < z_{α/M}, adjusting the threshold with a Bonferroni correction to account for the multiplicity of treatment arms. To size the trial, suppose that the expected effect sizes for the components are the same in all treatment arms, that is, OR_k = OR_k^{(m)} for all m (k = 1, 2). Additionally, as before, assume that the correlation between the components is equal across arms, ρ = ρ^{(m)} for all m. Note that this implies OR_* = OR_*^{(m)} for all m. For each individual comparison, the sample size is N_*(p_*^{(0)}, OR_*) as described in Section 2, and since the trial uses a shared control, the total sample size for the trial is N_{*,M}(p_*^{(0)}, OR_*) = N_*(p_*^{(0)}, OR_*) (π + M(1 − π)), where π is the allocation proportion to the control group. The sample size for the multi-armed RCT can then be determined by means of the same set of parameters (p_1^{(0)}, p_2^{(0)}, OR_1, OR_2, ρ). For the most relevant endpoint, the null and alternative hypotheses for treatment m are H_1^{(m)}: OR_1^{(m)} = 1 and K_1^{(m)}: OR_1^{(m)} < 1, and we reject H_1^{(m)} if T_{1,n}^{(m)} < z_{α/M}. Assuming the effect sizes to be equal across arms, OR_1 = OR_1^{(m)}, the total sample size for the trial would be N_{1,M}(p_1^{(0)}, OR_1) = N_1(p_1^{(0)}, OR_1) (π + M(1 − π)). The sample size ratio d(·) then reduces to the same ratio as in (5), and the adaptive design proposed in Section 3 can be applied analogously to the two-arm case. Hence, if d(·) < 1, the design testing efficacy with the most relevant endpoint is chosen; otherwise the composite endpoint is used, and in either case the sample size is recalculated using the event probability and correlation estimates. As the same effects are assumed for all arms, the same procedure can also be used to estimate the probabilities under the treatment groups and the correlation. This assumption allows the estimates to be blinded and permits the selection of the primary endpoint to be the same for all arms. However, relaxing these assumptions could lead to different selection strategies, e.g., maximizing the minimum power across arms, or to partial unblinding (pooling treatment data in a blinded way when arms do not finish at the same time, as in multi-arm platform trials).
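Under the shared-control accounting, the total sample size can be sketched as below (Python; we assume the pairwise size n_pairwise is already computed at the Bonferroni-adjusted level α/M, that the within-comparison control allocation π is kept for every comparison, and that the shared control group is counted only once):

```python
def multiarm_total(n_pairwise, n_arms, pi=0.5):
    """Total sample size for M treatment arms sharing a single control group.
    Each pairwise comparison has pi * n_pairwise controls and
    (1 - pi) * n_pairwise treated patients; the control group is shared,
    so the total is n_pairwise * (pi + M * (1 - pi))."""
    return n_pairwise * (pi + n_arms * (1 - pi))
```

With π = 1/2 and M = 3, the trial needs twice the pairwise sample size rather than three times, which is the saving obtained from sharing the control group.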

5 Motivating example in Peritoneal Dialysis Trials
Consider a trial in peritoneal dialysis with the primary endpoint major adverse peritoneal events (MAPE), defined as the composite of peritonitis and peritoneal membrane deterioration (ε_1) and technical failure (ε_2). It has to be noted that MAPE originally consists of three components, but we grouped peritonitis and peritoneal membrane deterioration together for the sake of illustration. Also, suppose that the endpoint of peritonitis and peritoneal membrane deterioration can be considered the most relevant endpoint, one that could serve as sole primary endpoint. Table 1 summarizes the considered endpoints.
The authors of [6] reported event probabilities for the individual endpoints and combinations thereof. We use these as estimates of the event probabilities in the control group at the design stage of the trial (see Table 1). We discuss the efficiency of using MAPE (ε_* = ε_1 ∪ ε_2) over the endpoint of peritonitis and peritoneal membrane deterioration (ε_1) alone, and illustrate the design with adaptive selection of the primary endpoint at the interim analysis and sample size reassessment.
In Figure 2 (a), we depict the sample size required for MAPE as a function of the correlation between ε_1 and ε_2, together with the sample size when using only ε_1, both based on the parameters assumed at the design stage (Table 1). We observe that the sample size increases with the correlation. In Figure 2 (b), we show the power of the trial when using a fixed design with the endpoint MAPE, ε_*, as primary endpoint, assuming that the correlation equals 0; a fixed design with the most relevant endpoint ε_1; and the proposed adaptive design. The adaptive design maintains the power of the trial at 0.80 and is superior to the power obtained when using the fixed design. The decision rule of the adaptive design selects the endpoint that requires the smaller estimated sample size. Furthermore, if this sample size does not yield the desired power, it is readjusted based on information from the interim analysis. Thus, when the estimated correlation is lower than 0.2, the adaptive design typically selects the composite endpoint as primary endpoint and recomputes the sample size using the estimated correlation. When the estimated correlation is greater than or equal to 0.2, the most relevant endpoint is selected and the sample size is reassessed accordingly.
6 Simulation study

Design and main assumptions
We simulate the statistical power and significance level under different scenarios, considering two-arm RCTs with two binary endpoints and parameters as given in Table 2. The correlation between the endpoints is assumed to be equal in both groups. Since the range of possible correlations depends on the marginal event probabilities, scenarios in which the correlation is not within the valid range are discarded.
We compare the actual type 1 error rate and power of the proposed adaptive design with fixed designs using the relevant or the composite endpoint as primary endpoint. Specifically, we consider the following designs:
• Adaptive design: trial design whose primary endpoint is adaptively selected between the composite and the most relevant endpoint based on blinded data;
• Composite endpoint design: trial design without adaptive modification of the primary endpoint. The primary endpoint is the composite of ε 1 and ε 2 ;
• Relevant endpoint design: trial design without adaptive modification of the primary endpoint. The primary endpoint is the most relevant endpoint (ε 1 ).
We differentiate between two types of designs: those with selection of the components of the composite endpoint at the end of the study and those with selection at an interim analysis. In the first, the selection is based on blinded data at the pre-planned end of the trial, using the total sample size planned at the design stage. In the second, we select the primary endpoint based on blinded information obtained at an interim analysis after 50% of the observations are available. For the latter, we consider designs with and without sample size recalculation after the interim analysis.
In trials with endpoint selection at the end of the study or at interim but without sample size recalculation, the planned sample size n is calculated to have 0.80 power to detect an effect of OR_1 on the most relevant endpoint at significance level α = 0.05. We use this sample size for all three designs being compared. The composite endpoint in this case is therefore intended to be used only if it increases the power of the study. On the other hand, the (initial) sample size n for trials with sample size reassessment is calculated to have 0.80 power to detect an effect of OR_* on the composite endpoint at significance level α = 0.05, where the p_*^{(0)} and OR_* used for the sample size calculation are computed from the components' parameters (p_1^{(0)}, p_2^{(0)}, OR_1, OR_2) assuming a correlation of 0. In this case, the adaptive design therefore serves to correct the values anticipated at the design stage for the composite endpoint if the components are correlated, to compare the efficiency of the design against that of its most relevant component, and thus to change the primary endpoint if the composite endpoint is less efficient. We summarize in Table 3 the trial designs considered for the simulation study.
For each combination (p_1^{(0)}, p_2^{(0)}, OR_1, OR_2, ρ), we simulated 100,000 trials of size n according to each design (adaptive design, composite endpoint design, and relevant endpoint design). To evaluate the power, we considered the alternative hypothesis under which OR_1, OR_2 < 1 (and therefore OR_* < 1). We simulated based on the values assumed in the design for OR_1, OR_2 and the resulting OR_* computed from the parameters (p_1^{(0)}, p_2^{(0)}, OR_1, OR_2, ρ). To evaluate the type 1 error rate, the same set of scenarios was considered as for the power in terms of the values used for the sample size calculation, but we simulated under the global null hypothesis H_0, so OR_1 = OR_2 = 1 (and therefore OR_* = 1). The total number of scenarios is 1166.
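A minimal Monte Carlo sketch of one cell of such a simulation is given below (Python; it evaluates only the composite-endpoint test with a pooled-variance z-statistic, uses far fewer replications than the 100,000 in the study, and all parameter values and names are illustrative):

```python
import math
import random
from statistics import NormalDist


def simulate_rejection_rate(p1c, p2c, or1, or2, rho, n, alpha=0.05,
                            reps=2000, seed=12345):
    """Monte Carlo rejection rate of the one-sided composite-endpoint test.
    A subject's composite event occurs unless neither component occurs; the
    no-event cell probability follows from the marginals and the correlation.
    With or1 = or2 = 1 this estimates the type 1 error, otherwise the power."""
    rng = random.Random(seed)

    def trt(p, orr):  # treatment-arm probability implied by an odds ratio
        odds = orr * p / (1 - p)
        return odds / (1 + odds)

    def n_events(p1, p2, m):  # composite event count in one arm of size m
        p11 = p1 * p2 + rho * math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))
        p_star = p1 + p2 - p11  # P(at least one component event)
        return sum(rng.random() < p_star for _ in range(m))

    z_alpha = NormalDist().inv_cdf(alpha)  # lower-tail critical value
    n0 = n1 = n // 2
    rejections = 0
    for _ in range(reps):
        x0 = n_events(p1c, p2c, n0)
        x1 = n_events(trt(p1c, or1), trt(p2c, or2), n1)
        pooled = (x0 + x1) / (n0 + n1)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n0 + 1 / n1))
        if se > 0 and (x1 / n1 - x0 / n0) / se < z_alpha:
            rejections += 1
    return rejections / reps
```

Under the global null the rejection rate should fluctuate around α; under a clear alternative it approximates the power of the composite design for the chosen correlation.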

Selection at the end of the trial
As expected, for the scenarios under the alternative hypotheses, the powers when using the most relevant endpoint have mean 0.80, as the sample sizes were calculated for this endpoint. The powers when using the composite endpoint range from 0.60 to 1.00 with mean 0.85. With the adaptive design, the powers take values between 0.80 and 1.00, with mean 0.88. Results are summarized in Figure 4.
To illustrate the properties of the adaptive design, consider a specific scenario (see Figure 3). For a given combination of (p_1^{(0)}, p_2^{(0)}, OR_1, OR_2), we plot the empirical power of each design (adaptive design, composite endpoint design, and relevant endpoint design) for different correlations ρ. The colors in the power plots indicate which endpoint is optimal for the given parameters. We observe that when the power of the composite endpoint design is greater than 0.80 regardless of the correlation value, the adaptive design decides to use the composite endpoint. Likewise, if the composite design's power is less than 0.80, the relevant endpoint design is chosen. Also note that the decision rule, i.e., the ratio of sample sizes in (5), decreases with respect to the correlation. This is because the sample size for composite endpoints increases as the components become more correlated. Indeed, for a given set of marginal parameters, the composite design is more efficient the lower the correlation. Therefore, when using the adaptive design, the decision rule chooses the composite endpoint when the estimated correlation between the components is small, and chooses the most relevant endpoint when the estimated power using the composite falls below 0.80. Thus, the power of the adaptive design is always greater than 0.80. In the supplementary material, we plot the empirical power of each design as a function of the correlation ρ for all scenarios considered in the simulation. For the scenarios simulated under the global null hypothesis (i.e., OR_* = OR_1 = OR_2 = 1), all designs control the type 1 error rate at the nominal level α = 0.05.

With sample size reassessment
The initial sample size in these settings was computed to detect an effect on the composite endpoint, assuming uncorrelated components (ρ = 0). For the relevant endpoint design, the powers in this case range from 0.33 to 0.85 with mean 0.64; when using the composite endpoint, they range from 0.60 to 0.80 with mean 0.72. For the adaptive design, in contrast, the powers have mean 0.80 (see Figure 4). The proposed adaptive design, therefore, ensures that the target power is achieved, either by keeping the composite endpoint as primary but correcting the correlation value assumed in the design and recalculating the sample size accordingly at the interim analysis, or by switching to the most relevant endpoint and adjusting the corresponding sample size. To illustrate the properties of the adaptive design we again focus on a selected scenario (see Figure 3); for the other cases considered, see the supplementary material. We observe that when using the adaptive design, the power is always maintained at 0.80, while for the composite endpoint design it depends on the true value of the correlation and the extent to which it deviates from the correlation assumed at the design stage (in our case, ρ = 0). The type 1 error rate is also maintained at 0.05.

Without sample size reassessment
When using the adaptive design with endpoint selection at an interim analysis without sample size reassessment, the observed results are slightly worse than those obtained when selecting the endpoint at the end of the study, as the estimates have a higher variability. The type 1 error rate under the null scenarios investigated is again well controlled (data not shown).

Comparison between blinded and unblinded estimators
In this work, we proposed an adaptive modification of the primary endpoint and sample size reassessment based on parameter estimates obtained from the blinded (interim) data. Alternatively, the event probabilities in the control group and the correlation between endpoints can be estimated using the unblinded data (but still using the a priori estimates of the effect sizes). To assess the properties of this alternative approach, we simulated adaptive trials for the above scenarios with selection at the interim analysis or at the end of the trial, and without sample size reassessment. The power of the adaptive design using unblinded data is equal to or slightly higher than when using blinded data (see the supplementary material). However, when evaluating the type 1 error, we observe an inflation when unblinded information is used together with a conventional frequentist test as defined in Section 1. If the selection should be done on unblinded data in an interim analysis, more complex adaptive closed testing strategies [17] have to be used, and the data cannot naively be pooled over stages.

Properties of the design if there is no treatment effect in some of the components
We additionally assessed the power of the designs in scenarios where i) there is no effect in the most relevant endpoint; and ii) there is no effect in the additional endpoint. In these settings the adaptive design is not the most powerful design: its power lies between the powers obtained using only the relevant endpoint and only the composite endpoint (see the supplementary material).

Discussion
In this paper, we proposed an adaptive design that allows the modification of the primary endpoint based on blinded interim data and recalculates the sample size accordingly. The design selects either a composite endpoint or its most relevant component as the primary endpoint, based on the ratio of the sample sizes needed in the corresponding designs to achieve a certain power. This ratio depends on the event probabilities in the control group, the effect sizes for each composite component, and the correlation between them. We presented estimators for the event probabilities and correlation based on blinded data obtained at an interim or the pre-planned final analysis and proposed to use them to compute the sample size ratio. The advantage of using blinded data is that the type 1 error rate is not inflated when performing the conventional frequentist tests for the selected primary endpoint at the end of the trial. In all null scenarios investigated, no substantial inflation of the type 1 error could be observed. This was expected, as both the selection and the sample size reassessment were based on blinded data [18,19] and not on the observed treatment effect directly. The results obtained from the proposed adaptive design are, therefore, in line with the requirements of regulatory agencies for adaptive designs with endpoint selection [7], since the adaptation rules for blinded endpoint selection are predefined in the design and the methods considered maintain type 1 error control.
If the selection is done at the end, we showed that the proposed design is more powerful than the fixed designs using the composite endpoint or its more relevant component as the primary endpoint in all scenarios considered in the simulation study. The simulations have shown that, as long as the marginal effect sizes have been correctly specified, the power never falls below the nominal power. In addition, we proposed a re-estimation of the sample size that adjusts the sample size at the interim stage to incorporate the estimated correlation and the estimated event probabilities in the control group, based on the assumed effect sizes. Since the correlation between the components is rarely known and therefore not usually taken into account when sizing a trial with composite endpoints, we want to emphasize that this sample size calculation could be useful even without adaptive modification of the primary endpoint. Since in trials with composite endpoints the required sample size increases as the correlation increases, we proposed to start the trial assuming a correlation of zero and to recalculate the sample size accordingly based on the blinded data. If sample size reassessment is not considered, the best results are achieved when the selection of the primary endpoint is made at the end of the study, due to the smaller variability of the blinded estimates. However, for consistency checks and to convince external parties such as regulators, it might be reassuring to have a second independent sample that has not been used before to determine the endpoint.
We focused on the estimation of the correlation based on blinded data but also considered estimators based on unblinded data (see the supplementary material). We compared the operating characteristics of trial designs using blinded and unblinded correlation estimators. Power is slightly higher when using the unblinded estimator; however, it may lead to a substantial type 1 error inflation. Throughout this work, in both the blinded and unblinded cases, we assumed that the correlations are equal across treatment groups. This assumption, although common, may in some cases not be satisfied. We discuss the implications of this assumption for the design and interpretation, as well as an approach to tailor the proposed design to cases where the correlations are not equal, in the supplementary material. To allow for unequal correlations and blinded selection, one has to fix the effect size not only for the components, but also for the composite endpoint. There is a trade-off between having fewer assumptions and having more fixed design parameters. However, further empirical investigations are needed to evaluate how plausible it is that the assumption of equal correlations across arms will not be met, and the impact of different correlations on interpreting the effect of the composite endpoint.
In this paper, we consider trials with large sample sizes, so the derivations of the sample size calculations are based on asymptotic results. For trials with small sample sizes, it should be noted that smaller samples would result in lower precision of the event probability estimates, which could affect the endpoint decision and the sample size recalculation. Finally, we extended the proposed design to trials with more than two groups and more than two components. Further extensions can be considered by allowing greater flexibility in the selection of the primary endpoint (e.g., choosing different primary endpoints according to treatment arm) and by considering platform designs where treatment arms enter and leave at different times during the trial (and therefore interim analyses also occur at different times). Extensions to complex designs such as those mentioned above and to designs with time-to-event endpoints are open to future research.

Supplementary Material
Supplementary material includes further derivations, discussion of extensions for unequal correlations across arms, the introduction of other association measures, an overview of the R package, an additional example based on a conducted randomized trial in cardiology including the R code, and other results from the simulation study. The R code to reproduce the results of this article is available at https://github.com/MartaBofillRoig/eselect.

Table 2: Settings for the simulation: (p_2, OR_1, OR_2) denote the parameters for the endpoints ε_1 and ε_2, ρ is the correlation between the endpoints, ω is the percentage of the initial sample size used for the estimation and decision rule computation, and α and 1 − β refer to the significance level and power.
1 Odds ratio for the composite endpoint

Let X_ij1 and X_ij2 denote the responses of two binary endpoints for the j-th patient in the i-th treatment group (i = 0, 1; j = 1, ..., n^(i)). Denote by X_ij* the composite response, defined as X_ij* = max(X_ij1, X_ij2), that is, X_ij* = 1 if at least one of the two events occurs. Denote by p_1^(i) and p_2^(i) the probabilities of observing each endpoint in the i-th group, let OR_1 and OR_2 be the odds ratios for the two endpoints, that is, OR_k = [p_k^(1)/(1 − p_k^(1))] / [p_k^(0)/(1 − p_k^(0))] (k = 1, 2), and let ρ^(i) be the correlation between X_ij1 and X_ij2 in group i (i = 0, 1). The probability of the composite endpoint is given by

p_*^(i) = 1 − (1 − p_1^(i))(1 − p_2^(i)) − ρ^(i) · sqrt(p_1^(i)(1 − p_1^(i)) p_2^(i)(1 − p_2^(i)))    (1)

and the odds ratio for the composite endpoint, OR_*, can be expressed in terms of the odds ratios OR_1 and OR_2, the probabilities under the control group, p_1^(0) and p_2^(0), and the correlation ρ as

OR_* = [p_*^(1)/(1 − p_*^(1))] / [p_*^(0)/(1 − p_*^(0))],    (2)

where p_*^(1) is obtained from (1) after expressing the treatment-group marginals p_1^(1) and p_2^(1) in terms of p_1^(0), p_2^(0), OR_1 and OR_2.
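As a numerical companion to equations (1) and (2), the following sketch (in Python for illustration; the paper's computations use R) evaluates the composite event probability and the implied composite odds ratio from the marginal parameters, under the assumption of equal correlation in both arms; the parameter values are illustrative.

```python
import math

def p_star(p1, p2, rho):
    """Composite event probability, equation (1): P(e1 or e2)."""
    return 1 - (1 - p1) * (1 - p2) - rho * math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))

def shift_by_or(p0, OR):
    """Treatment-arm probability implied by a control probability and an odds ratio."""
    return OR * p0 / (1 - p0 + OR * p0)

def or_star(p1_0, p2_0, OR1, OR2, rho):
    """Composite odds ratio OR_*, equation (2), assuming equal rho in both arms."""
    ps0 = p_star(p1_0, p2_0, rho)
    ps1 = p_star(shift_by_or(p1_0, OR1), shift_by_or(p2_0, OR2), rho)
    return (ps1 / (1 - ps1)) / (ps0 / (1 - ps0))

# e.g. p1^(0) = 0.18, p2^(0) = 0.05, OR1 = 0.70, OR2 = 0.90 (illustrative values):
for rho in (0.0, 0.3):
    print(rho, round(or_star(0.18, 0.05, 0.70, 0.90, rho), 3))
```

Since the second component has the weaker effect, OR_* is attenuated toward 1 relative to OR_1, and it varies with the correlation, which is why the correlation enters the sample size calculation for the composite endpoint.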

Estimation of the correlation
Next, we describe two different approaches for estimating the correlation between the components of the composite endpoint (ρ), where the correlation is assumed to be equal in the treatment and control groups.
As in the main paper, suppose we have a sample of size ñ, where ñ could be the total sample size initially planned (ñ = n) or a proportion of it used at an interim stage (ñ = p_init · n, with 0 < p_init < 1). Also, suppose that the proportion of patients assigned to the control group in this sample is the same as that expected at the end of the trial, that is, π = n^(0)/n = ñ^(0)/ñ, where ñ^(0) is the control-group sample size in the blinded data.

Blinded approach:
Let p_k be the probability of observing the k-th endpoint in the pooled sample and p̂_k its estimate, that is, the observed proportion p̂_k = (1/ñ) Σ_{i=0,1} Σ_j X_ijk, for k = 1, 2, *.
Based on the observed responses in the pooled sample, we estimate the probabilities p_1, p_2, and p_*. Once the pooled estimates for each endpoint have been obtained, we calculate the estimated event probabilities per group. For this, we take into account that the pooled probability over both groups is the weighted mean of the event probabilities per group, p_k = π p_k^(0) + (1 − π) p_k^(1), for k = 1, 2. By using these equations, plugging in the estimates p̂_k, and assuming the expected effects for Endpoints 1 and 2 (say OR_1 and OR_2) pre-specified in advance, we obtain estimates of the probabilities of each composite component under the control group, p̂_1^(0), p̂_2^(0), and subsequently the estimates of the probabilities under the treatment group, p̂_1^(1), p̂_2^(1). By this step-wise estimation procedure, only blinded estimates and assumptions regarding the effect sizes are used to back-calculate the event probabilities per group.
Taking into account equation (2.3) in the paper and using the estimated probabilities for each composite component in each group (p̂_1^(i), p̂_2^(i)) and the estimated pooled probability of the composite endpoint (p̂_*), we get the following estimator of the correlation:

ρ̂ = [π(1 − (1 − p̂_1^(0))(1 − p̂_2^(0))) + (1 − π)(1 − (1 − p̂_1^(1))(1 − p̂_2^(1))) − p̂_*] / [π sqrt(p̂_1^(0)(1 − p̂_1^(0)) p̂_2^(0)(1 − p̂_2^(0))) + (1 − π) sqrt(p̂_1^(1)(1 − p̂_1^(1)) p̂_2^(1)(1 − p̂_2^(1)))],

obtained by pooling (1) across groups, p_* = π p_*^(0) + (1 − π) p_*^(1), and solving for ρ under the assumption of equal correlations across arms.
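The blinded back-calculation described above can be sketched as follows (in Python for illustration; the paper's implementation is the R function eselect). The bisection solver and parameter values are illustrative assumptions; the key steps, inverting the pooled mixture for the control-arm probabilities under the assumed odds ratios and then solving the pooled version of equation (1) for ρ, follow the procedure described in this section.

```python
import math

def or_to_p(p0, OR):
    """Treatment-arm probability implied by the control probability and the odds ratio."""
    return OR * p0 / (1 - p0 + OR * p0)

def solve_p0(p_pooled, OR, pi):
    """Back-calculate the control-arm probability from the pooled (blinded) estimate,
    using that pi*p0 + (1 - pi)*or_to_p(p0, OR) is increasing in p0 (bisection)."""
    lo, hi = 1e-9, 1 - 1e-9
    for _ in range(100):
        mid = (lo + hi) / 2
        if pi * mid + (1 - pi) * or_to_p(mid, OR) < p_pooled:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def rho_blinded(p1_pool, p2_pool, pstar_pool, OR1, OR2, pi=0.5):
    """Blinded correlation estimator: back-calculate per-arm event probabilities,
    then solve the pooled version of equation (1), which is linear in rho."""
    p10 = solve_p0(p1_pool, OR1, pi); p11 = or_to_p(p10, OR1)
    p20 = solve_p0(p2_pool, OR2, pi); p21 = or_to_p(p20, OR2)
    num = (pi * (1 - (1 - p10) * (1 - p20))
           + (1 - pi) * (1 - (1 - p11) * (1 - p21)) - pstar_pool)
    den = (pi * math.sqrt(p10 * (1 - p10) * p20 * (1 - p20))
           + (1 - pi) * math.sqrt(p11 * (1 - p11) * p21 * (1 - p21)))
    return num / den
```

For example, with pooled estimates generated from p_1^(0) = 0.18, p_2^(0) = 0.05, OR_1 = 0.70, OR_2 = 0.80 and ρ = 0.3, rho_blinded recovers ρ up to numerical error, illustrating that the estimator is consistent with equation (1) when the assumed effect sizes are correct.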

Unblinded approach:
Based on the observed responses of patients in the control group, we estimate the probabilities p_1^(0), p_2^(0), and p_*^(0); replacing the probabilities in equation (1) with their estimated values, we obtain an estimate of the correlation between Endpoints 1 and 2 in the control group, ρ̂^(0). We do the same for the treatment group to obtain an estimate of the correlation in the treatment group, ρ̂^(1).
3 On the assumption of equal correlations across arms

Interpretation of different correlations across arms
Under the null hypothesis of no treatment effect on any of the components of the composite, we assume that the distribution of the components is equal between arms. As the rationale for using composite endpoints is to evaluate whether the new treatment under study reduces the probability of patients suffering any of the events included in the composite, it is natural to assume that under the null the joint distribution between them is also the same in both arms; hence, the assumption of equal correlations holds naturally. Under the alternative hypothesis, it could be the case that one of the components has a greater effect than the other and that this also alters the correlation between the components.
We assessed how the distribution of the composite endpoint relies on the assumption of equal correlations. For that, we looked at how the composite endpoint's distribution behaves when we fix the correlation in the control arm, ρ^(0), and vary the correlation in the treatment arm, ρ^(1). As the probability of the composite endpoint decreases as the correlation increases, the effect in terms of the odds ratio (in (2)) increases with respect to ρ^(1), and hence the sample size for the composite endpoint decreases with ρ^(1). In the scenarios assessed (data not shown), both the effect and the sample size are very sensitive to changes in the correlation.
Given that the objective when considering a composite endpoint is to assess whether there is a reduction in patients having any of the events included in the composite, if the treatment effect is primarily driven by a change in the correlation between events rather than by a reduction in the events per se, the use of a composite endpoint should be reconsidered.

Adaptive design under different correlations across arms
We evaluated through simulations the robustness of the proposed design, in terms of statistical power, with respect to the assumption of equal correlations. We simulated two-arm trials as described in Section 7 of the manuscript. We considered two sets of parameter values for (p_2, OR_1, OR_2), assumed a correlation in the control arm ρ^(0) equal to 0.2, and varied the correlation in the treatment arm ρ^(1) between 0 and 0.5. Figure 1 shows the power of the adaptive design with respect to the correlation ρ^(1). In all the cases considered, the reassessed sample size took values between the sample size computed by correctly assuming different correlations and the sample size assuming equal correlations. Hence, in terms of the decision, if for all values of ρ^(1) the sample size for the composite endpoint is smaller than the sample size for the relevant endpoint, the decision is not affected (as in scenario 1). Otherwise, the sample size estimated at the interim stage may lead to wrong decisions, and the adaptive design may select an endpoint that does not lead to the most powerful trial (e.g., scenario 2).
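A small-scale version of such a robustness check can be sketched by simulating correlated binary pairs per arm (in Python for illustration; the paper's simulations are in R). The sketch uses the standard 2×2 cell construction, in which the joint probabilities are determined by the marginals and the correlation; all parameter values are illustrative, and ρ must respect the bounds implied by the marginals.

```python
import math
import random

def joint_cells(p1, p2, rho):
    """2x2 joint probabilities of (X1, X2) with marginals p1, p2 and correlation rho."""
    p_both = p1 * p2 + rho * math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))
    cells = {(1, 1): p_both, (1, 0): p1 - p_both,
             (0, 1): p2 - p_both, (0, 0): 1 - p1 - p2 + p_both}
    # rho must respect the Frechet bounds implied by the marginals
    assert all(q >= 0 for q in cells.values()), "rho incompatible with the marginals"
    return cells

def composite_rate(n, p1, p2, rho, rng):
    """Simulate n patients and return the observed rate of the composite event."""
    cells = joint_cells(p1, p2, rho)
    draws = rng.choices(list(cells), weights=list(cells.values()), k=n)
    return sum(1 for x1, x2 in draws if x1 or x2) / n

rng = random.Random(2023)
# same marginals in both arms, but different correlations (rho^(0) = 0.1, rho^(1) = 0.4):
rate_ctrl = composite_rate(100_000, 0.18, 0.05, 0.1, rng)
rate_trt = composite_rate(100_000, 0.18, 0.05, 0.4, rng)
print(rate_ctrl, rate_trt)  # the composite rates differ although the marginals do not
```

This makes the interpretation issue above concrete: with identical marginals, a higher correlation in one arm alone changes that arm's composite event rate.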

Tailored design for trials with different correlations across arms
In situations where it is anticipated that different correlations between arms may arise, one could apply an adaptive design similar to the one proposed, provided that the expected effect on the composite endpoint, OR_*, is also anticipated at the planning stage.
The tailored adaptive procedure for different correlations is then defined by the following algorithm:
(0) Set the initial values at the design stage: p_1^(0), p_2^(0), OR_1, OR_2, and OR_*.
(1) Blinded estimation of the event probabilities and correlations: based on the responses in the pooled sample, we estimate the probabilities p_1, p_2, p_*. Assuming the expected effects (OR_1, OR_2, OR_*) pre-specified in advance, we obtain estimates of the probabilities under the control group and, taking into account expression (1), the estimated correlations per arm, ρ̂^(0) and ρ̂^(1).
(2) Sample size for the composite: we reassess the sample size for the composite endpoint by means of the sample size formula (2.2) in the manuscript, using the estimated event probability p̂_*^(0) and the expected effect OR_*. Analogously, the sample size for the most relevant component is obtained by using p̂_1^(0) and OR_1.
(3) Compute the decision rule: based on the estimated sample sizes, we estimate the decision rule as the ratio of sample sizes, as in (3.5) in the manuscript.
(4) Decision and sample size reassessment: We select the primary endpoint based on the decision rule as explained in the manuscript and reassess the sample size accordingly.
Note that to allow for different correlations and blinded selection, one has to fix the effect size not only for the components, but also for the composite endpoint. Hence, there is a trade-off between having fewer assumptions and having more fixed design parameters.

Other association measures between binary endpoints
Pearson's correlation is the most common measure to quantify the degree of association between binary endpoints; there are, however, more intuitive alternative measures of the association between two binary outcomes. In this section, we present the relative overlap and the conditional probability as alternative ways to measure the association between two binary endpoints.
We relate both measures to the correlation and rewrite p_*^(i) in terms of each of them. If it is easier to anticipate the association between the components of the composite endpoint using these measures, one can use them, together with their relationship to the correlation, to anticipate the value of the correlation.

Relative overlap
The relative overlap in the i-th treatment group is defined as the conditional probability of observing the two marginal events given that at least one of them has occurred [1]. This measure is evaluated as the ratio between the probability of the intersection, p_∩^(i), and the probability of the composite endpoint, RO^(i) = p_∩^(i) / p_*^(i). It quantifies the ratio of the intersection versus the union of the two events, takes values between 0 and 1, and can be expressed by means of the event rates and the correlation.

Overview of the R package

The function eselect implements the blinded endpoint selection and sample size reassessment. The function call is

eselect(db, p0_e1, OR1, p0_e2, OR2, alpha, beta)

where db is a 2 × 2 table with the event counts from the pooled (blinded) sample used to estimate the event probabilities; p0_e1 and p0_e2 refer to the event probabilities of the components in the control group assumed at the design stage (that is, p_1^(0) and p_2^(0)); OR1 and OR2 are the odds ratios for the components (OR_1, OR_2); and alpha and beta are the type 1 and 2 errors used to calculate the sample size with respect to the one-sided tests (2.1) and (2.4) in the manuscript. The sample sizes and effect sizes for composite endpoints, necessary for the calculation of the decision rule, are computed using the R package CompAREdesign.
The function eselectsim simulates trials with adaptive endpoint selection and sample size reassessment for composite binary endpoints with two components. The function uses the algorithm implemented in eselect to select the primary endpoint and recalculate the sample size. The function call is

eselectsim(ss_arm, p0_e1, p0_e2, OR1, OR2, p0_ce, p_init = 1, H0_e1 = FALSE, H0_e2 = FALSE, SS_r = TRUE, alpha = 0.05, beta = 0.2)

where p0_e1, p0_e2, OR1, OR2, alpha and beta are the same arguments used in eselect; ss_arm is the sample size per arm (assuming equal randomization); p_init is the percentage of the sample size used for estimating the probabilities in the control group and the correlation and for selecting the design; H0_e1 and H0_e2 indicate whether the simulations are performed under the null hypothesis for the corresponding components of the composite endpoint; and SS_r indicates whether the sample size is reassessed after the endpoint selection.
Other functions for designs using unblinded data instead of blinded data, as explained in this supplementary material (see Section 2) and discussed in the simulation study in the manuscript, are available in the R package.

Example with R: Target-vessel revascularization in cardiology trials
In patients with coronary artery disease, the composite binary endpoint ε_* of ischemia-driven target-vessel revascularization (ε_1) and death from cardiac causes or myocardial infarction (ε_2) has been considered for evaluating the efficacy and safety of different stents in cardiology trials. As an example, TAXUS-IV was a randomized controlled clinical trial (RCT) investigating the safety and efficacy of a paclitaxel-eluting stent in a patient population with coronary artery disease [2]. The primary endpoint was ischemia-driven target-vessel revascularization, considered the most relevant of the composite components, while the composite endpoint was a secondary endpoint. Subsequently, the TAXUS-V trial was conducted to evaluate the efficacy of such stents in a patient population with more complex lesions than the one studied in the previous trial. In the TAXUS-V trial [3], the same endpoints were considered. TAXUS-V was also an RCT, with a total of n = 1145 patients allocated 1:1 to the two treatment groups. The final result was that the primary endpoint (ε_1) was statistically significant (p = 0.02), with observed rates of 0.173 and 0.121 in the two groups, respectively.

PART A: Re-analysis of TAXUS-V with adaptive endpoint selection based on blinded data with different odds ratios for the components
Here, we illustrate how the proposed adaptive design could have been used for adaptive selection of the primary endpoint before conducting the final analysis and unblinding the data. To do so, we use the R packages CompAREdesign and eselect, which can be installed from CRAN and GitHub:

library(CompAREdesign)
devtools::install_github("MartaBofillRoig/eselect")

For illustrative purposes, we consider the total number of patients included in the primary analysis, ñ = 1145, as the sample size at which the endpoint selection is performed. For the selection of the endpoint, assumptions have to be made on the expected event probabilities in the control group and the respective effect sizes in terms of odds ratios. The values were fixed to p0_e1 = 0.18 and p0_e2 = 0.050 for the event probabilities of ε_1 and ε_2, with expected effect sizes for the components of OR1 = 0.70 and OR2 = 0.90.
We used the blinded data collected at the end of the TAXUS-V study to select the primary endpoint according to the proposed adaptive design in Section 3 of the manuscript. In particular, we used the data in a blinded way to estimate the event probabilities in the pooled sample (see Section 2 of this supplementary material) and obtained estimates of the event probabilities in the control arm and of the correlation to compute the decision rule, as explained in Sections 3.1 and 3.2 of the manuscript.
The following 2 × 2 table summarizes the blinded data obtained at the end of the TAXUS-V trial regarding the components of the composite endpoint, ε_1 and ε_2:

                                          Patients with ε_1   Patients without ε_1
Patients with cardiac death or
myocardial infarction (ε_2)                       33                   31
Patients without ε_2                             135                  945

Table 1: Blinded data at the end of the trial for TAXUS-V (ε_1: ischemia-driven target-vessel revascularization).
Based on these data, we can use the function eselect to implement the adaptive selection as follows:

> eselect(db=data, p0_e1=0.18, OR1=0.70, p0_e2=0.05, OR2=0.9, alpha=0.05, beta=0.2)
$SampleSize
[1] 1582.689
$Decision
[1] 0

where data refers to the data in Table 1. The function returns the decision (Decision = 1, meaning the selected endpoint is the composite endpoint; and Decision = 0, meaning the selected endpoint is the relevant endpoint) and the sample size needed to test the primary hypothesis according to the decision to achieve 80% power at the α = 0.05 significance level. Hence, in this case the decision is to keep the relevant endpoint as the primary endpoint of the trial. If an adaptive sample size reassessment was part of the initial design, the sample size would have to be increased to achieve the targeted power. Otherwise, if only the adaptive selection of the endpoint was part of the design, the final analysis could be performed with the unblinded data. In the latter case the results would not have changed compared with TAXUS-V, as the same primary endpoint was selected.
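The eselect computation can be cross-checked with a short, self-contained sketch (in Python for illustration). It back-calculates the control-arm probabilities from the blinded table under the assumed odds ratios, estimates the correlation, and compares sample sizes for the two candidate endpoints. The unpooled normal-approximation sample size formula is an illustrative assumption, so the numbers need not match the eselect output exactly, though the decision should agree.

```python
import math
from statistics import NormalDist

# Blinded 2x2 table from Table 1 (both events, e2 only, e1 only, neither);
# the counts sum to 1144 of the 1145 randomized patients.
n11, n01, n10, n00 = 33, 31, 135, 945
n_tot = n11 + n01 + n10 + n00
p1_hat = (n11 + n10) / n_tot        # pooled rate of e1 (revascularization)
p2_hat = (n11 + n01) / n_tot        # pooled rate of e2 (cardiac death or MI)
ps_hat = (n11 + n01 + n10) / n_tot  # pooled rate of the composite

OR1, OR2, prop = 0.70, 0.90, 0.5    # design assumptions, as in Part A (1:1 allocation)

def shift(p0, OR):
    """Treatment-arm probability implied by a control probability and an odds ratio."""
    return OR * p0 / (1 - p0 + OR * p0)

def backcalc(p_pool, OR):
    """Invert prop*p0 + (1 - prop)*shift(p0, OR) = p_pool by bisection."""
    lo, hi = 1e-9, 1 - 1e-9
    for _ in range(100):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if prop * mid + (1 - prop) * shift(mid, OR) < p_pool else (lo, mid)
    return (lo + hi) / 2

def v(a, b):
    """sqrt(p1 q1 p2 q2): the coefficient of rho in equation (1)."""
    return math.sqrt(a * (1 - a) * b * (1 - b))

def p_star(a, b, r):
    """Composite event probability, equation (1)."""
    return 1 - (1 - a) * (1 - b) - r * v(a, b)

p10 = backcalc(p1_hat, OR1); p11 = shift(p10, OR1)
p20 = backcalc(p2_hat, OR2); p21 = shift(p20, OR2)
rho = (prop * (1 - (1 - p10) * (1 - p20)) + (1 - prop) * (1 - (1 - p11) * (1 - p21))
       - ps_hat) / (prop * v(p10, p20) + (1 - prop) * v(p11, p21))

def n_arm(pc, pt, alpha=0.05, beta=0.2):
    """Unpooled normal-approximation per-arm sample size, one-sided test."""
    z = NormalDist().inv_cdf
    return (z(1 - alpha) + z(1 - beta)) ** 2 * (pc * (1 - pc) + pt * (1 - pt)) / (pc - pt) ** 2

n_re = 2 * n_arm(p10, p11)
n_ce = 2 * n_arm(p_star(p10, p20, rho), p_star(p11, p21, rho))
print(round(rho, 2), round(n_re), round(n_ce), "RE" if n_re <= n_ce else "CE")
```

With these data the required sample size for the composite exceeds that for the relevant endpoint, reproducing the decision Decision = 0 (relevant endpoint) returned by eselect.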

Part B: Endpoint selection assuming different effect sizes and correlations using TAXUS-V
By means of the function eselectsim, we simulate trials to evaluate the adaptive selection under different scenarios. As before, we assume that the selection is made at the end of the trial with a total sample size of n = 1145, as observed in TAXUS-V. For the blinded endpoint selection, we consider the event probabilities of the components in the control arm assumed at the planning stage, as noted above, and the same expected effect size for ε_1, and we suppose different values for the event probability of the composite endpoint and the expected effect size of ε_2.
Using the obtained values for the event probability of the composite endpoint, we then use eselectsim to evaluate which endpoint is selected in the adaptive design. This function returns the decision (Decision = 1, meaning the selected endpoint is the composite endpoint; and Decision = 0, meaning the selected endpoint is the relevant endpoint) and the statistic to test the primary hypothesis according to the decision.
If the effect size of ε_2 is expected to be equal to OR_2 = 0.80, then we obtain

> eselectsim(ss_arm=n/2, p0_e1=0.18, OR1=OR1, p0_e2=0.05, OR2=0.80, p0_ce=p0_ce[1], p_init=1, SS_r=F, alpha=0.05, beta=0.2)

If we assume the event probability of the composite endpoint is the one obtained assuming correlations smaller than 0.3, then the decision is to use the composite endpoint as the primary endpoint, while if the correlation is assumed to be larger than 0.4, then the most relevant endpoint is selected. If, however, the expected effect size of ε_2 equals OR_2 = 0.90, then the most relevant endpoint is selected regardless of the value of the correlation between the components:

> eselectsim(ss_arm=n/2, p0_e1=0.18, OR1=OR1, p0_e2=0.05, OR2=0.90, p0_ce=p0_ce[1], p_init=1, SS_r=F, alpha=0.05, beta=0.2)

H_1^(m): OR_*^(m) < 1

for each arm m (m = 1, ..., M), where OR_*^(m) denotes the odds ratio for the composite endpoint in the m-th treatment arm; and, for the relevant endpoint,

H_1^(m): OR_1^(m) < 1.

Consider the test statistics T_1,n^(m) to compare treatment m against control, which are asymptotically N(0, 1) under H_1^(m).

Figure 1: Flow diagram of the adaptive design. The steps involved in the adaptive design are illustrated in grey boxes. The white boxes contain the necessary inputs; explanations and outputs are in dotted white boxes. The R functions to compute the corresponding steps are on the right side (see Sect. 5 in the supplementary material). Here p_1^(0), p_2^(0), OR_1, OR_2 denote the design parameters for the endpoints ε_1 and ε_2, and ρ is the correlation between ε_1 and ε_2 used for the calculation of the initial sample size, n; p̂_1^(0), p̂_2^(0) denote the estimated event probabilities in the control group for ε_1 and ε_2, and p̂_k is the estimated pooled event probability of ε_k (k = 1, 2, *), with ε_* = ε_1 ∪ ε_2, based on the blinded sample ñ; N_1 and N_* denote the sample sizes for endpoints ε_1 and ε_* (see Sect. 2.1 and 2.2), respectively; and d(·) is the decision function (see Sect. 3).

7.1 Results for designs using unblinded data

Non-effect on the second composite component

Figure 2: Power under composite design (CD), relevant design (RD) and adaptive design (AD). Case (a) refers to designs with selection at the end of the trial without sample size recalculation, for which the initial sample size was computed to have 0.80 power to detect effects on the relevant endpoint. Case (b) refers to designs with selection at the interim analysis and with sample size recalculation, for which the sample size was computed to have 0.80 power to detect effects on the composite endpoint assuming zero correlation between the components. Cases (c) and (d) refer to situations where one of the components had no effect and to designs with interim analysis, for which the initial sample size was computed to have 0.80 power to detect effects on the relevant endpoint.
Initial sample size when using the trial design with the most relevant endpoint of peritonitis and peritoneal membrane deterioration (RD), or with the composite endpoint Major Adverse Peritoneal Events (CD).

Figure 2: Sample size and power depending on the design and the correlation between the endpoint of peritonitis and peritoneal membrane deterioration (ε_1) and technical failure (ε_2).

Figure 3: Power under composite design (CD), relevant design (RD) and adaptive design (AD) with respect to the correlation between the components. In (a), trials are initially sized to detect an effect on the relevant endpoint and the ADs select the primary endpoint at the end of the trial; in (b), trials are sized to detect an effect on the composite endpoint and the ADs select the primary endpoint at the end of the trial and subsequently recalculate the sample size. The tables on the right side show the value of the decision rule computed using the parameter values used for the simulation and the percentage of cases in which the composite endpoint is selected as the primary endpoint. Note that for the CD and RD, the primary endpoint is the composite endpoint (CE) and relevant endpoint (RE), respectively; for the AD, the primary endpoint changes depending on the correlation.

Figure 4: Power under composite design (CD), relevant design (RD) and adaptive design (AD). Case (a) refers to designs with selection at the end of the trial without sample size recalculation, for which the initial sample size was computed to have 0.80 power to detect effects on the relevant endpoint. Case (b) refers to designs with selection at the interim analysis and with sample size recalculation, for which the sample size was computed to have 0.80 power to detect effects on the composite endpoint assuming zero correlation between the components.

Table 1: Endpoints in peritoneal dialysis. Event probabilities and odds ratios for the peritonitis and peritoneal membrane deterioration endpoint and for the technical failure endpoint. Event probability and odds ratio for the MAPE endpoint computed assuming zero correlation between the components of the composite endpoint.

Table 3: Outline of trial designs considered for the simulation, including: the sample size specification for the initial calculation, whether it was based on the relevant endpoint (RE) or the composite endpoint (CE); and, in the case of the adaptive design, at which point in the trial the endpoint selection is made and whether sample size recalculation is considered.
Supplementary material for "Adaptive clinical trial designs with blinded selection of binary composite endpoints and sample size reassessment"

Marta Bofill Roig*1, Guadalupe Gómez Melis2, Martin Posch1, and Franz Koenig1

1 Section for Medical Statistics, Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna, Vienna
2 Departament d'Estadística i Investigació Operativa, Universitat Politècnica de Catalunya, Barcelona, Spain