Abstract

As with many chronic conditions, matching patients with schizophrenia to the best treatment option is difficult. Selecting antipsychotic medication is especially challenging because many of the medications can have burdensome side effects. Adjusting or tailoring medications based on patients’ characteristics could improve symptoms. However, it is often not known which patient characteristics are most helpful for informing treatment selection. In this paper, we address the challenge of identifying and ranking important variables for tailoring treatment decisions. We consider a value-search approach implemented through dynamic marginal structural models to estimate an optimal individualized treatment rule. We apply our methodology to the Clinical Antipsychotics Trial of Intervention and Effectiveness (CATIE) study for schizophrenia, to evaluate if some tailoring variables have greater potential than others for selecting treatments for patients with schizophrenia (Stroup et al., 2003, Schizophrenia Bulletin, 29, 15–31).

1 INTRODUCTION

Precision, or personalized, medicine aims to customize treatment decisions based on an individual's characteristics. For many mental health conditions, such as depression and schizophrenia, several medication options are available and responses to treatment vary across people. As our ability to collect large amounts of data on individuals’ health conditions increases through electronic health records and other sources, it becomes increasingly important to have tools for optimally selecting treatments for individuals. There are many statistical challenges in making personalized treatment decisions rigorous and evidence based. This paper addresses the challenge of identifying which baseline characteristics are useful for tailoring treatment selection.

For this work, we focus on medications for treating schizophrenia. Schizophrenia is a chronic and heterogeneous disorder characterized by hallucinations, delusions and disorganized speech and thought processes. The medications for schizophrenia often have varying side effects (Stroup et al., 2003). Both tolerability, the ability of the individual to endure medication side effects, and efficacy, the ability of the medication to treat symptoms, are important for managing schizophrenia. Using patient characteristics, including symptoms and past treatment, to determine which medication is best for each individual could improve the success of managing symptoms using medication. However, it is not clear how to identify and rank the tailoring variables most useful for selecting optimal treatments.

An individualized treatment rule (ITR) takes in a patient's characteristics and outputs a treatment decision for one time point. A dynamic treatment regime consists of a sequence of ITRs given patient history and prior treatment. There are a variety of statistical methods to construct ITRs and dynamic treatment regimes. These methods can be classified into two approaches: regression-based methods and value-search methods. Regression-based methods include Q-learning (Murphy, 2005; Watkins & Dayan, 1992), A-learning (Murphy, 2003), G-estimation (Robins, 2004), regret regression (Henderson et al., 2010), dynamic weighted ordinary least squares (Wallace & Moodie, 2015) and tree-based methods (Laber & Zhao, 2015). Regression-based methods estimate the conditional mean of the outcome given covariates and treatment, and then use the estimated mean function to indirectly construct the optimal treatment strategy. Extensions of these methods for variable selection with high-dimensional covariates have recently been proposed (Lu et al., 2013; Qian & Murphy, 2011; Shi et al., 2018; Song et al., 2015a,b; Tao et al., 2018; Zhu et al., 2017; Zhu et al., 2019).

Value-search methods consider a restricted class of regimes and directly compare the estimated value function of each of the regimes; the value function maps a regime to the expected health outcome of the entire population if everyone were to follow that regime. The optimal regime among the set of considered regimes is often defined as the regime with the best value function. These approaches have gained interest because of their ease of implementation and scientific interpretability. Most commonly, these approaches have been used to estimate regimes of the form ‘when to start treatment’. That is, applications have considered ‘monotonic’ treatment regimes such that once treatment is initiated, it is not changed or discontinued. For example, researchers may want to know when HIV patients should receive antiretroviral therapy based on their CD4 count, thus comparing treatment regimes of the form ‘start antiretroviral therapy when CD4 < θ,’ where θ = 200, 201, …, 500. Dynamic marginal structural models (MSMs) (Orellana et al., 2010a,b; Van der Laan & Petersen, 2007), outcome weighted learning (Zhao et al., 2012), and residual weighted learning (Zhou et al., 2017) are all examples of value-search methods. In this paper we focus on dynamic MSMs, a popular value-search method that has been applied in a variety of settings, often focused on when to start treatment (Cain et al., 2010; Caniglia et al., 2017; Cotton & Heagerty, 2014; Ewings et al., 2014; Ford et al., 2015; Garcia-Albeniz et al., 2015; Hernán et al., 2006; HIV-Causal Collaboration, 2011; Neugebauer et al., 2012; Neugebauer et al., 2015; Robins et al., 2008; Rohr et al., 2016; Shen et al., 2017; Shepherd et al., 2010; Shortreed & Moodie, 2012; Sjölander et al., 2011; Van der Laan & Petersen, 2007; Zhang et al., 2014).

We propose a tailoring variable ranking approach that results in a rank ordering of each potential tailoring variable; our methodology is designed for constructing ITRs using univariate value-search based dynamic MSM methodology. We demonstrate the method's ability to rank order tailoring variables in simulations and exemplify its application to data collected from a sequential multiple assignment randomized trial. We first provide some background on value-search methodology, in particular dynamic MSMs, in Section 2, and then introduce our approach for ranking tailoring variables. In Section 3, we report the results of simulations to evaluate the performance of our approach. In Section 4, we demonstrate the application of our proposed approach to construct ITRs for patients with schizophrenia using data collected from the Clinical Antipsychotic Trials of Intervention and Effectiveness (CATIE) (Stroup et al., 2003) and compare our results to those of causal trees for estimating heterogeneous treatment effects (Athey & Imbens, 2016) applied to the same data.

2 METHODS

Our goal for constructing ITRs for patients with schizophrenia is to maximize the time spent on treatment, which indicates tolerance of side effects and adequate symptom management. We consider tailoring individual treatment based on several covariates collected during the CATIE study (Stroup et al., 2003), such as body mass index (BMI), quality of life (QOL) (Heinrichs et al., 1984), and the Calgary depression rating scale (Addington et al., 1990). The two treatments we consider for a tailored choice are a first generation antipsychotic, perphenazine, or a second generation antipsychotic (one of olanzapine, quetiapine or risperidone).

2.1 Review of dynamic marginal structure models

We provide a brief overview of methodology for using the dynamic MSM approach for constructing ITRs. The term ‘dynamic’ here refers to the generalization of ITRs to the setting in which sequential decisions are optimized and where treatments may change dynamically (i.e. due to evolving patient characteristics) over time. For more detailed information on this approach and how to implement dynamic MSMs in practice see Van der Laan and Petersen (2007), Orellana et al. (2010a, b), Cain et al. (2010), Shortreed and Moodie (2012), Cotton and Heagerty (2014).

Denote the continuous outcome of interest Y, with larger values indicating a better outcome (i.e. longer time on acceptable treatment). Let X be a vector of baseline covariates and A a binary treatment indicator, in our example perphenazine or a second generation antipsychotic. The observed data consist of independent observations drawn from (Yi,Xi,Ai) with i = 1, …, n, where n is the number of patients available for constructing ITRs. We use standard counterfactual framework notation in which Y(0) and Y(1) are the potential outcomes that would be observed if the subject was assigned A = 0 or A = 1 respectively. An ITR indexed by a regime parameter θ is denoted by dθ(X), and maps individual covariate values X to treatment A. We use Vθ=E{Y(dθ)} to denote the value function, defined as the mean counterfactual outcome if everyone were to follow the regime dθ. In our setting, a large value of the health outcome of interest is good, thus the optimal ITR is the ITR with the largest value function.

We consider ITRs of the form dθ(T)=1{T<θ} or dθ(T)=1{T>θ}, where T is a possible tailoring variable selected from baseline covariates X and θΘ={θ(1)<θ(2)<θ(s)} represents a particular threshold among a set of possible thresholds under consideration (i.e. Θ represents the set of candidate ITRs involving T). To ease notation throughout this section, we consider only those ITRs of the type dθ(T)=1{T<θ}. Such an ITR assigns subjects A = 1 if T < θ and A = 0 if Tθ. Let θL and θH be the minimum and maximum possible value of T respectively. Then dθL(T) is always 0 and corresponds to treating everyone with A = 0; similarly, dθH(T)=1 for all values of T corresponds to treating everyone with A = 1. Note that θL and θH are not included in Θ. A tailoring variable, T, is deemed useful if, for some θ ∈ Θ, the ITR dθ(T) has higher value than either the rule that specifies treating everyone with A = 0 or that specifies treating everyone with A = 1, expressed as  

To construct the optimal univariate ITR within a given class of regimes, that is, to find the optimal θ ∈ Θ, observations from individuals whose observed history is consistent with each ITR under consideration are used. An individual i is consistent with ITR dθ(T)=1{T<θ} if and only if the individual receives treatment A = 1 when Ti<θ and A = 0 when Tiθ. It is possible that an individual's observed treatment history is consistent with several ITRs corresponding to different θs. For example, if an individual has a BMI of 25 and receives treatment A = 1, this individual's treatment is consistent with both d30(BMI)=1{BMI<30} and d35(BMI)=1{BMI<35}. In order to use dynamic MSM methodology to estimate the optimal ITR, we first create an augmented dataset such that each individual is replicated according to the number of ITRs with which their observed treatment is consistent. Each row in this augmented dataset represents an individual-ITR pair. For the ith individual and jth ITR, Aij=Ai, Xij=Xi, and θ(ij){Θ} where i = 1, …, n, j=1,,ri and ri is the number of regimes that individual i follows.

The relationship between the value function Vθ=E{Y(dθ)} and θ can be modelled by Vθ=m(θ|β) where β is the model parameter and θ is treated as a covariate in the augmented dataset. Setting m(θ|β)=β0+β1θ indicates a linear relationship and m(θ|β)=β0+β1θ+β2θ2 indicates a quadratic relationship. For any potential tailoring variable, a linear relationship between the value function and θ indicates that treat everyone with either A = 0 or A = 1 is the optimal ITR, suggesting this variable is not beneficial for tailoring treatment. The dynamic MSM estimation equation is  
(1)
where wij=wi=1/P(Ai=ai|Xi) is the weight for person i, constructed by taking the inverse probability of individual i receiving their observed treatment, ai, given baseline covariates computed in the original (not augmented) data (Robins et al., 2008; Van der Laan & Petersen, 2007). In randomized trials this probability is known, yet in practice these weights are estimated as using estimated weights has been shown to increase statistical efficiency and decrease finite sample bias (Hirano et al., 2003; Van der Laan & Robins, 2003). Logistic regression models are often used to model the probability of treatment given baseline covariates, Xi, for example:  
where logit(x)=log(x(1x)1). Note that the functional form of continuous variables in the weight model models can be specified independently of the functional form between those covariates and the value function. If the treatment model is correctly specified, the solution to estimating Equation (1) will provide a consistent estimator of β (Orellana et al., 2010a,b; Van der Laan & Petersen, 2007). Thus, a weighted regression model, weighted by the inverse probability of treatment weights, is used to model the relationship between the outcome Y, treatment θ, and covariates; this model provides estimates of the value function for each ITR under consideration.

2.2 Ranking tailoring variables

The goal of ranking potential tailoring variables is to identify those individual variables that may be most useful for creating single-variable ITRs. A variable is considered a useful tailoring variable if the value function for the optimal ITR using that variable is larger than the values corresponding to both treat everyone with A = 0 and treat everyone with A = 1. We aim to both identify potential tailoring variables and rank those variables by their ability to improve outcomes. This is distinctly different from prediction variable ranking strategies or variable importance measures which focus solely on ranking variables by their impact on predicting the outcome. We aim to rank variables by their ability to tailor treatment, regardless of their ability to predict the outcome in the absence of treatment.

To simplify notation, we index the rows of the augmented dataset by l. We use this simplified notation for the augmented data for the rest of the paper with (Y~l,w~l,T~l,θ~l), l = 1, …, N, where N is the number of rows in the augmented dataset (i.e. the number of individual-ITR pairs recalling that ri is the number of ITRs individual i was observed to follow). The outcome for individual i under regime j is constant across considered regimes Y~l=Yi=Yij, for j=1ri, as are the treatment weights, w~l=wi=wij, and the tailoring variable value, T~l=Ti=Tij. We use θ~l, where l=1,,ri, to represent a particular regime that individual i follows. In this slight abuse of notation, we note that θ~l represents a particular value among the set of possible thresholds, Θ. The augmented dataset has N=i=1nri rows and θ~l varies across individual i, taking on the threshold of each ITR that individual i was observed to follow.

We use linear splines to flexibly model the potentially non-linear relationship between the value function, Vθ, and each θ ∈ Θ under consideration. The goal is to select the threshold that corresponds to the maximum value function, thus constructing the optimal ITR. We only consider thresholds, θ, between the 5th and 95th percentile of the observed distribution of the tailoring variable. This is because very few individuals are affected by ITRs indexed by thresholds at the tails of the distribution; the value function for those θ close to the tails are, in general, very close to the value at the lower and upper values of T, θL and θH. Because of this, considering thresholds at the tails of the tailoring variable creates a flattening out of the relationship between the value function and θ, making it more difficult to model this relationship.

In our linear spline models of the relationship between all potential thresholds, θ ∈ Θ, and Y, each potential threshold is a possible knot in the linear spline. We then use penalized regression approaches to shrink some of the coefficients on potential thresholds to zero. We represent a linear spline by (θks)+, where x+=x if x is positive and 0 elsewhere. In order to select which knots, ks for s = 1, …, p, should be used when modelling the relationship between V and θ, we use an 1-penalty on the coefficients of the linear spline. This can be represented as the following optimization problem:  
(2)
where α is an intercept, β0 is the regression coefficient for θ, and β1 through βp are the coefficients on each of the potential linear spline knots. Note that the summation is over rows in the augmented data. The LASSO tuning parameter is represented by λ. We use ηs to represent a knot-specific weight used for penalization in order to shrink some of the knot coefficients to zero. We consider two possible values for ηs: ηs=1, which corresponds to the standard LASSO penalty (Tibshirani, 1996) and ηs=1/β~s2, which corresponds to an adaptive LASSO penalty with γ = 2 (Zou, 2006); β~s is the ordinary weighted least squares estimate for βs.
We consider two different criteria for selecting λ: the Bayesian information criteria (BIC) (Schwarz, 1978) for standard LASSO and the extended regularized information criterion (ERIC) (Hui et al., 2015) for adaptive LASSO. We define the BIC in the augmented dataset as:  
(3)
where wMSE is the weighted mean squared error in the augmented dataset, defined as:  
We define p(λ) as the number of non-zero spline coefficients p(λ)=s=1p1{β^s(λ)0}. ERIC (Hui et al., 2015) contains an additional parameter ν and is calculated as  
(4)

ERIC was proposed as an approach to account for the effect of applying adaptive LASSO on the bias-variance tradeoff (Hui et al., 2015). The first term of ERIC in Equation (4) reflects the bias of the prediction (the goodness of fit), while the second term reflects the variance of the prediction. Variable selection guided by ERIC is consistent, that is, the non-zero coefficients {s:β^s(λ^)0} selected by λ^ that minimize ERIC will approach the true values with high probability as sample size increases (Hui et al., 2015). We aim to choose a sparse model to identify the knot location(s) useful for tailoring, while also allowing flexibility in the fitted relationship between Vθ and θ. Hui et al. (2015) suggest that ERIC will choose more sparse models than BIC. The additional parameter ν controls how much the model fit is penalized; larger ν will result in more penalized estimates, that is, sparser models. Hui et al. (2015) recommend using ν =0.5 or 1; we consider those values as well as smaller values since these select more knots, yielding more flexible models. In simulations, we consider ν = 0.01,0.25, 0.5, 1.

We select tuning parameters separately for standard LASSO and adaptive LASSO from a grid of possible λ1,,λq values. Once the appropriate tuning parameter is selected, it is used to estimate non-zero coefficients for the spline variables. Following the recommendation of Hui et al. (2015), we implement a hybrid approach and re-estimate the model using weighted least squares given the spline variables corresponding to non-zero coefficient values. Let Y^l be the fitted outcome from weighted least squares using the selected knots. The estimated value function for a regime dθ(T)=1{T<θ} is  
We use the maximum estimated value function V^θ over all the candidate θ to rank tailoring variables. That is, for each candidate tailoring variable, we calculate the maximum fitted value function over all considered thresholds; we then rank the tailoring variables by this maximum (fitted) value function. For example, we consider the potential tailoring variable, T1, to be of higher rank than another tailoring variable, T2, if the estimated value function that corresponds to the best ITR using T1 results in a higher value function than the best ITR using T2. For those tailoring variables that are not useful for tailoring, their maximum value function will occur at the value associated with either treat everyone with A = 0 or treat everyone with A = 1, so that all those variables with no tailoring impact have the same rank.

In order to account for the uncertainty of the estimates and rankings, we use the bootstrap method (Efron & Tibshirani, 1994). We resample from the original data (i.e. non-augmented dataset) with replacement, fit a propensity score model for the treatment weights, and create an augmented dataset from the bootstrap sample. Finally, we use the proposed method to select knots and rank tailoring variables, repeating the process many times.

3 SIMULATION

We conducted simulation experiments to assess the performance of the standard LASSO with BIC and adaptive LASSO with ERIC in ranking the importance of six potential tailoring variables for constructing ITRs. We considered variables that were selected to have at least one knot in the modelled relationship between the value function and the threshold to be potential tailoring variables considered for ranking. All six potential tailoring variables, T1,,T6, were generated independently from a standard normal distribution. Treatment was assigned randomly with P(A = 1) = p and P(A = 0) = 1−p for p = 0.2 and 0.5. We generated observations following three different scenarios. In Scenario 1, the four tailoring variables corresponding to the largest value functions are ranked with no ties. In Scenario 2, the difference in the value functions using the two top ranked variables for tailoring treatment, as well as when using the third and fourth ranked variables, was very small. In Scenario 3 (null simulations), no tailoring variables were useful, that is, Y(A)=β1+Aβ2. This scenario was used to evaluate how the ranking strategy would perform if no variables could be used to select optimal treatment, in particular in the null setting where all variables are equally unhelpful.

For each of Scenarios 1 and 2, we generated the potential outcomes, Y(0) and Y(1), with two functional forms. The first functional form was an ideal peaked relationship between the optimal threshold and the value function.  
(5)
where ϵN(0,0.72). The second functional form smoothed out this peak and generated the potential outcomes, Y(0) and Y(1), using the following equation:  
where Expit(x) = exp (x)/(1+ exp (x)) and c denotes a positive constant that controls the amount by which the peak is smoothed.

It is not straightforward to construct a value function with multiple tailoring variables. In online Appendix A, we show how to set up the parameters, under each functional form, such that individual tailoring variables are informative. The true value function for each of the tailoring variables for Scenarios 1 and 2 is shown in Figure 1. In Scenario 1, β=(6,25,17,16.5,9,8,1,0)T and in Scenario 2, β=(6,25,20,20,5,5,1,0)T. In both scenarios, θ=(0.84,0.84,0.84,0.84,0,0)T. For both scenarios, the true value functions had the following rank ordering of potential tailoring variables: T1>T2>T3>T4>T5=T6, with T1,,T4 informative for tailoring treatment and T5,T6 not informative for tailoring treatment. For the peaked functional form, in both Scenarios 1 and 2, the critical points of the true value function (i.e. the data generating process) are exactly the θ values. However, the critical points for T5 and T6 were not useful for tailoring treatment, meaning that treat everyone with A = 0 or A = 1 was the best regime when considering each of those tailoring variables. For the smoothed functional form generated using the expit function, the critical points of the true value function vary based on the β coefficients and c.

True value functions for the six tailoring variables T1,…,T6. Scenario 1 is on the left (a) and scenario 2 is on the right (b). The first row contains the peaked functional form generated with the indicator function. The second row represents the smoothed functional form with c = 5 and the third row represents the smoothed functional form with c = 2
FIGURE 1

True value functions for the six tailoring variables T1,,T6. Scenario 1 is on the left (a) and scenario 2 is on the right (b). The first row contains the peaked functional form generated with the indicator function. The second row represents the smoothed functional form with c = 5 and the third row represents the smoothed functional form with c = 2

Regardless of the underlying data generating process for the counterfactual outcomes, that is, smoothed or peaked, tailoring variables or no tailoring variables, we used the same estimation process in each simulation iteration. For each potential tailoring variable, Th, h = 1, …, 6, we considered ITRs of the form dθh(Th)=1{Th<θh}, where candidate thresholds, Θh, were equally spaced points between −2 and 2 with increments of 0.2. When modelling the relationship between possible thresholds and the value function in the augmented data, we considered all internal knots. Before implementing shrinkage to select the knots, we standardized the linear and spline terms by dividing each variable by its respective absolute maximum value, so each variable was between −1 and 1.

During the estimation portion of the simulations, the probability of treatment, used to construct the inverse probability weights, was estimated by logistic regression with only an intercept. For the tuning parameter λ, we considered 100 equally spaced values between 104λmax and λmax where λmax was the minimum λ such that all coefficients are zero. We varied the ERIC parameter ν = 0.01, 0.25, 0.5, 1 and compared results using adaptive LASSO with ERIC and standard LASSO with BIC across a variety of sample sizes, n=500,1000,10000.

3.1 Simulation results

The results comparing adaptive LASSO with ERIC using different values of ν and standard LASSO with BIC under the data generating Scenario 1 are shown in Tables 1–3. We present the proportion of times that each tailoring variable was ranked in each position over 1000 simulation replications, the average number of knots, and the average estimated value function with its empirical standard deviation over the simulation replications. Recall, the true ranking was T1>T2>T3>T4>T5=T6. We present these results for the counterfactual data generating process that uses the Expit function with c = 5 (Figure 1, row 2). Results for c = 2 and the data generating process that uses the indicator function are in online Appendix B.

TABLE 1

Scenario 1 with n = 500, c = 5, and p = 0.5. Rows are tailoring variables and columns are the ranks. We show the proportion of times that tailoring variables are ranked over 1000 replications, together with the average number of knots and average value functions with their standard deviations. The true (maximum) value for each tailoring variable is (9.19, 8.60, 8.10, 7.47, 7.18, 7.18)

R1R2R3R4R5R6Number of knotsValue (SD)
ν = 0.01
T196.8%3.2%0%0%0%0%4.539.25 (0.41)
T23.2%88.7%8.1%0%0%0%3.478.67 (0.37)
T30%8.1%84.6%7.3%0%0%3.528.15 (0.49)
T40%0%7.2%71.5%12.4%8.9%2.447.62 (0.40)
T50%0%0%12.1%47.5%40.4%1.997.25 (0.54)
T60%0%0.1%9.1%40.1%50.7%1.767.22 (0.56)
ν = 0.25
T196.8%3.2%0%0%0%0%1.459.19 (0.42)
T22.9%89.0%7.4%0.3%0.4%0%1.218.60 (0.38)
T30.3%7.6%78.3%12.4%1.4%0%1.068.04 (0.51)
T40%0.1%11.5%59.1%16.6%12.7%0.757.60 (0.45)
T50%0%2.0%17.7%48.9%31.4%0.037.32 (0.62)
T60%0.1%0.8%10.5%32.7%55.9%0.017.25 (0.62)
ν = 0.5
T194.6%5.0%0.4%0%0%0%1.189.14 (0.41)
T25.0%86.1%8.0%0.4%0.3%0.2%1.128.60 (0.38)
T30.4%8.8%81.9%8.2%0.6%0.1%0.648.06 (0.52)
T40%0.1%8.4%64.2%18.9%8.4%0.327.59 (0.53)
T50%0%0.8%17.2%49.4%32.6%0.007.33 (0.63)
T60%0%0.5%10.0%30.8%58.7%0.007.25 (0.62)
ν = 1
T191.9%6.8%1.3%0%0%0%1.079.11 (0.42)
T26.6%75.0%15.7%2.1%0.4%0.2%0.928.54 (0.44)
T31.5%17.5%77.9%3.1%0%0%0.168.10 (0.60)
T40%0.7%4.5%69.1%19.6%6.1%0.027.54 (0.58)
T50%0%0.5%16.5%49.6%33.4%0.007.33 (0.63)
T60%0%0.1%9.2%30.4%60.3%0.007.25 (0.62)
BIC
T193.8%5.9%0.3%0%0%0%1.739.20 (0.40)
T26.1%87.0%6.7%0.2%0%0%1.728.71 (0.41)
T30.1%6.9%86.1%6.8%0.1%0%0.988.15 (0.53)
T40%0.2%6.5%68.3%18.4%6.6%0.277.60 (0.56)
T50%0%0.4%15.9%49.9%33.8%0.007.33 (0.63)
T60%0%0%8.8%31.6%59.6%0.007.25 (0.62)
R1R2R3R4R5R6Number of knotsValue (SD)
ν = 0.01
T196.8%3.2%0%0%0%0%4.539.25 (0.41)
T23.2%88.7%8.1%0%0%0%3.478.67 (0.37)
T30%8.1%84.6%7.3%0%0%3.528.15 (0.49)
T40%0%7.2%71.5%12.4%8.9%2.447.62 (0.40)
T50%0%0%12.1%47.5%40.4%1.997.25 (0.54)
T60%0%0.1%9.1%40.1%50.7%1.767.22 (0.56)
ν = 0.25
T196.8%3.2%0%0%0%0%1.459.19 (0.42)
T22.9%89.0%7.4%0.3%0.4%0%1.218.60 (0.38)
T30.3%7.6%78.3%12.4%1.4%0%1.068.04 (0.51)
T40%0.1%11.5%59.1%16.6%12.7%0.757.60 (0.45)
T50%0%2.0%17.7%48.9%31.4%0.037.32 (0.62)
T60%0.1%0.8%10.5%32.7%55.9%0.017.25 (0.62)
ν = 0.5
T194.6%5.0%0.4%0%0%0%1.189.14 (0.41)
T25.0%86.1%8.0%0.4%0.3%0.2%1.128.60 (0.38)
T30.4%8.8%81.9%8.2%0.6%0.1%0.648.06 (0.52)
T40%0.1%8.4%64.2%18.9%8.4%0.327.59 (0.53)
T50%0%0.8%17.2%49.4%32.6%0.007.33 (0.63)
T60%0%0.5%10.0%30.8%58.7%0.007.25 (0.62)
ν = 1
T191.9%6.8%1.3%0%0%0%1.079.11 (0.42)
T26.6%75.0%15.7%2.1%0.4%0.2%0.928.54 (0.44)
T31.5%17.5%77.9%3.1%0%0%0.168.10 (0.60)
T40%0.7%4.5%69.1%19.6%6.1%0.027.54 (0.58)
T50%0%0.5%16.5%49.6%33.4%0.007.33 (0.63)
T60%0%0.1%9.2%30.4%60.3%0.007.25 (0.62)
BIC
T193.8%5.9%0.3%0%0%0%1.739.20 (0.40)
T26.1%87.0%6.7%0.2%0%0%1.728.71 (0.41)
T30.1%6.9%86.1%6.8%0.1%0%0.988.15 (0.53)
T40%0.2%6.5%68.3%18.4%6.6%0.277.60 (0.56)
T50%0%0.4%15.9%49.9%33.8%0.007.33 (0.63)
T60%0%0%8.8%31.6%59.6%0.007.25 (0.62)
TABLE 1

Scenario 1 with n = 500, c = 5, and p = 0.5. Rows are tailoring variables and columns are the ranks. We show the proportion of times that tailoring variables are ranked over 1000 replications, together with the average number of knots and average value functions with their standard deviations. The true (maximum) value for each tailoring variable is (9.19, 8.60, 8.10, 7.47, 7.18, 7.18)

R1R2R3R4R5R6Number of knotsValue (SD)
ν = 0.01
T196.8%3.2%0%0%0%0%4.539.25 (0.41)
T23.2%88.7%8.1%0%0%0%3.478.67 (0.37)
T30%8.1%84.6%7.3%0%0%3.528.15 (0.49)
T40%0%7.2%71.5%12.4%8.9%2.447.62 (0.40)
T50%0%0%12.1%47.5%40.4%1.997.25 (0.54)
T60%0%0.1%9.1%40.1%50.7%1.767.22 (0.56)
ν = 0.25
T196.8%3.2%0%0%0%0%1.459.19 (0.42)
T22.9%89.0%7.4%0.3%0.4%0%1.218.60 (0.38)
T30.3%7.6%78.3%12.4%1.4%0%1.068.04 (0.51)
T40%0.1%11.5%59.1%16.6%12.7%0.757.60 (0.45)
T50%0%2.0%17.7%48.9%31.4%0.037.32 (0.62)
T60%0.1%0.8%10.5%32.7%55.9%0.017.25 (0.62)
ν = 0.5
T194.6%5.0%0.4%0%0%0%1.189.14 (0.41)
T25.0%86.1%8.0%0.4%0.3%0.2%1.128.60 (0.38)
T30.4%8.8%81.9%8.2%0.6%0.1%0.648.06 (0.52)
T40%0.1%8.4%64.2%18.9%8.4%0.327.59 (0.53)
T50%0%0.8%17.2%49.4%32.6%0.007.33 (0.63)
T60%0%0.5%10.0%30.8%58.7%0.007.25 (0.62)
ν = 1
T191.9%6.8%1.3%0%0%0%1.079.11 (0.42)
T26.6%75.0%15.7%2.1%0.4%0.2%0.928.54 (0.44)
T31.5%17.5%77.9%3.1%0%0%0.168.10 (0.60)
T40%0.7%4.5%69.1%19.6%6.1%0.027.54 (0.58)
T50%0%0.5%16.5%49.6%33.4%0.007.33 (0.63)
T60%0%0.1%9.2%30.4%60.3%0.007.25 (0.62)
BIC
T193.8%5.9%0.3%0%0%0%1.739.20 (0.40)
T26.1%87.0%6.7%0.2%0%0%1.728.71 (0.41)
T30.1%6.9%86.1%6.8%0.1%0%0.988.15 (0.53)
T40%0.2%6.5%68.3%18.4%6.6%0.277.60 (0.56)
T50%0%0.4%15.9%49.9%33.8%0.007.33 (0.63)
T60%0%0%8.8%31.6%59.6%0.007.25 (0.62)
R1R2R3R4R5R6Number of knotsValue (SD)
ν = 0.01
T196.8%3.2%0%0%0%0%4.539.25 (0.41)
T23.2%88.7%8.1%0%0%0%3.478.67 (0.37)
T30%8.1%84.6%7.3%0%0%3.528.15 (0.49)
T40%0%7.2%71.5%12.4%8.9%2.447.62 (0.40)
T50%0%0%12.1%47.5%40.4%1.997.25 (0.54)
T60%0%0.1%9.1%40.1%50.7%1.767.22 (0.56)
ν = 0.25
T196.8%3.2%0%0%0%0%1.459.19 (0.42)
T22.9%89.0%7.4%0.3%0.4%0%1.218.60 (0.38)
T30.3%7.6%78.3%12.4%1.4%0%1.068.04 (0.51)
T40%0.1%11.5%59.1%16.6%12.7%0.757.60 (0.45)
T50%0%2.0%17.7%48.9%31.4%0.037.32 (0.62)
T60%0.1%0.8%10.5%32.7%55.9%0.017.25 (0.62)
ν = 0.5
T194.6%5.0%0.4%0%0%0%1.189.14 (0.41)
T25.0%86.1%8.0%0.4%0.3%0.2%1.128.60 (0.38)
T30.4%8.8%81.9%8.2%0.6%0.1%0.648.06 (0.52)
T40%0.1%8.4%64.2%18.9%8.4%0.327.59 (0.53)
T50%0%0.8%17.2%49.4%32.6%0.007.33 (0.63)
T60%0%0.5%10.0%30.8%58.7%0.007.25 (0.62)
ν = 1
T191.9%6.8%1.3%0%0%0%1.079.11 (0.42)
T26.6%75.0%15.7%2.1%0.4%0.2%0.928.54 (0.44)
T31.5%17.5%77.9%3.1%0%0%0.168.10 (0.60)
T40%0.7%4.5%69.1%19.6%6.1%0.027.54 (0.58)
T50%0%0.5%16.5%49.6%33.4%0.007.33 (0.63)
T60%0%0.1%9.2%30.4%60.3%0.007.25 (0.62)
BIC
T193.8%5.9%0.3%0%0%0%1.739.20 (0.40)
T26.1%87.0%6.7%0.2%0%0%1.728.71 (0.41)
T30.1%6.9%86.1%6.8%0.1%0%0.988.15 (0.53)
T40%0.2%6.5%68.3%18.4%6.6%0.277.60 (0.56)
T50%0%0.4%15.9%49.9%33.8%0.007.33 (0.63)
T60%0%0%8.8%31.6%59.6%0.007.25 (0.62)
TABLE 2

Scenario 1 with n = 1000, c = 5, and p = 0.5. Rows are tailoring variables and columns are the ranks. We show the proportion of times that tailoring variables are ranked over 1000 replications, together with the average number of knots and average value functions with their standard deviations. The true (maximum) value for each tailoring variable is (9.19, 8.60, 8.10, 7.47, 7.18, 7.18)

R1R2R3R4R5R6Number of knotsValue (SD)
ν = 0.01
T199.9%0.1%0%0%0%0%4.939.24 (0.28)
T20.1%98.1%1.8%0%0%0%4.108.65 (0.26)
T30%1.8%97.4%0.8%0%0%3.838.14 (0.32)
T40%0%0.8%85.8%8.6%4.8%2.607.56 (0.27)
T50%0%0%8.5%48.6%42.9%2.167.24 (0.39)
T60%0%0%4.9%42.8%52.3%2.177.22 (0.40)
ν = 0.25
T1100%0%0%0%0%0%1.719.23 (0.30)
T20%98.4%1.5%0.1%0%0%1.268.61 (0.27)
T30%1.6%94.3%4.0%0.1%0%1.298.05 (0.34)
T40%0%3.5%64.2%18.4%13.9%1.077.55 (0.29)
T50%0%0.7%21.3%50.7%27.3%0.057.35 (0.44)
T60%0%0%10.4%30.8%58.8%0.017.28 (0.44)
ν = 0.5
T199.5%0.5%0%0%0%0%1.329.19 (0.30)
T20.5%98.0%1.4%0.1%0%0%1.178.61 (0.27)
T30%1.4%91.5%6.9%0.2%0%1.048.02 (0.34)
T40%0.1%6.0%66.1%18.2%9.6%0.737.57 (0.32)
T50%0%1.0%19.0%52.0%28.0%0.007.36 (0.44)
T60%0%0.1%7.9%29.6%62.4%0.007.28 (0.45)
ν = 1
T198.5%1.4%0%0.1%0%0%1.139.16 (0.30)
T21.4%96.1%2.4%0.1%0%0%1.108.61 (0.27)
T30.1%2.3%94.8%2.8%0%0%0.618.08 (0.35)
T40%0.1%2.7%79.0%14.4%3.8%0.177.57 (0.41)
T50%0.1%0.1%13.5%56.2%30.1%0.007.36 (0.44)
T60%0%0%4.5%29.4%66.1%0.007.28 (0.45)
BIC
T198.1%1.9%0%0%0%0%1.909.17 (0.29)
T21.9%96.9%1.1%0.1%0%0%1.948.73 (0.29)
T30%1.2%97.0%1.8%0%0%1.638.13 (0.32)
T40%0%1.7%81.4%13.2%3.7%0.817.63 (0.38)
T50%0%0.2%12.0%57.7%30.1%0.017.36 (0.44)
T60%0%0%4.7%29.1%66.2%0.007.28 (0.45)
R1R2R3R4R5R6Number of knotsValue (SD)
ν = 0.01
T199.9%0.1%0%0%0%0%4.939.24 (0.28)
T20.1%98.1%1.8%0%0%0%4.108.65 (0.26)
T30%1.8%97.4%0.8%0%0%3.838.14 (0.32)
T40%0%0.8%85.8%8.6%4.8%2.607.56 (0.27)
T50%0%0%8.5%48.6%42.9%2.167.24 (0.39)
T60%0%0%4.9%42.8%52.3%2.177.22 (0.40)
ν = 0.25
T1100%0%0%0%0%0%1.719.23 (0.30)
T20%98.4%1.5%0.1%0%0%1.268.61 (0.27)
T30%1.6%94.3%4.0%0.1%0%1.298.05 (0.34)
T40%0%3.5%64.2%18.4%13.9%1.077.55 (0.29)
T50%0%0.7%21.3%50.7%27.3%0.057.35 (0.44)
T60%0%0%10.4%30.8%58.8%0.017.28 (0.44)
ν = 0.5
T199.5%0.5%0%0%0%0%1.329.19 (0.30)
T20.5%98.0%1.4%0.1%0%0%1.178.61 (0.27)
T30%1.4%91.5%6.9%0.2%0%1.048.02 (0.34)
T40%0.1%6.0%66.1%18.2%9.6%0.737.57 (0.32)
T50%0%1.0%19.0%52.0%28.0%0.007.36 (0.44)
T60%0%0.1%7.9%29.6%62.4%0.007.28 (0.45)
ν = 1
T198.5%1.4%0%0.1%0%0%1.139.16 (0.30)
T21.4%96.1%2.4%0.1%0%0%1.108.61 (0.27)
T30.1%2.3%94.8%2.8%0%0%0.618.08 (0.35)
T40%0.1%2.7%79.0%14.4%3.8%0.177.57 (0.41)
T50%0.1%0.1%13.5%56.2%30.1%0.007.36 (0.44)
T60%0%0%4.5%29.4%66.1%0.007.28 (0.45)
BIC
T198.1%1.9%0%0%0%0%1.909.17 (0.29)
T21.9%96.9%1.1%0.1%0%0%1.948.73 (0.29)
T30%1.2%97.0%1.8%0%0%1.638.13 (0.32)
T40%0%1.7%81.4%13.2%3.7%0.817.63 (0.38)
T50%0%0.2%12.0%57.7%30.1%0.017.36 (0.44)
T60%0%0%4.7%29.1%66.2%0.007.28 (0.45)
TABLE 2

Scenario 1 with n = 1000, c = 5, and p = 0.5. Rows are tailoring variables and columns are the ranks. We show the proportion of times that tailoring variables are ranked over 1000 replications, together with the average number of knots and average value functions with their standard deviations. The true (maximum) value for each tailoring variable is (9.19, 8.60, 8.10, 7.47, 7.18, 7.18)

R1R2R3R4R5R6Number of knotsValue (SD)
ν = 0.01
T199.9%0.1%0%0%0%0%4.939.24 (0.28)
T20.1%98.1%1.8%0%0%0%4.108.65 (0.26)
T30%1.8%97.4%0.8%0%0%3.838.14 (0.32)
T40%0%0.8%85.8%8.6%4.8%2.607.56 (0.27)
T50%0%0%8.5%48.6%42.9%2.167.24 (0.39)
T60%0%0%4.9%42.8%52.3%2.177.22 (0.40)
ν = 0.25
T1100%0%0%0%0%0%1.719.23 (0.30)
T20%98.4%1.5%0.1%0%0%1.268.61 (0.27)
T30%1.6%94.3%4.0%0.1%0%1.298.05 (0.34)
T40%0%3.5%64.2%18.4%13.9%1.077.55 (0.29)
T50%0%0.7%21.3%50.7%27.3%0.057.35 (0.44)
T60%0%0%10.4%30.8%58.8%0.017.28 (0.44)
ν = 0.5
T199.5%0.5%0%0%0%0%1.329.19 (0.30)
T20.5%98.0%1.4%0.1%0%0%1.178.61 (0.27)
T30%1.4%91.5%6.9%0.2%0%1.048.02 (0.34)
T40%0.1%6.0%66.1%18.2%9.6%0.737.57 (0.32)
T50%0%1.0%19.0%52.0%28.0%0.007.36 (0.44)
T60%0%0.1%7.9%29.6%62.4%0.007.28 (0.45)
ν = 1
T198.5%1.4%0%0.1%0%0%1.139.16 (0.30)
T21.4%96.1%2.4%0.1%0%0%1.108.61 (0.27)
T30.1%2.3%94.8%2.8%0%0%0.618.08 (0.35)
T40%0.1%2.7%79.0%14.4%3.8%0.177.57 (0.41)
T50%0.1%0.1%13.5%56.2%30.1%0.007.36 (0.44)
T60%0%0%4.5%29.4%66.1%0.007.28 (0.45)
BIC
T198.1%1.9%0%0%0%0%1.909.17 (0.29)
T21.9%96.9%1.1%0.1%0%0%1.948.73 (0.29)
T30%1.2%97.0%1.8%0%0%1.638.13 (0.32)
T40%0%1.7%81.4%13.2%3.7%0.817.63 (0.38)
T50%0%0.2%12.0%57.7%30.1%0.017.36 (0.44)
T60%0%0%4.7%29.1%66.2%0.007.28 (0.45)
R1R2R3R4R5R6Number of knotsValue (SD)
ν = 0.01
T199.9%0.1%0%0%0%0%4.939.24 (0.28)
T20.1%98.1%1.8%0%0%0%4.108.65 (0.26)
T30%1.8%97.4%0.8%0%0%3.838.14 (0.32)
T40%0%0.8%85.8%8.6%4.8%2.607.56 (0.27)
T50%0%0%8.5%48.6%42.9%2.167.24 (0.39)
T60%0%0%4.9%42.8%52.3%2.177.22 (0.40)
ν = 0.25
T1100%0%0%0%0%0%1.719.23 (0.30)
T20%98.4%1.5%0.1%0%0%1.268.61 (0.27)
T30%1.6%94.3%4.0%0.1%0%1.298.05 (0.34)
T40%0%3.5%64.2%18.4%13.9%1.077.55 (0.29)
T50%0%0.7%21.3%50.7%27.3%0.057.35 (0.44)
T60%0%0%10.4%30.8%58.8%0.017.28 (0.44)
ν = 0.5
T199.5%0.5%0%0%0%0%1.329.19 (0.30)
T20.5%98.0%1.4%0.1%0%0%1.178.61 (0.27)
T30%1.4%91.5%6.9%0.2%0%1.048.02 (0.34)
T40%0.1%6.0%66.1%18.2%9.6%0.737.57 (0.32)
T50%0%1.0%19.0%52.0%28.0%0.007.36 (0.44)
T60%0%0.1%7.9%29.6%62.4%0.007.28 (0.45)
ν = 1
T198.5%1.4%0%0.1%0%0%1.139.16 (0.30)
T21.4%96.1%2.4%0.1%0%0%1.108.61 (0.27)
T30.1%2.3%94.8%2.8%0%0%0.618.08 (0.35)
T40%0.1%2.7%79.0%14.4%3.8%0.177.57 (0.41)
T50%0.1%0.1%13.5%56.2%30.1%0.007.36 (0.44)
T60%0%0%4.5%29.4%66.1%0.007.28 (0.45)
BIC
T198.1%1.9%0%0%0%0%1.909.17 (0.29)
T21.9%96.9%1.1%0.1%0%0%1.948.73 (0.29)
T30%1.2%97.0%1.8%0%0%1.638.13 (0.32)
T40%0%1.7%81.4%13.2%3.7%0.817.63 (0.38)
T50%0%0.2%12.0%57.7%30.1%0.017.36 (0.44)
T60%0%0%4.7%29.1%66.2%0.007.28 (0.45)
TABLE 3

Scenario 1 with n=10000, c = 5, and p = 0.5. Rows are tailoring variables and columns are the ranks. We show the proportion of times that tailoring variables are ranked over 1000 replications, together with the average number of knots and average value functions with their standard deviations. The true (maximum) value for each tailoring variable is (9.19, 8.60, 8.10, 7.47, 7.18, 7.18)

R1R2R3R4R5R6Number of knotsValue (SD)
ν = 0.01
T1100%0%0%0%0%0%9.469.18 (0.09)
T20%100%0%0%0%0%8.268.60 (0.08)
T30%0%100%0%0%0%5.938.12 (0.11)
T40%0%0%99.8%0.2%0%4.227.48 (0.08)
T50%0%0%0.2%55.4%44.4%4.337.17 (0.13)
T60%0%0%0%44.4%55.6%4.227.16 (0.13)
ν = 0.25
T1100%0%0%0%0%0%3.289.25 (0.12)
T20%100%0%0%0%0%1.658.60 (0.09)
T30%0%100%0%0%0%2.498.14 (0.12)
T40%0%0%93.1%4.7%2.2%1.327.46 (0.09)
T50%0%0%5.3%47.9%46.8%0.587.26 (0.14)
T60%0%0%1.6%47.4%51.0%0.027.25 (0.14)
ν = 0.5
T1100%0%0%0%0%0%2.909.26 (0.12)
T20%100%0%0%0%0%1.378.60 (0.09)
T30%0%100%0%0%0%1.818.13 (0.12)
T40%0%0%86.4%10.0%3.6%1.217.46 (0.09)
T50%0%0%12.5%70.8%16.7%0.147.31 (0.14)
T60%0%0%1.1%19.2%79.7%0.007.25 (0.14)
ν = 1
T1100%0%0%0%0%0%2.089.25 (0.12)
T20%100%0%0%0%0%1.218.60 (0.09)
T30%0%100%0%0%0%1.248.08 (0.12)
T40%0%0%82.4%13.0%4.6%1.147.45 (0.09)
T50%0%0%17.3%77.2%5.5%0.007.33 (0.14)
T60%0%0%0.3%9.8%89.9%0.007.25 (0.14)
BIC
T1100%0%0%0%0%0%5.669.16 (0.10)
T20%100%0%0%0%0%3.828.60 (0.08)
T30%0%100%0%0%0%2.278.08 (0.11)
T40%0%0%94.8%4.0%1.2%3.187.50 (0.09)
T50%0%0%4.9%82.4%12.7%0.157.32 (0.14)
T60%0%0%0.3%13.6%86.1%0.007.25 (0.14)
R1R2R3R4R5R6Number of knotsValue (SD)
ν = 0.01
T1100%0%0%0%0%0%9.469.18 (0.09)
T20%100%0%0%0%0%8.268.60 (0.08)
T30%0%100%0%0%0%5.938.12 (0.11)
T40%0%0%99.8%0.2%0%4.227.48 (0.08)
T50%0%0%0.2%55.4%44.4%4.337.17 (0.13)
T60%0%0%0%44.4%55.6%4.227.16 (0.13)
ν = 0.25
T1100%0%0%0%0%0%3.289.25 (0.12)
T20%100%0%0%0%0%1.658.60 (0.09)
T30%0%100%0%0%0%2.498.14 (0.12)
T40%0%0%93.1%4.7%2.2%1.327.46 (0.09)
T50%0%0%5.3%47.9%46.8%0.587.26 (0.14)
T60%0%0%1.6%47.4%51.0%0.027.25 (0.14)
ν = 0.5
T1100%0%0%0%0%0%2.909.26 (0.12)
T20%100%0%0%0%0%1.378.60 (0.09)
T30%0%100%0%0%0%1.818.13 (0.12)
T40%0%0%86.4%10.0%3.6%1.217.46 (0.09)
T50%0%0%12.5%70.8%16.7%0.147.31 (0.14)
T60%0%0%1.1%19.2%79.7%0.007.25 (0.14)
ν = 1
T1100%0%0%0%0%0%2.089.25 (0.12)
T20%100%0%0%0%0%1.218.60 (0.09)
T30%0%100%0%0%0%1.248.08 (0.12)
T40%0%0%82.4%13.0%4.6%1.147.45 (0.09)
T50%0%0%17.3%77.2%5.5%0.007.33 (0.14)
T60%0%0%0.3%9.8%89.9%0.007.25 (0.14)
BIC
T1100%0%0%0%0%0%5.669.16 (0.10)
T20%100%0%0%0%0%3.828.60 (0.08)
T30%0%100%0%0%0%2.278.08 (0.11)
T40%0%0%94.8%4.0%1.2%3.187.50 (0.09)
T50%0%0%4.9%82.4%12.7%0.157.32 (0.14)
T60%0%0%0.3%13.6%86.1%0.007.25 (0.14)
TABLE 3

Scenario 1 with n=10000, c = 5, and p = 0.5. Rows are tailoring variables and columns are the ranks. We show the proportion of times that tailoring variables are ranked over 1000 replications, together with the average number of knots and average value functions with their standard deviations. The true (maximum) value for each tailoring variable is (9.19, 8.60, 8.10, 7.47, 7.18, 7.18)

R1R2R3R4R5R6Number of knotsValue (SD)
ν = 0.01
T1100%0%0%0%0%0%9.469.18 (0.09)
T20%100%0%0%0%0%8.268.60 (0.08)
T30%0%100%0%0%0%5.938.12 (0.11)
T40%0%0%99.8%0.2%0%4.227.48 (0.08)
T50%0%0%0.2%55.4%44.4%4.337.17 (0.13)
T60%0%0%0%44.4%55.6%4.227.16 (0.13)
ν = 0.25
T1100%0%0%0%0%0%3.289.25 (0.12)
T20%100%0%0%0%0%1.658.60 (0.09)
T30%0%100%0%0%0%2.498.14 (0.12)
T40%0%0%93.1%4.7%2.2%1.327.46 (0.09)
T50%0%0%5.3%47.9%46.8%0.587.26 (0.14)
T60%0%0%1.6%47.4%51.0%0.027.25 (0.14)
ν = 0.5
T1100%0%0%0%0%0%2.909.26 (0.12)
T20%100%0%0%0%0%1.378.60 (0.09)
T30%0%100%0%0%0%1.818.13 (0.12)
T40%0%0%86.4%10.0%3.6%1.217.46 (0.09)
T50%0%0%12.5%70.8%16.7%0.147.31 (0.14)
T60%0%0%1.1%19.2%79.7%0.007.25 (0.14)
ν = 1
T1100%0%0%0%0%0%2.089.25 (0.12)
T20%100%0%0%0%0%1.218.60 (0.09)
T30%0%100%0%0%0%1.248.08 (0.12)
T40%0%0%82.4%13.0%4.6%1.147.45 (0.09)
T50%0%0%17.3%77.2%5.5%0.007.33 (0.14)
T60%0%0%0.3%9.8%89.9%0.007.25 (0.14)
BIC
T1100%0%0%0%0%0%5.669.16 (0.10)
T20%100%0%0%0%0%3.828.60 (0.08)
T30%0%100%0%0%0%2.278.08 (0.11)
T40%0%0%94.8%4.0%1.2%3.187.50 (0.09)
T50%0%0%4.9%82.4%12.7%0.157.32 (0.14)
T60%0%0%0.3%13.6%86.1%0.007.25 (0.14)
R1R2R3R4R5R6Number of knotsValue (SD)
ν = 0.01
T1100%0%0%0%0%0%9.469.18 (0.09)
T20%100%0%0%0%0%8.268.60 (0.08)
T30%0%100%0%0%0%5.938.12 (0.11)
T40%0%0%99.8%0.2%0%4.227.48 (0.08)
T50%0%0%0.2%55.4%44.4%4.337.17 (0.13)
T60%0%0%0%44.4%55.6%4.227.16 (0.13)
ν = 0.25
T1100%0%0%0%0%0%3.289.25 (0.12)
T20%100%0%0%0%0%1.658.60 (0.09)
T30%0%100%0%0%0%2.498.14 (0.12)
T40%0%0%93.1%4.7%2.2%1.327.46 (0.09)
T50%0%0%5.3%47.9%46.8%0.587.26 (0.14)
T60%0%0%1.6%47.4%51.0%0.027.25 (0.14)
ν = 0.5
T1100%0%0%0%0%0%2.909.26 (0.12)
T20%100%0%0%0%0%1.378.60 (0.09)
T30%0%100%0%0%0%1.818.13 (0.12)
T40%0%0%86.4%10.0%3.6%1.217.46 (0.09)
T50%0%0%12.5%70.8%16.7%0.147.31 (0.14)
T60%0%0%1.1%19.2%79.7%0.007.25 (0.14)
ν = 1
T1100%0%0%0%0%0%2.089.25 (0.12)
T20%100%0%0%0%0%1.218.60 (0.09)
T30%0%100%0%0%0%1.248.08 (0.12)
T40%0%0%82.4%13.0%4.6%1.147.45 (0.09)
T50%0%0%17.3%77.2%5.5%0.007.33 (0.14)
T60%0%0%0.3%9.8%89.9%0.007.25 (0.14)
BIC
T1100%0%0%0%0%0%5.669.16 (0.10)
T20%100%0%0%0%0%3.828.60 (0.08)
T30%0%100%0%0%0%2.278.08 (0.11)
T40%0%0%94.8%4.0%1.2%3.187.50 (0.09)
T50%0%0%4.9%82.4%12.7%0.157.32 (0.14)
T60%0%0%0.3%13.6%86.1%0.007.25 (0.14)

Both ERIC and BIC ranked the tailoring variables correctly, even for n = 500; as n increased results were more stable. For all values of n, ERIC with smaller ν tended to select more knots and gave more flexible models. For ERIC with ν = 0.25, 0.5, 1, the average number of selected knots was close to 1 for T1,,T4 for all sample sizes considered, whereas BIC chose more knots as n increased. In the true data generating process, there was one smooth peak, which would be approximated with one knot for T1,,T4, thus we conclude that ERIC performed better than BIC in selecting the correct knot and identifying variables for tailoring treatment, which resulted in accurate ranking of the six tailoring variables. Results for simulations that use the indicator function in the data generation function were similar to the results with c = 5 presented here (see online Appendix B). Results with c = 2 are similar although slightly noisier, especially for n = 500 (online Appendix B). The results of scenario 2 were similar and are shown in online Appendix B. The results for unequal randomization (p = 0.2) were also similar though, as expected, somewhat more variable and are not shown. The results for simulations with no tailoring variables showed that while knots were still selected for each considered tailoring variable, across the simulations every potential tailoring variable was placed at each ranking an equal number of times with value functions all very similar. These results indicate that no variable considered would be selected as a tailoring variable over the others (see online Appendix B for detailed results).

4 CONSTRUCTING ITRS FOR INDIVIDUALS WITH SCHIZOPHRENIA

Our work was motivated by the desire to identify useful variables for tailoring treatment for individuals with schizophrenia using data gathered from the National Institute of Mental Health funded CATIE study (Stroup et al., 2003), an 18-month multistage randomized trial of 1460 patients with schizophrenia. Patients were randomized to two arms. One arm received perphenazine, a first generation antipsychotic, and the second arm received one of four possible atypical (i.e. second generation) antipsychotic medications: olanzapine, risperidone, quetiapine or ziprasidone. Ziprasidone was approved after the start of the trial, thus we exclude patients randomized to ziprasidone from our analysis. Allocation to each of the atypical medications in the second arm was by randomization with equal probabilities.

CATIE participants could remain in the study and switch to another randomized treatment at any time in the 18 months of the study if they were not satisfied with the first medication to which they were randomized. Patients selected to remain on their original mediation or to switch to another medication in consultation with their physician; the most common reasons for discontinuing medication were due to intolerable side effects or because the medication did not adequately control symptoms. The primary outcome in CATIE was time (from study enrolment, when a participant began their assigned treatment) until treatment discontinuation (Stroup et al. 2003), following this we also used time on first randomly assigned medication as our primary outcome. For participants who did not switch medications (i.e. participants who stayed on their first randomly assigned medication the entire study), time on their first assigned medication was set to 18 months, the maximum follow-up time. There were no missing data in the primary outcome. To address the small amount of missing data in baseline measures, we used a singly imputed dataset (Shortreed et al., 2014).

Of the 1460 CATIE participants, 384 were randomized to ziprasidone and were excluded from the analysis. The remaining 1076 individuals were randomized to receive an atypical medication, n = 815 or perphenazine, n = 261. Throughout, we denote treatment with an atypical antipsychotic by A = 1 and treatment with perphenazine by A = 0. We considered seven potential tailoring variables for individualizing treatment selection: BMI, the QOL scale (Heinrichs et al., 1984), the mental and physical health components of the SF-12 (MHCS-12 and PHCS-12 respectively) (Ware Jr et al., 2002), the Calgary depression rating scale (Addington et al., 1990), the positive and negative syndrome scale (PANSS) (Kay et al., 1987), and the number of years on prescription medication prior to enrolment in CATIE (denoted ‘years of medication prior to CATIE’). See online Appendix D for summaries of each of these potential tailoring variables, including the amount of missing information, which was imputed, in each variable.

We aimed to find a single-variable ITR that maximized the time until treatment discontinuation, considering each possible tailoring variable and ranking the tailoring variables by their importance in tailoring medication by their estimated maximum value. For each tailoring variable T, we considered ITRs of the form dθ(T)=1{T<θ}, that is, treat with atypical if and only if T < θ, and dθ(T)=1{T>θ}, that is, treat with atypical if and only if T > θ, where θ was the optimal threshold used for tailoring in the ITR. The ranges of possible thresholds we considered are given in Table 4; these possible threshold ranges were selected based on approximately the 5th and 95th percentiles of the distributions of the tailoring variables. We considered knot locations at each of the θ values, except for the minimum and maximum possible threshold values. We compared the value of each tailored regime, that is, the estimated value associated with the ITR that tailors on a given tailoring variable, to the highest expected outcome of either treat everyone with an atypical antipsychotic or treat everyone with perphenazine. We used the Holm–Bonferroni sequential adjustment to account for multiplicity in our comparisons (Holm, 1979). We used the bootstrap, as is standard for MSM methods, to estimate the variability in the differences in value functions and construct p-values. The p-values may be used to guide hypothesis testing to determine whether there is a significant improvement in the value function with tailoring, and thus whether the ranking is ‘meaningful’ in a statistical sense. We used a one-sided test to assess if the value function of the tailored regime is larger than that of a non-tailored treatment.

TABLE 4

Candidate thresholds (i.e. possible θs) considered for each of the potential tailoring variables in the Clinical Antipsychotics Trial of Intervention and Effectiveness (CATIE) analyses. We consider equally spaced sequences with starting with ‘Min’, up to ‘Max’, increasing by ‘Increment’

MinMaxIncrement
BMI20451.0
Calgary depression scale0150.5
MHCS-1220601.0
PANSS401102.0
PHCS-1220651.0
QOL0.55.50.1
Years of medication prior to CATIE0402.0
MinMaxIncrement
BMI20451.0
Calgary depression scale0150.5
MHCS-1220601.0
PANSS401102.0
PHCS-1220651.0
QOL0.55.50.1
Years of medication prior to CATIE0402.0
TABLE 4

Candidate thresholds (i.e. possible θs) considered for each of the potential tailoring variables in the Clinical Antipsychotics Trial of Intervention and Effectiveness (CATIE) analyses. We consider equally spaced sequences with starting with ‘Min’, up to ‘Max’, increasing by ‘Increment’

MinMaxIncrement
BMI20451.0
Calgary depression scale0150.5
MHCS-1220601.0
PANSS401102.0
PHCS-1220651.0
QOL0.55.50.1
Years of medication prior to CATIE0402.0
MinMaxIncrement
BMI20451.0
Calgary depression scale0150.5
MHCS-1220601.0
PANSS401102.0
PHCS-1220651.0
QOL0.55.50.1
Years of medication prior to CATIE0402.0

We used a logistic regression model to estimate the probability of a participant being assigned to perphenazine or an atypical medication using baseline variables including age, sex, site type (private practice, state clinic, university clinic, veterans affairs or combination), whether patients had been hospitalized, whether the baseline antipsychotic medication was different than the randomized medication, BMI, QOL, MHCS-12, PHCS-12, Calgary depression scale and years of medication prior to CATIE.

The regime of treat everyone with an atypical medication resulted in a value function of 8.35 months (i.e. the average time on the first assigned medication was estimated to be 8.35 months if everyone had been assigned an atypical medication). The regime of treat everyone with perphenazine had an estimated value function of 7.62 months. We only present the results for ITRs of the form dθ(T)=1{T<θ} as the results for all ITRs of the form dθ(T)=1{T>θ} resulted in no tailoring. Figure 2 shows the fitted value functions of applying ERIC for variable selection with ν = 0.25, along with the non-parametric value function estimate for each considered threshold. ERIC selected two knots for BMI at 23 kg/m2 and 30 kg/m2, two knots for QOL at 2 and 3.2, one knot for MHCS-12 at 40, one knot for PHCS-12 at 53, and one knot for years of medication prior to CATIE at 8 years. There were no knots selected for Calgary Depression Score or PANSS; this suggests that these variables are not useful for tailoring treatment. Table 5 shows the rankings based on the maximum estimated value function for the best ITR for each tailoring variable and additionally shows the estimated value function for treat everyone with A = 0 or A = 1 and the rank relative to other candidate single-variable regimes. QOL, MHCS-12 and BMI were the three tailoring variables with the highest estimated value functions. The regime of the form d3.2(QOL)=1{QOL<3.2} was the optimal regime among all candidate thresholds and tailoring variables, resulting in an estimated value function of 8.64.

Estimated value as a function of treatment tailoring threshold. Dots are non-parametric estimates and solid lines are the fitted value function for that tailoring variable using extended regularized information criterion with ν = 0.25 to select the number of knots. Dotted lines are values from the regime of treat everyone with atypical (8.348) and treat everyone with perphenazine (7.623)
FIGURE 2

Estimated value as a function of treatment tailoring threshold. Dots are non-parametric estimates and solid lines are the fitted value function for that tailoring variable using extended regularized information criterion with ν = 0.25 to select the number of knots. Dotted lines are values from the regime of treat everyone with atypical (8.348) and treat everyone with perphenazine (7.623)

TABLE 5

Ranking of tailoring variables and the number of knots selected by ERIC with ν = 0.25. The first two rows correspond to treat everyone with perphenazine and treat everyone with an atypical antipsychotic. The column entitled p, reports the p-value for a one-sided comparison of the individualized treatment rule (ITR) to the highest expected outcome of either treat everyone with an atypical antipsychotic or treat everyone with perphenazine, conducted using the non-parametric bootstrap

Maximum ValueNumber of knots (knot location)Rankp
Treat everyone with perphenazine7.6209
Treat everyone with atypical8.3506
BMI8.482 (23, 30)30.24
Calgary depression scale8.36050.37
MHCS-128.611 (40)20.17
PANSS8.25070.71
PHCS-128.451 (53)40.31
QOL8.642 (2, 3.2)10.15
Years of medication prior to CATIE8.231 (8)80.58
Maximum ValueNumber of knots (knot location)Rankp
Treat everyone with perphenazine7.6209
Treat everyone with atypical8.3506
BMI8.482 (23, 30)30.24
Calgary depression scale8.36050.37
MHCS-128.611 (40)20.17
PANSS8.25070.71
PHCS-128.451 (53)40.31
QOL8.642 (2, 3.2)10.15
Years of medication prior to CATIE8.231 (8)80.58
TABLE 5

Ranking of tailoring variables and the number of knots selected by ERIC with ν = 0.25. The first two rows correspond to treat everyone with perphenazine and treat everyone with an atypical antipsychotic. The column entitled p, reports the p-value for a one-sided comparison of the individualized treatment rule (ITR) to the highest expected outcome of either treat everyone with an atypical antipsychotic or treat everyone with perphenazine, conducted using the non-parametric bootstrap

Maximum ValueNumber of knots (knot location)Rankp
Treat everyone with perphenazine7.6209
Treat everyone with atypical8.3506
BMI8.482 (23, 30)30.24
Calgary depression scale8.36050.37
MHCS-128.611 (40)20.17
PANSS8.25070.71
PHCS-128.451 (53)40.31
QOL8.642 (2, 3.2)10.15
Years of medication prior to CATIE8.231 (8)80.58
Maximum ValueNumber of knots (knot location)Rankp
Treat everyone with perphenazine7.6209
Treat everyone with atypical8.3506
BMI8.482 (23, 30)30.24
Calgary depression scale8.36050.37
MHCS-128.611 (40)20.17
PANSS8.25070.71
PHCS-128.451 (53)40.31
QOL8.642 (2, 3.2)10.15
Years of medication prior to CATIE8.231 (8)80.58

We applied the bootstrap (Efron & Tibshirani, 1994) to quantify the variability of our rankings. We used 10000 bootstrap samples, re-estimating the treatment weights for each bootstrap sample and applying ERIC with ν = 0.25 to select the knots for the fitted value function. Table 6 shows the proportion of times that a tailoring variable or each of the two regimes that treat everyone with the same therapy received a particular ranking over 10000 bootstrap samples. QOL, MHCS-12 and BMI had the highest frequencies of being ranked as one of the top three tailoring variables. This agreed with the estimates and ranking in Table 5 from the original dataset. A figure with the proportion of times that a knot was chosen by ERIC with ν = 0.25 over 10000 bootstrap samples is presented in Appendix D. The knots with the highest selection frequencies in each of the bootstrap samples were similar to those knots selected in the original dataset for the tailoring variables BMI, QOL, MHCS-12, PHCS-12 and years of medication prior to CATIE. Following the Holm–Bonferroni procedure, we first constructed p-values for a one-sided test assessing if the value of the ITRs that used each tailoring variable was higher than the expected time on treatment if everyone were treated with the best non-tailored treatment. The p-value for the ITR using QOL as the tailoring variable was the smallest at 0.15, which is larger than 0.05/7 = 0.007; thus, we conclude that there is no evidence in the CATIE study that using any of the tailoring variables we considered to construct a single-variable ITR offered a longer time on treatment over the untailored strategy of treating everyone with an atypical antipsychotic. It is possible, of course, that other variables not considered here may be important for tailoring or that there exist subgroups defined by combinations of covariates that may benefit from the alternative treatment.

TABLE 6

Bootstrap results of the Clinical Antipsychotics Trial of Intervention and Effectiveness (CATIE) analysis with extended regularized information criterion (ERIC) ν = 0.25. The proportion of times that a tailoring variable is ranked (R1,,R9) together with the average number of knots and the average maximum value function (standard deviation) over 10000 bootstrap samples. The first two lines correspond to treat everyone with perphenazine and treat everyone with an atypical antipsychotic

R1R2R3R4R5R6R7R8R9Number of knotsValue (SD)
Treat everyone with perphenazine0%0%0%0%2%4%7%8%79%NA7.63 (0.77)
Treat everyone with atypical0%0%0%5%17%29%24%9%15%NA8.35 (0.25)
BMI15%15%21%17%14%8%5%4%0%1.448.67 (0.39)
Calgary depression scale6%7%12%16%17%16%16%10%1%0.648.56 (0.45)
MHCS-1235%28%17%10%5%3%2%1%0%1.058.87 (0.45)
PANSS3%7%11%15%17%17%17%11%2%0.858.52 (0.40)
PHCS-123%6%12%16%15%13%16%19%1%1.248.51 (0.44)
QOL31%30%19%11%6%3%1%0%0%1.498.86 (0.46)
Years of medication prior to CATIE7%7%9%10%8%8%12%38%2%0.968.46 (0.57)
R1R2R3R4R5R6R7R8R9Number of knotsValue (SD)
Treat everyone with perphenazine0%0%0%0%2%4%7%8%79%NA7.63 (0.77)
Treat everyone with atypical0%0%0%5%17%29%24%9%15%NA8.35 (0.25)
BMI15%15%21%17%14%8%5%4%0%1.448.67 (0.39)
Calgary depression scale6%7%12%16%17%16%16%10%1%0.648.56 (0.45)
MHCS-1235%28%17%10%5%3%2%1%0%1.058.87 (0.45)
PANSS3%7%11%15%17%17%17%11%2%0.858.52 (0.40)
PHCS-123%6%12%16%15%13%16%19%1%1.248.51 (0.44)
QOL31%30%19%11%6%3%1%0%0%1.498.86 (0.46)
Years of medication prior to CATIE7%7%9%10%8%8%12%38%2%0.968.46 (0.57)
TABLE 6

Bootstrap results of the Clinical Antipsychotics Trial of Intervention and Effectiveness (CATIE) analysis with extended regularized information criterion (ERIC) ν = 0.25. The proportion of times that a tailoring variable is ranked (R1,,R9) together with the average number of knots and the average maximum value function (standard deviation) over 10000 bootstrap samples. The first two lines correspond to treat everyone with perphenazine and treat everyone with an atypical antipsychotic

R1R2R3R4R5R6R7R8R9Number of knotsValue (SD)
Treat everyone with perphenazine0%0%0%0%2%4%7%8%79%NA7.63 (0.77)
Treat everyone with atypical0%0%0%5%17%29%24%9%15%NA8.35 (0.25)
BMI15%15%21%17%14%8%5%4%0%1.448.67 (0.39)
Calgary depression scale6%7%12%16%17%16%16%10%1%0.648.56 (0.45)
MHCS-1235%28%17%10%5%3%2%1%0%1.058.87 (0.45)
PANSS3%7%11%15%17%17%17%11%2%0.858.52 (0.40)
PHCS-123%6%12%16%15%13%16%19%1%1.248.51 (0.44)
QOL31%30%19%11%6%3%1%0%0%1.498.86 (0.46)
Years of medication prior to CATIE7%7%9%10%8%8%12%38%2%0.968.46 (0.57)
R1R2R3R4R5R6R7R8R9Number of knotsValue (SD)
Treat everyone with perphenazine0%0%0%0%2%4%7%8%79%NA7.63 (0.77)
Treat everyone with atypical0%0%0%5%17%29%24%9%15%NA8.35 (0.25)
BMI15%15%21%17%14%8%5%4%0%1.448.67 (0.39)
Calgary depression scale6%7%12%16%17%16%16%10%1%0.648.56 (0.45)
MHCS-1235%28%17%10%5%3%2%1%0%1.058.87 (0.45)
PANSS3%7%11%15%17%17%17%11%2%0.858.52 (0.40)
PHCS-123%6%12%16%15%13%16%19%1%1.248.51 (0.44)
QOL31%30%19%11%6%3%1%0%0%1.498.86 (0.46)
Years of medication prior to CATIE7%7%9%10%8%8%12%38%2%0.968.46 (0.57)

For comparison, we used causal trees (Athey & Imbens, 2016) to estimate heterogeneous treatment effects for each of the same seven potential tailoring variables using non-parametric methods. Similar to the analysis using our approach, causal trees did not find that any variables were useful for tailoring treatment.

5 DISCUSSION

We used penalized linear splines with the tuning parameter selected by ERIC to detect potential knots and estimated the optimal treatment regime using a hybrid approach to rank potential tailoring variables based on the estimated maximum value functions. In simulations we found that this approach can rank the tailoring variables accurately with high probability. We compared our approach with standard LASSO with the tuning parameter selected by BIC. We found that standard LASSO with BIC selected more knots than adaptive LASSO with ERIC, especially as sample size increased. Based on our simulations, we recommend using ERIC with ν = 0.25 for model selection and estimation because it provided correct tailoring variable ranking while maintaining a parsimonious model and, using the hybrid estimation approach, was able to estimate the value function well. We applied the variable selection and ranking method to the CATIE dataset. We found that QOL, MHCS-12 and BMI were the tailoring variables with the highest estimated value functions. In contrast, the Calgary Depression Score and PANSS did not appear to be useful for tailoring treatment.

In the introduction, we noted several variable selection approaches that previous authors (Biernot & Moodie, 2010; Gunter et al., 2007; Gunter et al., 2011a,b) have proposed for ranking tailoring variables. These methods were all developed in the context of regression-based estimation, and so are not appropriate for the value-search framework that we employ. It would be interesting to explore whether ERIC combined with adaptive LASSO would prove equally effective in a regression-based estimation setting.

We proposed a novel data generating approach which varied the peakedness of value function and used this data structure to evaluate our comparator approaches for modelling the value function and ranking tailoring variables. All simulations were conducted in settings where the value function was uni-modal. We do not anticipate any issues in the performance of our proposed ranking approach in settings where the value function is bimodal, as the ranking is based on the maximum estimated value of an ITR. If, by chance, a tailoring variable yielded two identical maxima, this value (or indeed, these two identical values) would serve to rank the candidate tailoring variable. If, on the other hand, one of the modes was local rather than global, the larger of the two values would provide the value on which ranking was performed.

Our goal was to develop a method for ranking tailoring variables for univariate ITRs. In practice, tailoring variables may be correlated, which adds complexity to the process of identifying and ranking the most important variables. When focusing on single-variable ITRs, we purposefully do not account for correlation among potential tailoring variables. If two candidate variables are correlated, it may be that using one variable provides a rule that is effectively tailoring on both, thereby increasing the value of each of those candidate variables over another which is orthogonal to all other tailoring variables. Alternatively, we may wish to consider multi-variable ITRs, where it would make sense to sequentially rank variables. For example, having chosen the most important tailoring variable, we may wish to see which variable(s), if any, would most improve the ITR. A similar strategy was employed in the regression-based context by Fan et al. (2016).

Our approach does not use data-splitting. When building prediction models, splitting data is one of several possible approaches to avoid optimism when estimating performance (Efron, 1983; Steyerberg, 2019; Steyerberg et al., 2001). An important consequence of the split-sample approach is reduced power. Particularly in sequential multiple assignment randomized trials, which are resource intensive to conduct, reducing power by using a split-sample approach is unfavourable. Avoiding optimism in prediction model evaluation, that is, biased performance estimates which exaggerate the performance of the model in an independent dataset, is an important concern and is analogous to avoiding increased chance of type 1 error in inferential statistics. In our application to rank possible tailoring variables for treating schizophrenia, we use a p-value adjustment to avoid increased type 1 error.

This paper focused on a continuous outcome with a weighted least squares loss function. It is possible to extend the variable ranking framework for ITRs to binary and survival outcomes by replacing the loss function with the logistic and Cox loss function respectively. We also hope to extend this framework to construct dynamic treatment regimes, which consider multiple time points and treatment decisions. However, there are many challenges for constructing dynamic treatment regimes in the longitudinal setting. In addition to an increased risk of problems such as missing data and loss to follow-up, there may not be enough data to estimate the value function reliably for some regimes since we may not have enough people who follow these regimes for the entire time. As noted above, dynamic MSMs have typically been used for ITRs or monotonic treatment regimes (‘when to start’).

We applied our methodology to a randomized study comparing antipsychotic medication for individuals with schizophrenia. We anticipate that our approach would carry over to the observational setting where confounding may arise—including those situations where, due to non-compliance, a randomized trial effectively gives rise to observational data when considering the treatment that was taken rather than the treatment that was assigned. The approach may also uncover valuable treatment strategies that would be overlooked if considering only average treatment effects of particular treatments when applied to the population as a whole. However, if the confounders themselves must also be selected or determined from the data (perhaps because the treatment allocation mechanisms are poorly understood or the set of potential confounders is high dimensional), it may be necessary to adapt ERIC to account for that added component of the selection and estimation process.

CONFLICT OF INTEREST

Dr. Shortreed has worked on grants awarded to Kaiser Permanente Washington Health Research Institute (KPWHRI) by Bristol Meyers Squibb and by Pfizer. She was also a co-Investigator on grants awarded to KPWHRI from Syneos Health, who represented a consortium of pharmaceutical companies carrying out FDA-mandated studies regarding the safety of extended-release opioids.

ACKNOWLEDGEMENT

Research reported in this publication was supported by the National Institute of Mental Health of the National Institutes of Health under Award Number R01 MH114873. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Nina Galanter was supported by the University of Washington's Biostatistics, Epidemiology, and Bioinformatics Training in Environmental Health (BEBTEH), grant number T32ES015459, from the National Institute for Environmental Health Science (NIEHS). Erica E. M. Moodie acknowledges the support of a chercheur-boursier senior career award from the Fonds de Recherche du Québec, Santé.

REFERENCES

Addington
,
D.
,
Addington
,
J.
&
Schissel
,
B.
(
1990
)
A depression rating scale for schizophrenics
.
Schizophrenia Research
,
3
,
247
251
.

Athey
,
S.
&
Imbens
,
G.
(
2016
)
Recursive partitioning for heterogeneous causal effects
.
Proceedings of the National Academy of the Sciences
,
113
,
7353
7360
.

Biernot
,
P.
&
Moodie
,
E.E.M.
(
2010
)
A comparison of variable selection approaches for dynamic treatment regimes
.
The International Journal of Biostatistics
,
6
, Article 6.

Cain
,
L.E.
,
Robins
,
J.M.
,
Lanoy
,
E.
,
Logan
,
R.
,
Costagliola
,
D.
&
Hernán
,
M.A.
(
2010
)
When to start treatment? A systematic approach to the comparison of dynamic regimes using observational data
.
The International Journal of Biostatistics
,
6
, Article 18.

Caniglia
,
E.C.
,
Cain
,
L.E.
,
Sabin
,
C.A.
,
Robins
,
J.M.
,
Logan
,
R.
,
Abgrall
,
S.
et al. (
2017
)
Comparison of dynamic monitoring strategies based on CD4 cell counts in virally suppressed, HIVpositive individuals on combination antiretroviral therapy in high-income countries: a prospective, observational study
.
The Lancet HIV
,
4
,
e251
e259
.

Cotton
,
C.A.
&
Heagerty
,
P.J.
(
2014
)
Evaluating epoetin dosing strategies using observational longitudinal data
.
The Annals of Applied Statistics
,
8
,
2356
2377
.

Efron
,
B.
(
1983
)
Estimating the error rate of a prediction rule: Improvement on crossvalidation
.
Journal of the American Statistical Association
,
78
,
316
331
.

Efron
,
B.
&
Tibshirani
,
R.J.
(
1994
)
An introduction to the bootstrap
.
New York
:
Chapman & Hall
.

Ewings
,
F.M.
,
Ford
,
D.
,
Walker
,
A.S.
,
Carpenter
,
J.
&
Copas
,
A.
(
2014
)
Optimal CD4 count for initiating HIV treatment: impact of CD4 observation frequency and grace periods, and performance of dynamic marginal structural models
.
Epidemiology
,
25
,
194
202
.

Fan
,
A.
,
Lu
,
W.
&
Song
,
R.
(
2016
)
Sequential advantage selection for optimal treatment regime
.
The Annals of Applied Statistics
,
10
,
32
53
.

Ford
,
D.
,
Robins
,
J.M.
,
Petersen
,
M.L.
,
Gibb
,
D.M.
,
Gilks
,
C.F.
,
Mugyenyi
,
P.
et al. (
2015
)
The impact of different CD4 cell-count monitoring and switching strategies on mortality in HIVinfected African adults on antiretroviral therapy: an application of dynamic marginal structural models
.
American Journal of Epidemiology
,
182
,
633
643
.

Garcia-Albeniz
,
X.
,
Chan
,
J.M.
,
Paciorek
,
A.
,
Logan
,
R.W.
,
Kenfield
,
S.A.
,
Cooperberg
,
M.R.
et al. (
2015
)
Immediate versus deferred initiation of androgen deprivation therapy in prostate cancer patients with PSA-only relapse. An observational follow-up study
.
European Journal of Cancer
,
51
,
817
824
.

Gunter
,
L.
,
Zhu
,
J.
&
Murphy
,
S.
(
2007
)
Variable selection for optimal decision making
. In
Conference on Artificial Intelligence in Medicine in Europe
,
149
154
.
New York
:
Springer
.

Gunter
,
L.
,
Zhu
,
J.
&
Murphy
,
S.A.
(
2011a
)
Variable selection for qualitative interactions in personalized medicine while controlling the family-wise error rate
.
Journal of Biopharmaceutical Statistics
,
21
,
1063
1078
.

Gunter
,
L.
,
Zhu
,
J.
&
Murphy
,
S.A.
(
2011b
)
Variable selection for qualitative interactions
.
Statistical Methodology
,
8
,
42
55
.

Heinrichs
,
D.W.
,
Hanlon
,
T.E.
&
Carpenter
,
W.T.
(
1984
)
The quality of life scale: an instrument for rating the schizophrenic deficit syndrome
.
Schizophrenia Bulletin
,
10
,
388
398
.

Henderson
,
R.
,
Ansell
,
P.
&
Alshibani
,
D.
(
2010
)
Regret-regression for optimal dynamic treatment regimes
.
Biometrics
,
66
,
1192
1201
.

Hernán
,
M.A.
,
Lanoy
,
E.
,
Costagliola
,
D.
&
Robins
,
J.M.
(
2006
)
Comparison of dynamic treatment regimes via inverse probability weighting
.
Basic & Clinical Pharmacology & Toxicology
,
98
,
237
242
.

Hirano
,
K.
,
Imbens
,
G.W.
&
Ridder
,
G.
(
2003
)
Efficient estimation of average treatment effects using the estimated propensity score
.
Econometrica
,
71
,
1161
1189
.

HIV-Causal Collaboration
(
2011
)
When to initiate combined antiretroviral therapy to reduce mortality and AIDS-defining illness in HIV-infected persons in developed countries: an observational study
.
Annals of Internal Medicine
,
154
,
509
515
.

Holm
,
S.
(
1979
)
A simple sequentially rejective multiple test procedure
.
Scandinavian Journal of Statistics
,
6
,
65
70
.

Hui
,
F.K.
,
Warton
,
D.I.
&
Foster
,
S.D.
(
2015
)
Tuning parameter selection for the adaptive Lasso using ERIC
.
Journal of the American Statistical Association
,
110
,
262
269
.

Kay
,
S.R.
,
Fiszbein
,
A.
&
Opler
,
L.A.
(
1987
)
The positive and negative syndrome scale (PANSS) for schizophrenia
.
Schizophrenia Bulletin
,
13
,
261
276
.

Laber
,
E.B.
&
Zhao
,
Y.Q.
(
2015
)
Tree-based methods for individualized treatment regimes
.
Biometrika
,
102
,
501
514
.

Lu
,
W.
,
Zhang
,
H.H.
&
Zeng
,
D.
(
2013
)
Variable selection for optimal treatment decision
.
Statistical Methods in Medical Research
,
22
,
493
504
.

Murphy
,
S.A.
(
2003
)
Optimal dynamic treatment regimes
.
Journal of the Royal Statistical Society: Series B (Statistical Methodology)
,
65
,
331
355
.

Murphy
,
S.A.
(
2005
)
A generalization error for Q-learning
.
Journal of Machine Learning Research
,
6
,
1073
1097
.

Neugebauer
,
R.
,
Fireman
,
B.
,
Roy
,
J.A.
,
O'Connor
,
P.J.
&
Selby
,
J.V.
(
2012
)
Dynamic marginal structural modeling to evaluate the comparative effectiveness of more or less aggressive treatment intensification strategies in adults with type 2 diabetes
.
Pharmacoepidemiology and Drug Safety
,
21
,
99
113
.

Neugebauer
,
R.
,
Schmittdiel
,
J.A.
,
Zhu
,
Z.
,
Rassen
,
J.A.
,
Seeger
,
J.D.
&
Schneeweiss
,
S.
(
2015
)
High-dimensional propensity score algorithm in comparative effectiveness research with time-varying interventions
.
Statistics in Medicine
,
34
,
753
781
.

Orellana
,
L.
,
Rotnitzky
,
A.
&
Robins
,
J.M.
(
2010a
)
Dynamic regime marginal structural mean models for estimation of optimal dynamic treatment regimes, part I: main content
.
The International Journal of Biostatistics
,
6
, Article 8.

Orellana
,
L.
,
Rotnitzky
,
A.
&
Robins
,
J.M.
(
2010b
)
Dynamic regime marginal structural mean models for estimation of optimal dynamic treatment regimes, part II: proofs of results
.
The International Journal of Biostatistics
,
6
, Article 9.

Qian
,
M.
&
Murphy
,
S.A.
(
2011
)
Performance guarantees for individualized treatment rules
.
Annals of Statistics
,
39
,
1180
1210
.

Robins
,
J.M.
(
2004
)
Optimal structural nested models for optimal sequential decisions
. In
Proceedings of the Second Seattle Symposium in Biostatistics
,
189
326
.
New York
:
Springer
.

Robins
,
J.
,
Orellana
,
L.
&
Rotnitzky
,
A.
(
2008
)
Estimation and extrapolation of optimal treatment and testing strategies
.
Statistics in Medicine
,
27
,
4678
4721
.

Rohr
,
J.K.
,
Ive
,
P.
,
Horsburgh
,
C.R.
,
Berhanu
,
R.
,
Shearer
,
K.
,
Maskew
,
M.
et al. (
2016
)
Marginal structural models to assess delays in second-line HIV treatment initiation in South Africa
.
PloS One
,
11
,
e0161469
.

Schwarz
,
G.
(
1978
)
Estimating the dimension of a model
.
The Annals of Statistics
,
6
,
461
464
.

Shen
,
J.
,
Wang
,
L.
&
Taylor
,
J.M.G.
(
2017
)
Estimation of the optimal regime in treatment of prostate cancer recurrence from observational data using flexible weighting models
.
Biometrics
,
73
,
635
645
.

Shepherd
,
B.E.
,
Jenkins
,
C.A.
,
Rebeiro
,
P.F.
,
Stinnette
,
S.E.
,
Bebawy
,
S.S.
,
McGowan
,
C.C.
et al. (
2010
)
Estimating the optimal CD4 count for HIV-infected persons to start antiretroviral therapy
.
Epidemiology (Cambridge, Mass.)
,
21
,
698
.

Shi
,
C.
,
Fan
,
A.
,
Song
,
R.
&
Lu
,
W.
(
2018
)
High-dimensional A-learning for optimal dynamic treatment regimes
.
The Annals of Statistics
,
46
,
925
957
.

Shortreed
,
S.M.
&
Moodie
,
E.E.M.
(
2012
)
Estimating the optimal dynamic antipsychotic treatment regime: evidence from the sequential multiple-assignment randomized clinical antipsychotic trials of intervention and effectiveness schizophrenia study
.
Journal of the Royal Statistical Society: Series C (Applied Statistics)
,
61
,
577
599
.

Shortreed
,
S.M.
,
Laber
,
E.
,
Stroup
,
T.S.
,
Pineau
,
J.
&
Murphy
,
S.A.
(
2014
)
Multiple imputation for sequential multiple assignment randomized trials
.
Statistics in Medicine
,
33
,
4202
4214
.

Sjölander
,
A.
,
Nyrén
,
O.
,
Bellocco
,
R.
&
Evans
,
M.
(
2011
)
Comparing different strategies for timing of dialysis initiation through inverse probability weighting
.
American Journal of Epidemiology
,
174
,
1204
1210
.

Song
,
R.
,
Kosorok
,
M.
,
Zeng
,
D.
,
Zhao
,
Y.
,
Laber
,
E.
&
Yuan
,
M.
(
2015a
)
On sparse representation for optimal individualized treatment selection with penalized outcome weighted learning
.
Stat
,
4
,
59
68
.

Song
,
R.
,
Wang
,
W.
,
Zeng
,
D.
&
Kosorok
,
M.R.
(
2015b
)
Penalized Q-learning for dynamic treatment regimens
.
Statistica Sinica
,
25
,
901
920
.

Steyerberg
,
E.
(
2019
)
Clinical prediction models
, 2nd edn.
Switzerland
:
Springer
.

Steyerberg
,
E.
,
Harrell
, Jr,
F.E.
,
Borsboom
,
G.
,
Eijkemans
,
M.
,
Vergouwe
,
Y.
&
Habbema
,
J.
(
2001
)
Internal validation of predictive models: efficiency of some procedures for logistic regression analysis
.
Journal of Clinical Epidemiology
,
54
,
774
781
.

Stroup
,
T.S.
,
McEvoy
,
J.P.
,
Swartz
,
M.S.
,
Byerly
,
M.J.
,
Glick
,
I.D.
,
Canive
,
J.M.
et al. (
2003
)
The national institute of mental health clinical antipsychotic trials of intervention effectiveness (CATIE) project: schizophrenia trial design and protocol development
.
Schizophrenia Bulletin
,
29
,
15
31
.

Tao
,
Y.
,
Wang
,
L.
&
Almirall
,
D.
(
2018
)
Tree-based reinforcement learning for estimating optimal dynamic treatment regimes
.
The Annals of Applied Statistics
,
12
,
1914
1938
.

Tibshirani
,
R.
(
1996
)
Regression shrinkage and selection via the Lasso
.
Journal of the Royal Statistical Society: Series B (Methodological)
,
58
,
267
288
.

Van der Laan
,
M.J.
&
Petersen
,
M.L.
(
2007
)
Causal effect models for realistic individualized treatment and intention to treat rules
.
The International Journal of Biostatistics
,
3
, Article 3.

Van der Laan
,
M.J.
&
Robins
,
J.M.
(
2003
)
Unified methods for censored longitudinal data and causality
.
New York
:
Springer
.

Wallace
,
M.P.
&
Moodie
,
E.E.M.
(
2015
)
Doubly-robust dynamic treatment regimen estimation via weighted least squares
.
Biometrics
,
71
,
636
644
.

Ware
Jr,
J.E.
,
Kosinski
,
M.
&
Keller
,
S.D.
(
2002
)
SF-12: how to score the SF-12 physical and mental health summary scales
.
Boston
:
Qualitymetric inc. Health Assessment Lab
.

Watkins
,
C.J.C.H.
&
Dayan
,
P.
(
1992
)
Q-learning
.
Machine Learning
,
8
,
279
292
.

Zhang
,
Y.
,
Thamer
,
M.
,
Kaufman
,
J.
,
Cotter
,
D.
&
Hernán
,
M.
(
2014
)
Comparative effectiveness of two anemia management strategies for complex elderly dialysis patients
.
Medical Care
,
52
,
S132
.

Zhao
,
Y.
,
Zeng
,
D.
,
Rush
,
A.J.
&
Kosorok
,
M.R.
(
2012
)
Estimating individualized treatment rules using outcome weighted learning
.
Journal of the American Statistical Association
,
107
,
1106
1118
.

Zhou
,
X.
,
Mayer-Hamblett
,
N.
,
Khan
,
U.
&
Kosorok
,
M.R.
(
2017
)
Residual weighted learning for estimating individualized treatment rules
.
Journal of the American Statistical Association
,
112
,
169
187
.

Zhu
,
R.
,
Zhao
,
Y.-Q.
,
Chen
,
G.
,
Ma
,
S.
&
Zhao
,
H.
(
2017
)
Greedy outcome weighted tree learning of optimal personalized treatment rules
.
Biometrics
,
73
,
391
400
.

Zhu
,
W.
,
Zeng
,
D.
&
Song
,
R.
(
2019
)
Proper inference for value function in highdimensional Q-learning for dynamic treatment regimes
.
Journal of the American Statistical Association
,
114
,
1404
1417
.

Zou
,
H.
(
2006
)
The adaptive Lasso and its oracle properties
.
Journal of the American Statistical Association
,
101
,
1418
1429
.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

Supplementary data