-
PDF
- Split View
-
Views
-
Cite
Cite
Yunro Chung, Anastasia Ivanova, Michael G Hudgens, Jason P Fine, Partial likelihood estimation of isotonic proportional hazards models, Biometrika, Volume 105, Issue 1, March 2018, Pages 133–148, https://doi.org/10.1093/biomet/asx064
Close -
Share
Summary
We consider the estimation of the semiparametric proportional hazards model with an unspecified baseline hazard function where the effect of a continuous covariate is assumed to be monotone. Previous work on nonparametric maximum likelihood estimation for isotonic proportional hazard regression with right-censored data is computationally intensive, lacks theoretical justification, and may be prohibitive in large samples. In this paper, partial likelihood estimation is studied. An iterative quadratic programming method is considered, which has performed well with likelihoods for isotonic parametric regression models. However, the iterative quadratic programming method for the partial likelihood cannot be implemented using standard pool-adjacent-violators techniques, increasing the computational burden and numerical instability. The iterative convex minorant algorithm which uses pool-adjacent-violators techniques has also been shown to perform well in related parametric likelihood set-ups, but evidences computational difficulties under the proportional hazards model. An alternative pseudo-iterative convex minorant algorithm is proposed which exploits the pool-adjacent-violators techniques, is theoretically justified, and exhibits computational stability. A separate estimator of the baseline hazard function is provided. The algorithms are extended to models with time-dependent covariates. Simulation studies demonstrate that the pseudo-iterative convex minorant algorithm may yield orders-of-magnitude reduction in computing time relative to the iterative quadratic programming method and the iterative convex minorant algorithm, with moderate reductions in the bias and variance of the estimators. Analysis of data from a recent HIV prevention study illustrates the practical utility of the isotonic methodology in estimating nonlinear, monotonic covariate effects.
1. Introduction
In regression analysis, common parametric models, such as generalized linear models, may employ shape restrictions on covariate effects, the simplest being that of monotonicity. There is an extensive literature on nonparametric isotonic regression models, where the form of a monotone covariate effect is completely unspecified; see Banerjee (2007). Computational and inferential issues have been well studied, particularly for likelihood-based estimation of isotonic generalized linear models, where efficient algorithms are available which exploit the geometric properties of the shape-restricted likelihood and which facilitate a careful theoretical analysis of the large-sample properties of the resulting estimators. Unfortunately, these approaches are not easily generalizable to partial likelihood estimation of the semiparametric isotonic proportional hazards model, owing to the lack of an independent and identically distributed structure of the partial likelihood. In survival data settings, constrained nonparametric maximum likelihood estimation was developed by Ancukiewicz et al. (2003) using ad hoc algorithms. Such algorithms may not even converge to a local maximum, and appear to be computationally prohibitive in large samples. The goal of this paper is to undertake theoretically justified computation of isotonic estimators based on partial likelihood in survival data settings.
The closest related work with right-censored data is for nonparametric estimation of the hazard function subject to shape constraints in the absence of covariates. Various authors have studied maximum likelihood estimation of a hazard function assumed to be monotone in time (Grenander, 1956; Marshall & Proschan, 1965; Rao, 1970; Mukerjee & Wang, 1993; Huang & Wellner, 1995; Banerjee, 2008; Lopuhaä & Nane, 2013 and a 2013 Delft University of Technology PhD thesis by G. F. Nane). With categorical covariates having regression parameters that are known to satisfy a monotone ordering, one may post-process unrestricted partial likelihood estimates using the pool-adjacent-violators algorithm (Ayer et al., 1955) to obtain restricted estimators, similar to post-processing of likelihood estimators of parametric regression models with categorical covariates. This approach is not applicable to continuous covariates, because unrestricted estimation is not possible at all values of the covariate. Specialized methods are needed.
Unlike the usual likelihood-based formulation for isotonic linear models (Robertson et al., 1988), the partial likelihood for the isotonic proportional hazards model is a product integral of terms depending on both time and covariate values, where the parameter |$\phi(\cdot)$| only enters the partial likelihood at those covariate values in the dataset. To ensure that estimation is well-defined between those values, we restrict the estimator to be piecewise constant, which yields a unique estimator with potential jumps at the observed |$Z_{i}$| values. This assumption is similar to that made in isotonic generalized linear models. For right-censored data, we show that the estimator jumps only at those covariate values which are associated with observed failure events with |$\Delta_{i}=1$|; this is made precise in § 2.
Calculating the constrained partial likelihood estimator is challenging and does not follow directly from earlier likelihood analyses of isotonic generalized linear models. The iterative quadratic programming method for |$\ell(\phi)$| can be employed to find the constrained estimator, but cannot be efficiently implemented using the pool-adjacent-violators algorithm, may be computationally prohibitive in large samples, and may exhibit poor convergence properties. The iterative convex minorant algorithm is also theoretically justified and has been shown to reduce the computational burden in many isotonic estimation problems, but exhibits similar difficulties in our setting. To overcome these issues, we propose the pseudo-iterative convex minorant algorithm which finds the constrained partial likelihood estimator by iteratively minimizing a constrained negative pseudo partial likelihood. The convergence properties of the pseudo-iterative convex minorant algorithm can be established similar to those for the iterative convex minorant algorithm.
2. Constrained partial likelihood estimation
2.1. Iterative quadratic and iterative convex minorant algorithms without censoring
If there is no censoring, then the negative log partial likelihood |$\dot{\ell}^{N}(\psi)$| is convex. It is strictly convex under the anchor constraint |$\psi_{k}=\psi(Z_{(k)})=0$|.
Let |$\Psi^{k}$| be |$\{\psi \in \mathbb{R}^{n} : \psi_{1} \leq \cdots \leq \psi_{n} ,\; \psi_{k}=0\}$|. Maximizing the partial likelihood over the monotonicty and anchor constraints is equivalent to minimizing the strictly convex function |$\dot{\ell}^{N}(\psi)$| over the convex cone |$\Psi^{k}$|. We denote the minimizer of |$\dot{\ell}^{N}(\psi)$| over |$\Psi^{k}$| by |$\hat{\psi}=(\hat{\psi}_{1},\ldots,\hat{\psi}_{n})$|, which we refer to as the isotonic partial likelihood estimator.
To uniquely estimate |$\psi$| at covariate values other than those in |$Z_{(1)}, \ldots, Z_{(n)}$|, we assume, similar to previous work on isotonic regression, that the estimator is a right-continuous step function with jumps at the order statistics of the |$Z_{i}$|. Under this assumption, the strict convexity in Theorem 1 coupled with the following theorem gives a unique characterization of the isotonic partial likelihood estimator.
Moreover, |$\hat{\psi}$| is uniquely determined by (2) and (3).
2.2. Pseudo-iterative convex minorant algorithm without censoring
Choose an initial value of |$\dot{\psi}^{(0)} \in \Psi^{k}$|, or |$\dot{\psi}^{(0)} \in \Psi$|.
Update |$\dot{\psi}^{(m)}$| such that |$\dot{\psi}^{(m)}={\rm{arg}}\,{\rm{min}}_{\psi \in \Psi}\dot{\ell}^{P}(\psi\mid\nu=\dot{\psi}^{(m-1)})$|.
Repeat Step 2 until convergence under the distance stopping criterion |$d_{e}(\dot{\psi}^{(m)}, \dot{\psi}^{(m-1)})<\dot{\epsilon}$| for small |$\dot{\epsilon}>0$|, where |$d_{e}(x,y)=\sum_{i=1}^{n}|\exp(x_{i})-\exp(y_{i})|$|.
Let |$\ddot{\psi}_{i} = \dot{\psi}^{(m)}_{i} - \dot{\psi}^{(m)}_{k}$| (|$i=1,\ldots,n$|) such that |$\ddot \psi=(\ddot \psi_1,\ldots,\ddot \psi_n) \in \Psi^k$|.
The uniqueness of |$\dot{\psi}^{(m)}$| in Step 2 is provided by Theorem 3. Theorem 4 shows that the simple structure of |$\dot{\ell}^{P}(\psi\!\mid\!\nu)$| allows a closed-form solution for the unique |$\dot{\psi}^{(m)}$| satisfying Fenchel’s duality condition in (8) and (9), which can be easily computed using the pool-adjacent-violators algorithm with the logarithmic transformation.
Moreover, |$\dot{\psi}$| is uniquely determined by (8) and (9).
Suppose there is no censoring and that |$\hat{\psi}^{+}$| minimizes |$\sum_{i=1}^{n}(\psi^{+}_{i}-w_{i}^{-1}\Delta_{i})^{2}w_{i}$| over the class of monotone increasing functions in |$\Psi$|, where |$w_{i}=\int_{0}^{\infty}Y_{i}(u)\,{\rm d}\bar{N}(u)/ \sum_{j=1}^{n}Y_{j}(u)\exp(\nu_{j})$|. Then |$\dot{\psi}=(\log \hat{\psi}^{+}_{1},\ldots,\log\hat{\psi}^{+}_{n})$| is the unique minimizer of |$\dot{\ell}^{P}(\psi\mid\nu)$| over |$\Psi$|.
There is no guarantee that eventually the convergence criteria in Step 3 will be satisfied, i.e., it is possible to construct datasets and choose starting values such that the algorithm will not converge. However, in practice we have found this to be relatively rare; see § 4. Moreover, the following theorem indicates that if the algorithm does converge for all |$\dot \epsilon>0$|, then |$\ddot \psi$| converges to the unique minimizer of the constrained partial likelihood in Theorems 1 and 2 as |$\dot \epsilon \to 0$|. This provides theoretical justification for the pseudo-iterative convex minorant algorithm in the no-censoring case.
Suppose there is no censoring and that for all |$\dot{\epsilon}>0$| there exists |$r(\dot{\epsilon})$| such that the pseudo-iterative convex minorant algorithm converges at the |$r(\dot{\epsilon})$|th iteration under the distance stopping criterion |$d_{e}[\dot{\psi}^{\{r(\dot{\epsilon})\}},\dot{\psi}^{\{r(\dot{\epsilon})-1\}}]<\dot{\epsilon}$|. Suppose there exists a constant |$\kappa>0$| such that |$| \dot{\psi}^{\{r(\dot{\epsilon})\}}_{j} | < \kappa$||$(j=1,\ldots,n)$| for all |$\dot{\epsilon}>0$|. Then, as |$\dot{\epsilon} \to 0$|, |$\ddot{\psi}=[ \dot{\psi}^{\{r(\dot{\epsilon})\}}_{1} - \dot{\psi}^{\{r(\dot{\epsilon})\}}_{k},\ldots,\dot{\psi}^{\{r(\dot{\epsilon})\}}_{n} - \dot{\psi}^{\{r(\dot{\epsilon})\}}_{k}]$| converges to the unique minimizer of |$\dot{\ell}^{N}(\psi)$| over |$\Psi^{k}$|.
2.3. Censoring
Suppose that some failure times may be censored. That censored subjects contribute limited information to the partial likelihood restricts the form of the isotonic partial likelihood estimator. As stated in Proposition 1, the isotonic partial likelihood estimator has jumps only at the covariate values associated with uncensored subjects. This occurs because the form of the partial likelihood forces the isotonic estimator to be as small as possible at covariate values for uncensored subjects, while still satisfying the monotonicity constraint. This is similar to the Kaplan–Meier estimator of the survival function, which only jumps at the observed failure times (Kaplan & Meier, 1958). For covariate values less than the smallest covariate value for the uncensored subjects, the estimator equals |$-\infty$|. To estimate |$\psi(\cdot)$|, the parameter for a censored subject is replaced with the parameter for an uncensored subject having a covariate value which is closest to that for the censored subject amongst all uncensored subjects having smaller covariate values than the censored subject. Censored subjects with covariate values less than the smallest covariate value amongst all uncensored subjects are not included in the computation. This yields the isotonic estimator at covariate values larger than the smallest covariate value for uncensored subjects, with the estimator equal to |$-\infty$| below that covariate value.
The isotonic partial likelihood estimator of |$\psi(z)$| has jumps only at |$Z^{\star}_1,\ldots,Z^{\star}_{n^{\star}}$|, equals |$-\infty$| for |$z<Z_{(1)}^{\star}$|, and equals |$\hat \psi^{\star}_i$| for |$z \in I_i^{\star}$| where |$ (\hat \psi^{\star}_1,\ldots, \hat \psi^{\star}_{n^{\star}})$| is the unique maximizer of |$\ell^{C}(\psi^{\star})$| over |$\Psi^k_{n^{\star}}$|.
Since |$\ell^{C}(\psi^{\star})$| has the same form as |$\ell(\psi)$|, the iterative quadratic programming method, iterative convex minorant algorithm, and pseudo-iterative convex minorant algorithm can be used to find the unique maximizer of |$\ell^{C}(\psi^{\star})$|.
2.4. Time-dependent covariate
Consider the model |$\lambda\{t\mid Z (s), s \leq t\}=\lambda_0(t)\exp[\psi\{Z(t)\}]$|, where |$Z(t)$| is a time-dependent covariate. Assume that the monotone increasing function |$\psi(\cdot)$| does not depend on time. As with right-censored data, the values of the time-dependent covariate observed at times other than observed failure times contribute limited information to the partial likelihood, restricting the form of the isotonic partial likelihood estimator. As stated in Proposition 2, the isotonic partial likelihood estimator has jumps only at time-dependent covariate values associated with uncensored subjects at their failure times, with the estimator equal to |$-\infty$| at covariate values smaller than the smallest covariate value observed at event times for uncensored subjects. One may use the replacement parameters algorithm where the parameters for subjects having their failure times observed are substituted for other parameters in the partial likelihood. Unlike the censoring case with a time-independent covariate where parameters for censored subjects are replaced, with a time-dependent covariate both censored and uncensored subjects may have parameters replaced.
The isotonic partial likelihood estimator of |$\psi(z)$| has jumps only at |$Z^{\ast}_1,\ldots,Z^{\ast}_{n^{\star}}$|, equals |$-\infty$| for |$z<Z_{(1)}^{\ast}$|, and equals |$\hat \psi^{\ast}_i$| for |$z \in I_i^{\ast}$| where |$ (\hat \psi^{\ast}_1,\ldots, \hat \psi^{\ast}_{n^{\star}})$| is the unique maximizer of |$\ell^{D}(\psi^{\ast})$| over |$\Psi^k_{n^{\star}}$|.
Since |$\ell^{D}(\psi^{\ast})$| has the same form as |$\ell(\psi)$|, the iterative quadratic programming method, iterative convex minorant algorithm, and pseudo-iterative convex minorant algorithm may be applied to find the unique maximizer of |$\ell^{D}(\psi^{\ast})$|.
2.5. Local consistency of the pseudo partial likelihood estimator
|$\mbox{pr}\{Y(t)\}>0$| and |$E\{{\rm d}N(t)\}<\infty$| for |$t \in (0,\tau]$|;
|$p_{z}$| is positive and continuous in a neighbourhood of |$z_{0}$|;
|$\psi_0(\cdot)$| is continuous and differentiable in a neighbourhood of |$z_{0}$| with |$|\psi'_0(z_{0})| >0$|;
let |$L=\inf\{\psi_0(z) : z \in I_{z}\}$| and |$U=\sup\{\psi_0(z) : z \in I_{z}\}$|. Then |$L, U \in \Theta$| with |$-\infty < L < U < \infty$|;
|$E_{\theta_{0}}\{u({\theta_{1}}\!\mid\!\theta_{0})^{2}\}$| is uniformly bounded in a compact rectangle containing |$[L, U] \times [L, U]$| for |$\theta_{0}, \theta_{1} \in \Theta$|;
|$E_{\theta_{0}}\{u({\theta_{1}}\mid\theta_{0})\} \neq E_{\theta_{0}}\{u({\theta_{2}}\mid\theta_{0})\}$| whenever |$\theta_{1} \neq \theta_{2}$|, for |$\theta_{0}, \theta_{1}, \theta_{2} \in \Theta$|.
Assumptions 1–3 are standard in survival and isotonic regression models. By Assumption 4, |$\psi_0(\cdot)$| is bounded between |$L$| and |$U$| on |$I_{z}$|. Assumptions 5 and 6 are mild in that the expectation of |$u(\cdot)$| is a uniformly bounded and strictly increasing function of |$\theta$|. The uniform strong consistency of the isotonic estimator is proven under Assumptions 1–6, with the result stated in the following theorem. For simplicity the theorem is stated assuming no censoring, but the local consistency also holds in the presence of censoring.
Let |$\dot{\psi}^{(1)}$| be the first-step estimate from the pseudo-iterative convex minorant algorithm with |$\nu_{n,i}=\psi_{0}(Z_{i})+\epsilon_{n,i}$| where |$\epsilon_{n,i}=c_{n,i}/n$| for |$c_{n,i}$| such that |$v_{n,i}$||$(i=1,\ldots, n)$| satisfies the monotonicity constraint, |$l \leq c_{n,i} \leq u$| for fixed |$l$| and |$u$|, and |$-\infty < l \leq u < \infty $|. Let |$\dot{\psi}^{(1)}_{n}(z)=\dot{\psi}^{(1)}_{(i)}$| if |$z \in [Z_{(i)}, Z_{(i+1)})$||$(i=1,\ldots,n)$| with |$Z_{(n+1)}=\infty$|. Let |$z_{0}$| be an interior point of |$I_{z}$|. Then there exist |$\sigma_{1}$| and |$\sigma_{2}$| where |$z_{0}$| falls in the interior of |$[\sigma_{1},\sigma_{2}]$| such that |$\sup\nolimits_{z \in [\sigma_{1},\sigma_{2}]}|\dot{\psi}^{(1)}_n(z)-\psi_{0}(z)| \to 0$| almost surely as |$n \to \infty$|.
This result states that the estimator based on a one-step update is consistent if the initial value is in an |$n^{-1}$| neighbourhood of the true value. The proof, which is given in the Appendix, follows from Lemma 2.1 in the detailed version of Banerjee (2007), which can be found on his webpage with one modification, namely establishing that (A16) in the Supplementary Material converges to zero almost surely. We show that the first term in (A16) converges to zero as |$n \to \infty$| if |$\epsilon_{n,i}$| goes to zero sufficiently fast. We then show the convergence of the second term by using empirical process theory if |$u_{n}\{\psi(Z)\!\mid\!\underline{\psi}_{0}\}$| is a |$P$|-Glivenko–Cantelli function and |$P[Y(t)\exp\{\psi_{0}(Z)\}]$| is bounded away from zero, which is similar to a situation which has been studied for counting process regression (Kosorok, 2007, p. 56). This suggests that if one starts the pseudo-iterative convex minorant algorithm sufficiently close to the true parameters, then the resulting one-step estimator is consistent. This local consistency is similar to what was shown in Chapters 5.1 and 5.2 of Groeneboom & Wellner (1992), where they defined their toy estimator as one step of the iterative convex minorant algorithm starting the iteration at the true parameters. Our one-step estimator is valid under weaker conditions, in the sense that the initial value is not necessarily equal to the true value but rather in an |$n^{-1}$| neighbourhood of the truth.
3. Extensions
3.1. Baseline hazard function
3.2. Additional covariates
4. Simulations
We conducted simulation studies to examine the performance of the iterative quadratic programming method, iterative convex minorant algorithm, and pseudo-iterative convex minorant algorithm. As a gold standard, we also evaluated the pseudo partial likelihood by setting |$\nu$| to the true value in |$\dot{\ell}^{P}(\psi\!\mid\!\nu)$|. For the first part of the simulation studies, we considered a time-independent covariate |$Z$| that was generated from a uniform distribution on |$(0, 1)$|. Three monotone increasing functions on the interval |$(0, 1)$| were considered: |$\phi(Z)=Z$|, |$\phi(Z)=Z^{1/2}$| and |$\phi(Z)=Z^{2}$|. The failure time was then generated from a proportional hazards model with baseline hazard function being exponential with scale parameter |$\alpha=1$|. The same scenarios were used for the second part of the simulation study, with a time-dependent covariate |$Z(t)$|. The time-dependent covariate was piecewise constant. To construct |$Z(t)$|, we generated independent uniform |$(0, 1)$| random variables on disjoint time intervals |$(x_{j-1},x_{j}]$|, where |$x_{0}=0, x_{1}\,{=}\,0 {\cdot}22, x_{2}\,{=}\,0{\cdot}44, \ldots, x_{9}\,{=}\,2, x_{10}\,{=}\,{+}\,\infty$|. Simulations were conducted with no censoring and then repeated with censoring where the censoring times were independently generated from a uniform distribution, giving 30% censoring. We repeated the simulations 500 times with sample sizes 100, 500 and 1000. We set stopping values of |$\epsilon$| and |$\dot{\epsilon}$| to |$10^{-3}$| and |$10^{-5}$|, respectively. Two anchor points were considered, |$K\,{=}\,0{\cdot}5$| and |$K\,{=}\,0$|. For each dataset an initial value of |$\psi^{(0)}_{i}$|, for |$i\,{=}\,1,\ldots,n$|, was set to |$|\hat{\gamma}|\bar{Z}_{i}$|, where |$\bar{Z}_{i}\,{=}\,Z_{(i)}\,{-}\,Z_{(k)}$| and |$\hat{\gamma}$| was the estimated coefficient of |$\bar{Z}_{i}$| from the standard Cox model.
The effect of choosing |$K=0.5$| versus |$K=0$| was evaluated for the iterative quadratic programming method and iterative convex minorant algorithm by comparing the two isotonic estimates. We define the percentage of matches as |${\rm MC}=\sum_{r=1}^{500}{\rm MC}_{r}/500$|, where |${\rm MC}_{r}=1$| if |$\max_{i \in \{1,\ldots,n\}}|\hat{\psi}^{1}_{r}(Z_{i}) - \{\hat{\psi}^{2}_{r}(Z_{i})-\hat{\psi}^{2}_{r}(0\cdot5)\}|< 0\cdot001$|, and |${\rm MC}_{r}=0$| otherwise. Here, |$\hat{\psi}^{k}_{r}(\cdot)$| is an estimated monotone increasing function for the |$r$|th dataset for |$r=1,\ldots,500$|, where |$k=1$| for |$K=0\cdot5$| and |$k=2$| for |$K=0$|. For the pseudo-iterative convex minorant algorithm, by construction, the estimates are the same for all anchor constraints. To evaluate the performance of the different algorithms, we computed the integrated mean squared error |$\int_{0}^{1} E\{\psi^{k}(z)-\hat{\psi}^{k}(z)\}^2\,{\rm d}z$| for |$k=1, 2$|, where |$\psi^{k}(z)=\phi(z)-\phi(K)$|. Based on equally spaced grid points of |$z_{g}$| between |$\max\{0\cdot001,Z^{\star}_{(1)}\}$| and 0|$\cdot$|999, the integrated mean squared error was approximated by |$\sum_{r=1}^{R}\sum_{g=1}^{G} \{\psi^{k}(z_{g}) - \hat{\psi}^{k}_{r}(z_{g})\}^{2}/(GR)$|, with |$G=1000$| grid points and |$R=500$| simulation runs. We also computed the percentage of simulations where convergence was achieved according to Fenchel’s duality condition in Theorem 2.
Table 1 shows the simulation results for time-independent covariates. Similar results for time-dependent covariates are reported in the Supplementary Material. Nonconvergent cases were excluded when calculating the integrated mean squared error, the matched case percentage, and the computing time. The pseudo-iterative convex minorant algorithm has good convergence results in agreement with Theorem 5. In addition, the pseudo-iterative convex minorant algorithm dramatically improves the computational speed, especially with large sample sizes and with time-dependent covariates. As the iterative quadratic programming method needs to calculate the inverse of the full Hessian matrix, computational time increases cubically with the number of observed failure events. Both the iterative quadratic programming method and the iterative convex minorant algorithm fail to converge in roughly 10% of the datasets. Convergence failures of the iterative quadratic programming method were due to computationally singular Hessian matrices during an iteration of the algorithm. The results for the iterative convex minorant algorithm depend heavily on the anchor constraint. In particular, the iterative convex minorant algorithm is extremely slow when the anchor is set to zero. The results for the pseudo-iterative convex minorant algorithm, as well as for the iterative quadratic programming method, do not depend on the anchor constraint. For small sample sizes, the iterative quadratic programming method and iterative convex minorant algorithm have larger integrated mean squared errors than the pseudo-iterative convex minorant algorithm, but the differences vanish as the sample size increases. As expected, the pseudo partial likelihood with known |$\nu$| has the smallest integrated mean squared error.
Simulation results for time-independent covariates: integrated mean squared error multiplied by |$10^{3}$| (and in brackets the median CPU time in seconds), convergence percentage and matched case percentage. The first and second lines for each |$\phi$| and |$n$| correspond to anchor points |$K=0{\cdot}5$| and |$K=0$| respectively
| Cens . | |$\phi(Z)$| . | |$n$| . | IQM . | . | . | ICM . | . | . | PICM . | . | . | PPL . |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| . | . | . | IMSE . | Conv . | MC . | IMSE . | Conv . | MC . | IMSE . | Conv . | MC . | IMSE . |
| 0 | |$Z$| | 100 | 213(2) | 88 | 100 | 747(1) | 91 | 75 | 152(0) | 95 | 100 | 81(0) |
| 213(1) | 88 | 113(3) | 86 | 152(0) | 95 | 81(0) | ||||||
| 500 | 31(134) | 89 | 100 | 63(15) | 89 | 81 | 31(0) | 99 | 100 | 20(0) | ||
| 31(43) | 89 | 28(140) | 87 | 31(0) | 99 | 20(0) | ||||||
| 1000 | 17(827) | 89 | 100 | 26(65) | 90 | 87 | 18(2) | 99 | 100 | 17(1) | ||
| 17(505) | 89 | 17(1279) | 87 | 18(1) | 99 | 17(1) | ||||||
| |$Z^{1/2}$| | 100 | 267(1) | 89 | 100 | 1008(0) | 91 | 82 | 129(0) | 93 | 100 | 80(0) | |
| 267(1) | 89 | 100(3) | 84 | 129(0) | 93 | 80(0) | ||||||
| 500 | 31(145) | 89 | 100 | 91(11) | 89 | 90 | 28(0) | 98 | 100 | 18(0) | ||
| 31(44) | 89 | 25(105) | 87 | 28(0) | 98 | 18(0) | ||||||
| 1000 | 16(1078) | 90 | 100 | 39(54) | 90 | 95 | 15(1) | 99 | 100 | 14(1) | ||
| 16(684) | 90 | 15(1032) | 89 | 15(1) | 99 | 14(1) | ||||||
| |$Z^{2}$| | 100 | 170(1) | 87 | 100 | 329(0) | 91 | 61 | 161(0) | 97 | 100 | 63(0) | |
| 170(1) | 87 | 112(3) | 87 | 161(0) | 97 | 63(0) | ||||||
| 500 | 30(145) | 88 | 100 | 50(14) | 89 | 63 | 31(0) | 99 | 100 | 18(0) | ||
| 30(45) | 88 | 28(315) | 84 | 31(0) | 99 | 18(0) | ||||||
| 1000 | 16(1190) | 88 | 100 | 23(61) | 89 | 66 | 18(1) | 100 | 100 | 17(1) | ||
| 16(573) | 88 | 17(1932) | 78 | 18(1) | 100 | 17(1) | ||||||
| 30 | |$Z$| | 100 | 252(1) | 89 | 100 | 410(0) | 91 | 68 | 165(0) | 99 | 100 | 28(0) |
| 252(0) | 89 | 126(1) | 90 | 165(0) | 99 | 28(0) | ||||||
| 500 | 36(48) | 89 | 100 | 36(8) | 89 | 80 | 39(0) | 100 | 100 | 28(0) | ||
| 36(18) | 89 | 36(118) | 88 | 39(0) | 100 | 28(0) | ||||||
| 1000 | 21(290) | 90 | 100 | 21(39) | 91 | 84 | 22(1) | 100 | 100 | 10(1) | ||
| 21(172) | 90 | 21(561) | 88 | 22(1) | 100 | 10(1) | ||||||
| |$Z^{1/2}$| | 100 | 203(1) | 89 | 100 | 316(0) | 91 | 78 | 141(0) | 98 | 100 | 32(0) | |
| 250(0) | 89 | 113(2) | 89 | 141(0) | 98 | 32(0) | ||||||
| 500 | 32(39) | 89 | 100 | 35(6) | 90 | 90 | 35(0) | 100 | 100 | 27(0) | ||
| 32(19) | 89 | 32(74) | 89 | 35(0) | 100 | 27(0) | ||||||
| 1000 | 18(328) | 90 | 100 | 19(31) | 92 | 95 | 19(1) | 100 | 100 | 10(1) | ||
| 18(238) | 90 | 19(727) | 91 | 19(1) | 100 | 10(1) | ||||||
| |$Z^{2}$| | 100 | 224(1) | 88 | 100 | 264(0) | 91 | 55 | 174(0) | 99 | 100 | 35(0) | |
| 224(0) | 88 | 125(2) | 89 | 174(0) | 99 | 35(0) | ||||||
| 500 | 35(39) | 88 | 100 | 35(6) | 89 | 60 | 38(0) | 100 | 100 | 31(0) | ||
| 35(20) | 88 | 35(118) | 84 | 38(0) | 100 | 31(0) | ||||||
| 1000 | 20(302) | 88 | 100 | 21(32) | 90 | 63 | 21(1) | 100 | 100 | 10(1) | ||
| 20(237) | 88 | 21(905) | 82 | 21(1) | 100 | 10(1) |
| Cens . | |$\phi(Z)$| . | |$n$| . | IQM . | . | . | ICM . | . | . | PICM . | . | . | PPL . |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| . | . | . | IMSE . | Conv . | MC . | IMSE . | Conv . | MC . | IMSE . | Conv . | MC . | IMSE . |
| 0 | |$Z$| | 100 | 213(2) | 88 | 100 | 747(1) | 91 | 75 | 152(0) | 95 | 100 | 81(0) |
| 213(1) | 88 | 113(3) | 86 | 152(0) | 95 | 81(0) | ||||||
| 500 | 31(134) | 89 | 100 | 63(15) | 89 | 81 | 31(0) | 99 | 100 | 20(0) | ||
| 31(43) | 89 | 28(140) | 87 | 31(0) | 99 | 20(0) | ||||||
| 1000 | 17(827) | 89 | 100 | 26(65) | 90 | 87 | 18(2) | 99 | 100 | 17(1) | ||
| 17(505) | 89 | 17(1279) | 87 | 18(1) | 99 | 17(1) | ||||||
| |$Z^{1/2}$| | 100 | 267(1) | 89 | 100 | 1008(0) | 91 | 82 | 129(0) | 93 | 100 | 80(0) | |
| 267(1) | 89 | 100(3) | 84 | 129(0) | 93 | 80(0) | ||||||
| 500 | 31(145) | 89 | 100 | 91(11) | 89 | 90 | 28(0) | 98 | 100 | 18(0) | ||
| 31(44) | 89 | 25(105) | 87 | 28(0) | 98 | 18(0) | ||||||
| 1000 | 16(1078) | 90 | 100 | 39(54) | 90 | 95 | 15(1) | 99 | 100 | 14(1) | ||
| 16(684) | 90 | 15(1032) | 89 | 15(1) | 99 | 14(1) | ||||||
| |$Z^{2}$| | 100 | 170(1) | 87 | 100 | 329(0) | 91 | 61 | 161(0) | 97 | 100 | 63(0) | |
| 170(1) | 87 | 112(3) | 87 | 161(0) | 97 | 63(0) | ||||||
| 500 | 30(145) | 88 | 100 | 50(14) | 89 | 63 | 31(0) | 99 | 100 | 18(0) | ||
| 30(45) | 88 | 28(315) | 84 | 31(0) | 99 | 18(0) | ||||||
| 1000 | 16(1190) | 88 | 100 | 23(61) | 89 | 66 | 18(1) | 100 | 100 | 17(1) | ||
| 16(573) | 88 | 17(1932) | 78 | 18(1) | 100 | 17(1) | ||||||
| 30 | |$Z$| | 100 | 252(1) | 89 | 100 | 410(0) | 91 | 68 | 165(0) | 99 | 100 | 28(0) |
| 252(0) | 89 | 126(1) | 90 | 165(0) | 99 | 28(0) | ||||||
| 500 | 36(48) | 89 | 100 | 36(8) | 89 | 80 | 39(0) | 100 | 100 | 28(0) | ||
| 36(18) | 89 | 36(118) | 88 | 39(0) | 100 | 28(0) | ||||||
| 1000 | 21(290) | 90 | 100 | 21(39) | 91 | 84 | 22(1) | 100 | 100 | 10(1) | ||
| 21(172) | 90 | 21(561) | 88 | 22(1) | 100 | 10(1) | ||||||
| |$Z^{1/2}$| | 100 | 203(1) | 89 | 100 | 316(0) | 91 | 78 | 141(0) | 98 | 100 | 32(0) | |
| 250(0) | 89 | 113(2) | 89 | 141(0) | 98 | 32(0) | ||||||
| 500 | 32(39) | 89 | 100 | 35(6) | 90 | 90 | 35(0) | 100 | 100 | 27(0) | ||
| 32(19) | 89 | 32(74) | 89 | 35(0) | 100 | 27(0) | ||||||
| 1000 | 18(328) | 90 | 100 | 19(31) | 92 | 95 | 19(1) | 100 | 100 | 10(1) | ||
| 18(238) | 90 | 19(727) | 91 | 19(1) | 100 | 10(1) | ||||||
| |$Z^{2}$| | 100 | 224(1) | 88 | 100 | 264(0) | 91 | 55 | 174(0) | 99 | 100 | 35(0) | |
| 224(0) | 88 | 125(2) | 89 | 174(0) | 99 | 35(0) | ||||||
| 500 | 35(39) | 88 | 100 | 35(6) | 89 | 60 | 38(0) | 100 | 100 | 31(0) | ||
| 35(20) | 88 | 35(118) | 84 | 38(0) | 100 | 31(0) | ||||||
| 1000 | 20(302) | 88 | 100 | 21(32) | 90 | 63 | 21(1) | 100 | 100 | 10(1) | ||
| 20(237) | 88 | 21(905) | 82 | 21(1) | 100 | 10(1) |
Cens, censoring percentage; IQM, iterative quadratic programming; ICM, iterative convex minorant algorithm; PICM, pseudo-iterative convex minorant algorithm; PPL, pseudo partial likelihood; IMSE, integrated mean squared error; Conv, convergence percentage; MC, matched case percentage.
Simulation results for time-independent covariates: integrated mean squared error multiplied by |$10^{3}$| (and in brackets the median CPU time in seconds), convergence percentage and matched case percentage. The first and second lines for each |$\phi$| and |$n$| correspond to anchor points |$K=0{\cdot}5$| and |$K=0$| respectively
| Cens . | |$\phi(Z)$| . | |$n$| . | IQM . | . | . | ICM . | . | . | PICM . | . | . | PPL . |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| . | . | . | IMSE . | Conv . | MC . | IMSE . | Conv . | MC . | IMSE . | Conv . | MC . | IMSE . |
| 0 | |$Z$| | 100 | 213(2) | 88 | 100 | 747(1) | 91 | 75 | 152(0) | 95 | 100 | 81(0) |
| 213(1) | 88 | 113(3) | 86 | 152(0) | 95 | 81(0) | ||||||
| 500 | 31(134) | 89 | 100 | 63(15) | 89 | 81 | 31(0) | 99 | 100 | 20(0) | ||
| 31(43) | 89 | 28(140) | 87 | 31(0) | 99 | 20(0) | ||||||
| 1000 | 17(827) | 89 | 100 | 26(65) | 90 | 87 | 18(2) | 99 | 100 | 17(1) | ||
| 17(505) | 89 | 17(1279) | 87 | 18(1) | 99 | 17(1) | ||||||
| |$Z^{1/2}$| | 100 | 267(1) | 89 | 100 | 1008(0) | 91 | 82 | 129(0) | 93 | 100 | 80(0) | |
| 267(1) | 89 | 100(3) | 84 | 129(0) | 93 | 80(0) | ||||||
| 500 | 31(145) | 89 | 100 | 91(11) | 89 | 90 | 28(0) | 98 | 100 | 18(0) | ||
| 31(44) | 89 | 25(105) | 87 | 28(0) | 98 | 18(0) | ||||||
| 1000 | 16(1078) | 90 | 100 | 39(54) | 90 | 95 | 15(1) | 99 | 100 | 14(1) | ||
| 16(684) | 90 | 15(1032) | 89 | 15(1) | 99 | 14(1) | ||||||
| |$Z^{2}$| | 100 | 170(1) | 87 | 100 | 329(0) | 91 | 61 | 161(0) | 97 | 100 | 63(0) | |
| 170(1) | 87 | 112(3) | 87 | 161(0) | 97 | 63(0) | ||||||
| 500 | 30(145) | 88 | 100 | 50(14) | 89 | 63 | 31(0) | 99 | 100 | 18(0) | ||
| 30(45) | 88 | 28(315) | 84 | 31(0) | 99 | 18(0) | ||||||
| 1000 | 16(1190) | 88 | 100 | 23(61) | 89 | 66 | 18(1) | 100 | 100 | 17(1) | ||
| 16(573) | 88 | 17(1932) | 78 | 18(1) | 100 | 17(1) | ||||||
| 30 | |$Z$| | 100 | 252(1) | 89 | 100 | 410(0) | 91 | 68 | 165(0) | 99 | 100 | 28(0) |
| 252(0) | 89 | 126(1) | 90 | 165(0) | 99 | 28(0) | ||||||
| 500 | 36(48) | 89 | 100 | 36(8) | 89 | 80 | 39(0) | 100 | 100 | 28(0) | ||
| 36(18) | 89 | 36(118) | 88 | 39(0) | 100 | 28(0) | ||||||
| 1000 | 21(290) | 90 | 100 | 21(39) | 91 | 84 | 22(1) | 100 | 100 | 10(1) | ||
| 21(172) | 90 | 21(561) | 88 | 22(1) | 100 | 10(1) | ||||||
| |$Z^{1/2}$| | 100 | 203(1) | 89 | 100 | 316(0) | 91 | 78 | 141(0) | 98 | 100 | 32(0) | |
| 250(0) | 89 | 113(2) | 89 | 141(0) | 98 | 32(0) | ||||||
| 500 | 32(39) | 89 | 100 | 35(6) | 90 | 90 | 35(0) | 100 | 100 | 27(0) | ||
| 32(19) | 89 | 32(74) | 89 | 35(0) | 100 | 27(0) | ||||||
| 1000 | 18(328) | 90 | 100 | 19(31) | 92 | 95 | 19(1) | 100 | 100 | 10(1) | ||
| 18(238) | 90 | 19(727) | 91 | 19(1) | 100 | 10(1) | ||||||
| |$Z^{2}$| | 100 | 224(1) | 88 | 100 | 264(0) | 91 | 55 | 174(0) | 99 | 100 | 35(0) | |
| 224(0) | 88 | 125(2) | 89 | 174(0) | 99 | 35(0) | ||||||
| 500 | 35(39) | 88 | 100 | 35(6) | 89 | 60 | 38(0) | 100 | 100 | 31(0) | ||
| 35(20) | 88 | 35(118) | 84 | 38(0) | 100 | 31(0) | ||||||
| 1000 | 20(302) | 88 | 100 | 21(32) | 90 | 63 | 21(1) | 100 | 100 | 10(1) | ||
| 20(237) | 88 | 21(905) | 82 | 21(1) | 100 | 10(1) |
| Cens . | |$\phi(Z)$| . | |$n$| . | IQM . | . | . | ICM . | . | . | PICM . | . | . | PPL . |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| . | . | . | IMSE . | Conv . | MC . | IMSE . | Conv . | MC . | IMSE . | Conv . | MC . | IMSE . |
| 0 | |$Z$| | 100 | 213(2) | 88 | 100 | 747(1) | 91 | 75 | 152(0) | 95 | 100 | 81(0) |
| 213(1) | 88 | 113(3) | 86 | 152(0) | 95 | 81(0) | ||||||
| 500 | 31(134) | 89 | 100 | 63(15) | 89 | 81 | 31(0) | 99 | 100 | 20(0) | ||
| 31(43) | 89 | 28(140) | 87 | 31(0) | 99 | 20(0) | ||||||
| 1000 | 17(827) | 89 | 100 | 26(65) | 90 | 87 | 18(2) | 99 | 100 | 17(1) | ||
| 17(505) | 89 | 17(1279) | 87 | 18(1) | 99 | 17(1) | ||||||
| |$Z^{1/2}$| | 100 | 267(1) | 89 | 100 | 1008(0) | 91 | 82 | 129(0) | 93 | 100 | 80(0) | |
| 267(1) | 89 | 100(3) | 84 | 129(0) | 93 | 80(0) | ||||||
| 500 | 31(145) | 89 | 100 | 91(11) | 89 | 90 | 28(0) | 98 | 100 | 18(0) | ||
| 31(44) | 89 | 25(105) | 87 | 28(0) | 98 | 18(0) | ||||||
| 1000 | 16(1078) | 90 | 100 | 39(54) | 90 | 95 | 15(1) | 99 | 100 | 14(1) | ||
| 16(684) | 90 | 15(1032) | 89 | 15(1) | 99 | 14(1) | ||||||
| |$Z^{2}$| | 100 | 170(1) | 87 | 100 | 329(0) | 91 | 61 | 161(0) | 97 | 100 | 63(0) | |
| 170(1) | 87 | 112(3) | 87 | 161(0) | 97 | 63(0) | ||||||
| 500 | 30(145) | 88 | 100 | 50(14) | 89 | 63 | 31(0) | 99 | 100 | 18(0) | ||
| 30(45) | 88 | 28(315) | 84 | 31(0) | 99 | 18(0) | ||||||
| 1000 | 16(1190) | 88 | 100 | 23(61) | 89 | 66 | 18(1) | 100 | 100 | 17(1) | ||
| 16(573) | 88 | 17(1932) | 78 | 18(1) | 100 | 17(1) | ||||||
| 30 | |$Z$| | 100 | 252(1) | 89 | 100 | 410(0) | 91 | 68 | 165(0) | 99 | 100 | 28(0) |
| 252(0) | 89 | 126(1) | 90 | 165(0) | 99 | 28(0) | ||||||
| 500 | 36(48) | 89 | 100 | 36(8) | 89 | 80 | 39(0) | 100 | 100 | 28(0) | ||
| 36(18) | 89 | 36(118) | 88 | 39(0) | 100 | 28(0) | ||||||
| 1000 | 21(290) | 90 | 100 | 21(39) | 91 | 84 | 22(1) | 100 | 100 | 10(1) | ||
| 21(172) | 90 | 21(561) | 88 | 22(1) | 100 | 10(1) | ||||||
| |$Z^{1/2}$| | 100 | 203(1) | 89 | 100 | 316(0) | 91 | 78 | 141(0) | 98 | 100 | 32(0) | |
| 250(0) | 89 | 113(2) | 89 | 141(0) | 98 | 32(0) | ||||||
| 500 | 32(39) | 89 | 100 | 35(6) | 90 | 90 | 35(0) | 100 | 100 | 27(0) | ||
| 32(19) | 89 | 32(74) | 89 | 35(0) | 100 | 27(0) | ||||||
| 1000 | 18(328) | 90 | 100 | 19(31) | 92 | 95 | 19(1) | 100 | 100 | 10(1) | ||
| 18(238) | 90 | 19(727) | 91 | 19(1) | 100 | 10(1) | ||||||
| |$Z^{2}$| | 100 | 224(1) | 88 | 100 | 264(0) | 91 | 55 | 174(0) | 99 | 100 | 35(0) | |
| 224(0) | 88 | 125(2) | 89 | 174(0) | 99 | 35(0) | ||||||
| 500 | 35(39) | 88 | 100 | 35(6) | 89 | 60 | 38(0) | 100 | 100 | 31(0) | ||
| 35(20) | 88 | 35(118) | 84 | 38(0) | 100 | 31(0) | ||||||
| 1000 | 20(302) | 88 | 100 | 21(32) | 90 | 63 | 21(1) | 100 | 100 | 10(1) | ||
| 20(237) | 88 | 21(905) | 82 | 21(1) | 100 | 10(1) |
Cens, censoring percentage; IQM, iterative quadratic programming; ICM, iterative convex minorant algorithm; PICM, pseudo-iterative convex minorant algorithm; PPL, pseudo partial likelihood; IMSE, integrated mean squared error; Conv, convergence percentage; MC, matched case percentage.
5. HIV data
The Breastfeeding, Antiretroviral and Nutrition study was a randomized trial conducted in Liongwe, Malawi (Jamieson et al., 2012). A total of 2369 pairs of HIV-infected breastfeeding mothers and their uninfected infants were randomized to one of three groups: 849 to a maternal antiretroviral regimen, 852 to daily infant nevirapine, and 668 to standard of care as the control. A primary endpoint of the trial was HIV transmission to the infant. Infants were scheduled to be tested for HIV every few weeks up to 48 weeks after birth. By 48 weeks there were 76, 62, and 74 infants observed to be HIV-infected in the maternal antiretroviral, nevirapine, and control arms, respectively. The Breastfeeding, Antiretroviral and Nutrition study measured mothers’ CD4 count in cells per |$\mbox{mm}^3$| at baseline, which has been shown to be an important predictor of mother-to-child transmission. Lower CD4 counts are indicative of a weakened immune system and typically are associated with higher levels of virus in HIV-infected individuals, so it is reasonable to assume that the hazard of transmission of HIV from mother to infant decreases monotonically as a function of CD4 count. A standard Cox model with CD4 included in the linear predictor showed a decreasing hazard in the CD4 count, having estimated hazard ratio of 0|$\cdot$|864 for a 100-unit increase in CD4, with |$P<0\cdot01$|, adjusted for the group effect, an estimated hazard ratio of 0|$\cdot$|769 for antiretroviral versus control, and an estimated hazard ratio of 0|$\cdot$|620 for nevirapine versus control. The extent to which the effect of CD4 is adequately captured by a simple proportional hazards model is unclear.
We fit the isotonic model assuming only a monotone relationship between the hazard and CD4 count. The model was fitted using the pseudo-iterative convex minorant algorithm in § 3.2 with anchor constraint |$K=500$|. Choosing |$K=200$|, |$K=300$|, or |$K=1000$| yielded the same results, except that the iterative convex minorant algorithm did not converge when |$K=200$|. Standard Cox models were also fitted using polynomials of order one, two, and three for CD4. Treatment group indicators were included in all models using a linear term as in (10).
Figure 1 displays the hazard ratios based on the estimated isotonic and polynomial functions. The isotonic partial likelihood estimator does not have jumps between the CD4 counts of 1100 and 2000 because the corresponding 23 infants are all censored. Historically, individuals with HIV have been started on antiretroviral treatment when their CD4 count dipped below some cut-off between 200 and 500 cells per |$\mbox{mm}^3$| because this range would correspond to higher viral load and greater infectiousness. The isotonic estimator provides a clear picture of this phenomenon, with a rapid decrease in risk occurring up to a CD4 count of 500 followed by a gradual levelling off for larger counts. After adjusting for the monotone effect on CD4 count, the estimated hazard ratio is 0|$\cdot$|755 for antiretroviral versus control, and 0|$\cdot$|618 for nevirapine versus control, similar to the standard proportional hazards analysis.
The Breastfeeding, Antiretroviral and Nutrition study: estimated hazard ratios based on the isotonic partial likelihood estimator (black solid) and standard proportional hazards models with polynomials of order one (grey solid), two (dashed) and three (dot-dashed). The reference group corresponds to CD4 count equal to 500. The circles indicate HIV infections.
The Breastfeeding, Antiretroviral and Nutrition study: estimated hazard ratios based on the isotonic partial likelihood estimator (black solid) and standard proportional hazards models with polynomials of order one (grey solid), two (dashed) and three (dot-dashed). The reference group corresponds to CD4 count equal to 500. The circles indicate HIV infections.
Apart from the cubic model, the polynomial models do not fit well compared to the isotonic estimator over this range. This is further supported by the goodness-of-fit statistic calculated by stratifying individuals by CD4 quantiles (Parzen & Lipsitz, 1999), where the goodness-of-fit statistic has an asymptotic chi-squared distribution with three degrees of freedom when the model is correctly specified. The cubic polynomial and isotonic models have smaller goodness-of-fit statistics, 0|$\cdot$|1 for cubic and 0|$\cdot$|4 for isotonic, than the simpler models do, 4|$\cdot$|5 for linear and 2|$\cdot$|3 for quadratic, where smaller goodness-of-fit statistics indicate a better model fit, with all |$p$|-values greater than |$0.2$|. The cubic term is not significant, with |$p=0\cdot11$|, and the estimated hazard ratio inexplicably increases between 500 and 1000; for instance, an infant whose mother has a CD4 count of 900 has a |$1\cdot108$| times higher risk of HIV infection than an infant whose mother has a CD4 count of 600. This demonstrates that simple parametric models may not adequately capture the nonlinear effect of CD4 count on mother-to-child HIV transmission. An alternative to using low-dimensional polynomials could entail higher-dimensional parametric models, e.g., splines. Both approaches would require adding a monotonicity constraint to preclude results like those for the cubic polynomial model here, which is not covered by existing results and would require further research.
6. Discussion
It remains to formally establish the global consistency and asymptotic distribution of the isotonic partial likelihood estimator. It is somewhat unclear how to apply earlier theoretical developments for likelihood-based analyses of isotonic regression models, as the log partial likelihood is not a sum of independent terms because the partial likelihood has a nonseparable structure at each failure time in terms of the |$T_{(i)}$| and |$Z_{(i)}$|. This problem is made more challenging because each term in the partial likelihood involves multiple parameters, with the number of parameters increasing as the sample size increases. This differs from the usual set-up, where each term involves a single parameter, or possibly a small fixed number of parameters. Regarding the asymptotic distribution, a natural conjecture which follows previous work on isotonic estimation is that the isotonic partial likelihood estimator has a |$n^{1/3}$| rate of convergence and that |$n^{1/3}\{\hat{\psi}_{n}(z_{0})-\psi(z_{0})\}$| converges to |$C(z_{0})\mathbb{Z}$|, where |$C(z_{0})$| is constant depending on |$z_{0}$| and |$\mathbb{Z}$| is a Chernoff distribution random variable. Figure 2 shows quantile-quantile plots of the sample quantiles versus theoretical quantiles from Chernoff’s distribution (Groeneboom & Wellner, 2001). The linearity of the plots as the sample size increases supports this conjecture. Future work is needed to rigorously derive the asymptotic properties of the isotonic estimators proposed in this paper. The proposed algorithm is generalizable to other settings involving isotonic regression and may potentially yield computational gains over standard algorithms similar to those observed with the proportional hazards model. This merits further investigation.
Quantile-quantile plots of simulated sample versus Chernoff’s distribution at |$z_{0}=0\cdot1$| with no censoring and |$\psi(z)=z$|: (a) |$n=100$|; (b) |$n=500$|; (c) |$n=1000$|.
Quantile-quantile plots of simulated sample versus Chernoff’s distribution at |$z_{0}=0\cdot1$| with no censoring and |$\psi(z)=z$|: (a) |$n=100$|; (b) |$n=500$|; (c) |$n=1000$|.
Acknowledgement
This project was supported by the Center for Drug Evaluation and Research administered by the Oak Ridge Institute for Science and Education through an agreement between the U.S. Department of Energy and the U.S. Food and Drug Administration, the University of North Carolina Center for AIDS Research and the U.S. Centers for Disease Control and Prevention and the U.S. National Institutes of Health. The content is solely the responsibility of the authors and does not represent the official views of any of the organizations above. The authors thank the Breastfeeding, Antiretroviral and Nutrition investigators for access to the data.
Supplementary material
Supplementary material available at Biometrika online includes the proofs of Theorems 1–6 and Propositions 1 and 2, as well as simulation results with time-dependent covariates.
References


