Abstract

Frailty models are used in survival analysis to model unobserved heterogeneity. They accommodate such heterogeneity by the inclusion of a random term, the frailty, which is assumed to multiply the hazard of a subject (individual frailty) or the hazards of all subjects in a cluster (shared frailty). Typically, the frailty term is assumed to be constant over time. This is a restrictive assumption and extensions to allow for time-varying or dynamic frailties are of interest. In this paper, we extend the auto-correlated frailty models of Henderson and Shimakura and of Fiocco, Putter and van Houwelingen, developed for longitudinal count data and discrete survival data, to continuous survival data. We present a rigorous construction of the frailty processes in continuous time based on compound birth–death processes. When the frailty processes are used as mixtures in models for survival data, we derive the marginal hazards and survival functions and the marginal bivariate survival functions and cross-ratio function. We derive distributional properties of the processes, conditional on observed data, and show how to obtain the maximum likelihood estimators of the parameters of the model using a (stochastic) expectation-maximization algorithm. The methods are applied to a publicly available data set.

1. Introduction

In many fields of application of event history analysis, such as medicine and life sciences, demography, social sciences and economy, models for survival data have been developed that explicitly account for unobserved heterogeneity. The seminal paper by Vaupel and others (1979) coined the term “frailty” for a model where the heterogeneity was introduced by a random multiplication factor.

The original formulation of the frailty model by Vaupel and others (1979) specified the hazard rate of a subject as |$h(t \,|\, Z) = Z h_{\rm c}(t)$|⁠, where |$h_{\rm c}(t)$| is the conditional hazard of a subject with frailty value 1, and |$Z$| is a random multiplicative factor, the frailty, which was assumed to follow a gamma distribution. Since part of the frailty can be absorbed in |$h_{\rm c}$|⁠, for identifiability it is typically assumed that the expectation of |$Z$| equals 1. The gamma distribution is mathematically convenient, because the posterior distribution of |$Z$| given survival data will again follow a gamma distribution. As a biological model, the validity of the sgamma frailty model has been questioned, because it implies that the cross-ratio function (Clayton, 1978; Oakes, 1989) and the heterogeneity of survivors (Hougaard, 2000; Farrington and others, 2012) are constant over time, but as a simplified model it has its merits. There is, however, room for extension to more realistic frailty models, for instance, to more general families of frailty distributions (Hougaard, 1986, 2000), correlated frailties (Yashin and others, 1995; Wienke, 2011), or time-varying or dynamic frailties.

Earlier attempts to develop dynamic frailty models include hazard models based on diffusion processes (Woodbury and Manton, 1977; Yashin and Manton, 1997; Aalen and Gjessing, 2004), and on Lévy-type processes (Gjessing and others, 2003). Paik and others (1994) and Wintrebert and others (2004) developed models based on piecewise constant frailties. McGilchrist and Yau (1996) and Yau and McGilchrist (1998) developed an auto-regressive log-normal frailty model and Henderson and Shimakura (2003) and Fiocco and others (2009) an auto-correlated gamma frailty model. A recent paper of Unkel and others (2014) considers time-varying frailty models, proposed in Farrington and others (2012), in the context of paired serological survey data.

In the present paper, we develop an extension of the auto-correlated gamma frailty model of Fiocco and others (2009) to continuous time survival data. A rigorous construction of the underlying frailty process is provided that at the same time generalizes both the correlation structure of the frailty process over time and the class of underlying frailty distributions. The construction is related to that of Gjessing and others (2003), the essential difference being that the Lévy-type processes are constructed as birth–death processes in two dimensions rather than in one dimension. Section 2 outlines the construction of the processes and gives properties of the frailty process. Section 3 derives properties of survival models using such frailty processes as underlying random effects. Results for univariate and bivariate survival data are derived and it is shown how the distribution of the frailty process and its underlying building blocks are affected by event history. Section 4 discusses estimation methods for the underlying parameters of the frailty process and the conditional survival model. We conclude with a simulation study in Section 5, an application in Section 6, and with a discussion in Section 7.

2. Construction and properties

2.1. Compound birth–death processes

2.1.1. Construction

The idea underlying the construction of the frailty process is to view the time-dependent frailty, |$Z(t)$|⁠, as being composed of many independent frailty components |$X(u,v)$|⁠, each contributing only if |$u \leq t$| and |$v \geq t$|⁠. In the spirit of birth–death processes, we think of the |$X(u,v)$|'s as “being born” at |$u \leq t$| and “dying” at |$v \geq t$|⁠. To make these ideas concrete, let |$\mathbf {N} = N(u,v)$| be a Poisson (point) process on the half plane  
\[A = \{(u,v);\ -\infty \leq u \leq v \leq \infty \},\]
with intensity measure |$\mu (u,v)$| satisfying  
\[\nu (t) = \int _{A(t)} \mu (u,v)\,{\rm d}u \,{\rm d}v \lt \infty ,\quad A(t) = \{(u,v) \in A; u \leq t \leq v\},\]
for all |$t$|⁠. Let |$X(u,v)$| be independent positive random variables (marks) defined in the “jumps” of |$N(u,v)$|⁠, whose distribution may depend on |$(u,v)$|⁠, but not on the other points of the Poisson process |$\mathbf {N}$|⁠. Denote the density of |$X(u,v)$| by |$f(x;u,v)$|⁠. Then by the Marking theorem (Kingman, 1993, Section 5.2) |$\mathbf {N}^* = (\mathbf {N},X)$| is again a Poisson process on |$A \times \mathbb {R}^{+ }$| with intensity measure |$\mu ^* $| given by |$\mu ^* (C) = \int _{(u,v,x) \in C} \mu (u,v) f(x;u,v) \,{\rm d}u \,{\rm d}v \,{\rm d}x$|⁠. By Campbell's Theorem (Kingman, 1993, Equation (5.10)), it follows that for every set |$\tilde {A} \subset A$| 
\[\begin {array}{rl} \mathbf {E} \,{\rm e}^{-c \int _{\tilde {A}} X(u,v) \,{\rm d}N(u,v)} &= \exp \left \{- \int _{\tilde {A}} \mu (u,v) \int _0^\infty (1 - {\rm e}^{-cx}) f(x;u,v) \,{\rm d}x \,{\rm d}u \,{\rm d}v \right \}\\ &= \exp \left \{- \int _{\tilde {A}} \mu (u,v) (1-\mathcal {L}_{X(u,v)}(c)) \,{\rm d}u \,{\rm d}v \right \}, \end {array}\]
(2.1)
where |$\mathcal {L}_{X(u,v)}(c)$| is the Laplace transform of |$X(u,v)$| (which may depend on |$(u,v)$|⁠), provided |$\int _{A \times \mathbb {R}^{+ }} \min (|x|,1) \mu (u,v) f(x;u,v) \,{\rm d}u \,{\rm d}v \,{\rm d}x \lt \infty$|⁠. Define  
\[Z(t) = \int _{A(t)} X(u,v)\,{\rm d}N(u,v).\]
(2.2)
From now on we shall assume that the distribution of |$X(u,v)$| does not depend on |$(u,v)$|⁠. If that distribution has Laplace transform |$\mathbf {E}[\exp (-cX)]=\mathcal {L}_X(c)$|⁠, then (2.1) implies that the Laplace transform of |$Z(t)$| is given by  
\[\mathcal {L}(c;t) = \mathbf {E}\exp (-cZ(t)) = \exp (-\nu (t)(1-\mathcal {L}_X(c))).\]
The marginal distribution of |$Z(t)$| has the properties  
\[\mathbf {E} Z(t) = \nu (t) \mathbf {E} X,\quad {{\rm var}}\, Z(t) = \nu (t) \mathbf {E} X^2.\]
The marginal distribution is stationary if |$\nu (t)$| is constant, |$\nu (t) \equiv \nu$|⁠.

2.1.2. Independent increments

Define, for |$u \leq v$|⁠,  
\[B(u,v)=\int _{u^\prime \leq u, v^\prime \geq v} X(u^\prime ,v^\prime ) \,{\rm d}N(u^\prime ,v^\prime ).\]
Since |$(\mathbf {N},X)$| is a Poisson process, |$B(u,v)$|⁠, |$B(u+ \Delta _u, v) - B(u,v)$|⁠, |$B(u, v-\Delta _v) - B(u,v)$| and  
\[\Delta B(u,v) = \int _{u \leq u^\prime \leq u + \Delta u} \int _{v - \Delta v \leq v^\prime \leq v} X(u^\prime ,v^\prime ) \,{\rm d}N(u^\prime ,v^\prime )\]
are all independent of each other, showing that |$B$| has independent increments.
By taking the limits |$\Delta _u \downarrow 0$|⁠, |$\Delta _v \downarrow 0$|⁠, |$B$| can also be used to write  
\[Z(t) = \int _{A(t)} {\rm d}B(u,v).\]

2.1.3. Finite-dimensional distribution

It follows from Campbell's Theorem that the joint Laplace transform |$\mathcal {L}_{Z(\cdot )}(c(\cdot ))$| of the process |$Z(\cdot )$| is given by  
\[\mathbf {E}\exp \left ( - \int c(t) Z(t) \,{\rm d} t \right ) =\exp \left [ - \int _A \left \{1 - \mathcal {L}_X\left ( \int _u^v c(t) \,{\rm d} t\right ) \right \}\mu (u,v) \,{\rm d}u \,{\rm d}v\right ] .\]
Similarly, it can be shown that the joint Laplace transform of |$(Z(t_1),\ldots ,Z(t_m))$| with |$0 \lt t_1 \lt \cdots \lt t_m \lt \infty$| is given by  
\[\mathbf {E} \,{\rm e}^{-c_1 Z(t_1) - \cdots - c_m Z(t_m)} = \exp \left \{- \sum _{1 \leq j \leq k \leq m} (1 - \mathcal {L}_X(c_j + \cdots +c_k)) \cdot \mu (A_{jk}) \right \},\]
with the disjoint sets |$A_{jk}$| defined as  
\[A_{jk} = \{(u,v) \in A;\ t_{j-1} \lt u \leq t_j, t_k \leq v \lt t_{k+1} \},\quad 1 \leq j \leq k \leq m,\]
(2.3)
and |$t_0 = -\infty$| and |$t_{m+1} = \infty$|⁠. From the joint Laplace transform of |$(Z(t_1),Z(t_2))$| it follows that |${{\rm cov}}(Z(t_1),Z(t_2)) = \mu (A_{12}) \mathbf {E} X^2$|⁠, and  
\[{{\rm cor}}(Z(t_1),Z(t_2)) = \mu (A_{12})/(\nu (t_1) \nu (t_2))^{1/2}.\]
(2.4)

Heuristically, coming back to the definition of |$Z(t)$| being composed of many independent frailty components |$X(u,v)$| that are born at |$u \leq t$| and die at |$v \geq t$|⁠: for |$t_1 \leq t_2$|⁠, those that are born at |$u \leq t_1$| and die at |$v \geq t_2$| are common to |$Z(t_1)$| and |$Z(t_2)$|⁠. This common part, |$A_{12}$|⁠, induces positive correlation; the correlation will be greater if the relative contribution of the mass |$\mu (A_{12})$| of the common part is bigger.

2.2. Infinitely divisible frailties

The compound birth–death processes have two random components: the Poisson process with intensity |$\mu$| on |$A$|⁠, governing the location of the points, and the marks |$X(u, v)$| at these locations. |$Z(t)$| is the sum of all the marks in the points of the Poisson process in |$A(t)$|⁠. If the Poisson process has no jumps in |$A(t_1, t_2) = \{u \leq t_1 \lt t_2 \leq v\}$|⁠, then |$Z(t) \equiv 0$| on |$[t_1, t_2]$|⁠. This happens with probability |${\rm e}^{-\mu (A(t_1,t_2))}$|⁠. To avoid this undesirable behavior, we increase the intensity |$\mu$|⁠, at the same time decreasing the size of the jumps. Simply taking |$\mu _\kappa = \kappa \mu$| and |$X_\kappa = X/\kappa$| will result in |$\mathbf {E} Z(t)$| remaining unchanged, but also in |${{\rm var}}(Z(t)) \to 0$|⁠. Our solution is to choose the |$X$|'s from an infinitely divisible distribution, which will keep |${{\rm var}}(Z(t))$| unchanged as well, provided the |$X$|'s are appropriately rescaled. To make this more precise, we consider |$X(u,v)$|'s from an infinitely divisible distribution with Laplace transform given by |$\mathcal {L}(c) = \exp (-\alpha \psi (c;\beta ))$| for a parameter |$\alpha >0$| and a function |$\psi : \mathbb {R} \to \mathbb {R}^+ $|⁠, parametrized by |$\beta$|⁠. Let the corresponding density be denoted by |$f(x; \alpha , \beta )$|⁠. Such a class of distributions was considered by Gjessing and others (2003), see also Aalen and others (2008, Chapter 11), and used to define dynamic frailties by Lévy processes. The function |$\psi (c;\beta )$| is called a Laplace exponent. Special cases to be considered later are the gamma distributions, the family of power variance function (PVF) distributions, and the positive stable distributions. Now take |$X_\kappa$| with Laplace transform |$\mathcal {L}_X(c) = \exp (-\alpha /\kappa \cdot \psi (c;\beta ))$| and |$\mu _\kappa (u,v) = \kappa \cdot g(u,v)$|⁠, for some function |$g: A \to \mathbb {R}^{+ }$| satisfying |$\int _{A(t)} g(u,v) \,{\rm d}u \,{\rm d}v = 1$| for every |$t$|⁠. Our class of dynamic frailty models is obtained by letting |$\kappa \rightarrow \infty$|⁠. This will result in the marginal distribution of |$Z(t)$| having the same Laplace transform as the |$X(u,v)$|'s. To see this, note that, by (2.1), and since |$\lim _{\kappa \to \infty } \kappa (1 - \lambda ^{\alpha /\kappa }) = - \alpha \ln (\lambda )$|⁠, the Laplace transform of |$B(u,v)$| is given by  
\[\begin {array}{rl} &\lim _{\kappa \to \infty } \exp \left \{- \kappa \int _{u^\prime \leq u, v^\prime \geq v} g(u^\prime ,v^\prime ) \,{\rm d}u^\prime \,{\rm d}v^\prime \cdot (1 - \exp (-\alpha /\kappa \cdot \psi (c;\beta ))) \right \}\\ &\quad = \exp \left ( -\alpha \int _{u^\prime \leq u, v^\prime \geq v} g(u^\prime ,v^\prime ) \,{\rm d}u^\prime \,{\rm d}v^\prime \cdot \psi (c;\beta ) \right ) , \end {array}\]
so that |$B(u,v)$| belongs to the same family with parameter |$\alpha$| replaced by |$\alpha \int _{u^\prime \leq u, v^\prime \geq v} g(u^\prime ,v^\prime ) \,{\rm d}u^\prime \,{\rm d}v^\prime$|⁠, and |$Z(t)$| has Laplace transform |$\mathcal {L}_{Z(t)}(c) = \exp (-\alpha \cdot \psi (c;\beta ))$|⁠, for every |$t$|⁠. Note that, both for finite |$\kappa$| and for |$\kappa \to \infty$|⁠, |$Z(t)$| will have jumps with positive probability. For each |$\kappa$|⁠, the number of jumps of size |$>s$| in the set |$B \subset A$| has a Poisson distribution with parameter  
\[\kappa \int _s^\infty f(x; \alpha /\kappa , \beta ) \,{\rm d}x \int _B g(u, v) \,{\rm d}u \,{\rm d}v.\]
The number of jumps of |$Z(t)$| in a given interval |$[t_1, t_2]$| can be studied by considering the above taking |$B$| to be |$A(t_1)$|⁠, |$A(t_2)$|⁠, and in particular their intersection and unique parts. The behavior of |$Z(t)$| as |$\kappa \to \infty$| then depends on the behavior of |$\kappa \int _s^\infty f(x; \alpha /\kappa , \beta )$| as |$\kappa \to \infty$|⁠. For the gamma distribution, taking this limit leads to |$\alpha \int _s^\infty ({\exp (-\beta x)}/{x}) \,{\rm d}x$|⁠. Supplementary material available at Biostatistics online contains R code visualizing the construction of the process.
The joint Laplace transform of |$Z(\cdot )$| is given by  
\[\mathcal {L}_{Z(\cdot )}(c(\cdot )) = \exp \left \{-\int _A \alpha \psi \left ( \int _u^v c(t) \,{\rm d} t \right ) g(u,v) \,{\rm d}u \,{\rm d}v \right \},\]
(2.5)
and the joint Laplace transform of |$(Z(t_1),\ldots ,Z(t_m))$| for |$0 \lt t_1 \lt \cdots t_m \lt \infty$| by  
\[\mathbf {E} \,{\rm e}^{-c_1 Z(t_1) - \cdots - c_m Z(t_m)} = \exp \left \{-\alpha \sum _{1 \leq j \leq k \leq m} \psi (c_j + \cdots +c_k; \beta ) \zeta _{jk} \right \},\]
(2.6)
with |$\zeta _{jk} = \int _{t_{j-1}}^{t_j} \int _{t_k}^{t_{k+1}} g(u,v) \,{\rm d}u \,{\rm d}v$|⁠.
When |$g$| is invariant, i.e., |$g(u,v) \equiv g(v-u)$|⁠, (2.4) specializes to  
\[\begin {array}{rl} {{\rm cor}}(Z(t_1),Z(t_2)) &= \int _{-\infty }^{t_1} \int _{t_2}^\infty g(v-u) \,{\rm d}u \,{\rm d}v = \bar {\bar {\mathcal {G}}}(t_2-t_1),\\ \bar {\bar {\mathcal {G}}}(t) &= \int _0^\infty \bar {G}(t+u) \,{\rm d}u,\ \bar {G}(u) = \int _0^\infty g(u+v) \,{\rm d}v. \end {array}\]
(2.7)
The requirement |$\int _{A(t)} g(u,v) \,{\rm d}u \,{\rm d}v \equiv 1$| implies |$\bar {\bar {\mathcal {G}}}(0)=1$|⁠. The choice |$g(s) = \lambda ^2 \exp (-\lambda s)$| leads to the auto-regressive correlation structure  
\[{{\rm cor}}(Z(t_1),Z(t_2)) = \exp (-\lambda |t_2-t_1|).\]
Choosing |$g(u,v) \equiv g(v-u)$| in such a way that |$g(s)=0$| for |$s >\Delta$| leads to |${{\rm cor}}(Z(t_1),Z(t_2))=0$| when |$|t_1 - t_2| >\Delta$|⁠. For instance, choosing |$g(s) = 2/\Delta ^2$| for |$0 \leq s \leq \Delta$|⁠, and |$g(s)=0$| for |$s >\Delta$|⁠, leads to |${{\rm cor}}(Z(t_1),Z(t_2)) = (1 - {|t_1 - t_2|}/{\Delta })^2$| for |$|t_1 - t_2| \leq \Delta$| and 0 otherwise.

3. Using the dynamic frailty process as random effect

3.1. Marginal hazard and survival function

Consider a general frailty model for a survival time |$T$|⁠, where the conditional hazard given |$Z(t)$| is specified as  
\[h(t \,|\, Z(t)) = h_{\rm c}(t) Z(t),\]
and where the frailty process is given by (2.2) and (2.5). We assume that the distribution of |$T$|⁠, given |$Z(\cdot )$|⁠, is continuous for almost all |$Z(\cdot )$| (i.e., except possibly outside a set of probability 0). This is equivalent to assuming that |$h_{\rm c}(\cdot ) Z(\cdot )$| is integrable on all finite intervals on |$\mathbb {R}^+ $|⁠.
The marginal survival and hazard function are given by  
\[S(t) = \mathbf {E} \exp \left ( -\int _0^t h_{\rm c}(s) Z(s) \,{\rm d}s \right ) ,\quad h(t) = -S^\prime (t)/S(t) = h_{\rm c}(t) \mathbf {E}[Z(t) \,|\, T \geq t].\]
We first observe that the conditional survival function can be written as  
\[S(t \,|\, Z(\cdot )) = \exp \left ( -\int _0^t h_{\rm c}(s) Z(s) \,{\rm d}s \right ) = \exp \left ( -\int _A \tilde {H}_{\rm c}(u,v;t) \,{\rm d}B(u,v)\right ) ,\]
with |$\tilde {H}_{\rm c}(u,v;t) = \int _{[u,v] \cap [0,t]} h_{\rm c}(s)\,{\rm d}s$|⁠. Table 1 shows how |$\tilde {H}_{\rm c}(u,v;t)$| and its derivative with respect to |$t$| behaves for different values of |$u \leq v$|⁠. As a result, the marginal survival function of |$T$| is given by  
\[\begin {array}{rl} S(t) &= \mathbf {E}\exp \left ( {-}\int _A \tilde {H}_{\rm c}(u,v;t) \,{\rm d}B(u,v) \right ) = \prod _{u \leq v} \mathcal {L}_{{\rm d}B(u,v)}(\tilde {H}_{\rm c}(u,v;t)) \\ &= \exp \left ( -\alpha \int _A g(u,v) \psi (\tilde {H}_{\rm c}(u,v;t);\beta ) \,{\rm d}u \,{\rm d}v \right ) . \end {array}\]
(3.1)
Under the assumption of almost sure continuity of |$S(\cdot \,|\, Z(\cdot ))$|⁠, the marginal survival function is differentiable in all finite open subsets of |$\mathbb {R}^+ $|⁠, but not necessarily at |$t=0$|⁠. |$S(t)$| is differentiable also at |$t=0$|⁠, provided the mean of |$Z(0)$| is finite, which is not the case, for instance, for the positive stable distribution. Note that |$S(t)$|is continuous also at |$t=0$|⁠, by Lebesgue's dominated convergence theorem.
Table 1.

|$\tilde {H}_{\rm c}(u,v;t)$| for different values of |$u \leq v,$| within brackets |$($|one line below|$)$| the derivative with respect to |$t$|

|$v \lt 0$||$0 \leq v \leq t$||$v >t$|
|$u \lt 0$| |$H_{\rm c}(v)$| |$H_{\rm c}(t)$| 
 (0) (0) (⁠|$h_{\rm c}(t)$|⁠
|$0 \leq u \leq t$| NA |$H_{\rm c}(v)-H_{\rm c}(u)$| |$H_{\rm c}(t) - H_{\rm c}(u)$| 
  (0) (⁠|$h_{\rm c}(t)$|⁠
|$u >t$| NA NA 
   (0) 
|$v \lt 0$||$0 \leq v \leq t$||$v >t$|
|$u \lt 0$| |$H_{\rm c}(v)$| |$H_{\rm c}(t)$| 
 (0) (0) (⁠|$h_{\rm c}(t)$|⁠
|$0 \leq u \leq t$| NA |$H_{\rm c}(v)-H_{\rm c}(u)$| |$H_{\rm c}(t) - H_{\rm c}(u)$| 
  (0) (⁠|$h_{\rm c}(t)$|⁠
|$u >t$| NA NA 
   (0) 
Table 1.

|$\tilde {H}_{\rm c}(u,v;t)$| for different values of |$u \leq v,$| within brackets |$($|one line below|$)$| the derivative with respect to |$t$|

|$v \lt 0$||$0 \leq v \leq t$||$v >t$|
|$u \lt 0$| |$H_{\rm c}(v)$| |$H_{\rm c}(t)$| 
 (0) (0) (⁠|$h_{\rm c}(t)$|⁠
|$0 \leq u \leq t$| NA |$H_{\rm c}(v)-H_{\rm c}(u)$| |$H_{\rm c}(t) - H_{\rm c}(u)$| 
  (0) (⁠|$h_{\rm c}(t)$|⁠
|$u >t$| NA NA 
   (0) 
|$v \lt 0$||$0 \leq v \leq t$||$v >t$|
|$u \lt 0$| |$H_{\rm c}(v)$| |$H_{\rm c}(t)$| 
 (0) (0) (⁠|$h_{\rm c}(t)$|⁠
|$0 \leq u \leq t$| NA |$H_{\rm c}(v)-H_{\rm c}(u)$| |$H_{\rm c}(t) - H_{\rm c}(u)$| 
  (0) (⁠|$h_{\rm c}(t)$|⁠
|$u >t$| NA NA 
   (0) 
Upon taking the derivative of |$S(t)$| with respect to |$t$|⁠, we obtain the marginal hazard  
\[h(t) = h_{\rm c}(t) \int _{A(t)} \alpha g(u,v) \psi ^\prime (\tilde {H}_{\rm c}(u,v;t);\beta ) \,{\rm d}u \,{\rm d}v.\]
(3.2)
If |$g(u,v) \equiv g(v-u)$|⁠, using Table 1, the multiplication factor |$\mathbf {E}[Z(t) \,|\, T \geq t]$| in (3.2) can be written as  
\[\alpha \psi ^\prime (H_{\rm c}(t);\beta ) \bar {\bar {\mathcal {G}}}(t) + \alpha \int _0^t \psi ^\prime (H_{\rm c}(t)-H_{\rm c}(t-u);\beta ) \bar {G}(u) \,{\rm d}u,\]
(3.3)
with |$\bar {\bar {\mathcal {G}}}$| and |$\bar {G}$| as defined in (2.7).

3.1.1. Special cases

  • The gamma distribution has |$\psi (c;\beta ) = \ln (\beta +c) - \ln (\beta )$|⁠, with derivative |$\psi ^\prime (\tilde {H}_{\rm c}(u,v;t);\beta ) = (\beta + \tilde {H}_{\rm c}(u,v;t))^{-1}$|⁠, which implies that the marginal hazard equals  
    \[h(t) = h_{\rm c}(t) \int _{A(t)} \frac {\alpha g(u,v)}{\beta + \tilde {H}_{\rm c}(u,v;t)} \,{\rm d}u \,{\rm d}v.\]
    (3.4)
  • The power variance function (PVF) distributions (Hougaard, 2000; Aalen and others, 2008) have |$\psi (c;\beta ) = |1 - ({\beta }/({\beta +c}))^m|$|⁠, for |$\beta >0$| and |$m>-1$|⁠, with |$m \neq 0$|⁠. The marginal hazard is given by  
    \[h(t) = h_{\rm c}(t) \int _{A(t)} \frac {\alpha m}{\beta } g(u,v) \left ( \frac {\beta }{\beta + \tilde {H}_{\rm c}(u,v;t)} \right ) ^{m+1} \,{\rm d}u \,{\rm d}v.\]
    The family of PVF distributions has for instance the gamma distribution as a special case as a limit for |$m \to 0$| and |$\alpha \to \infty$| in such a way that |$\alpha m \to \tilde {\alpha }$|⁠. In that case the above formula for the marginal hazard does not hold and (3.4) is to be used instead.
  • The positive stable distribution (Hougaard, 1986, 2000) has |$\psi (c;\beta ) = c^\beta$|⁠, with |$\beta \in (0,1)$|⁠. The derivative of |$\psi$| is given by |$\psi ^\prime (c;\beta ) = \beta c^{\beta -1}$|⁠, which is not defined at |$c=0$|⁠. Indeed, the positive stable distribution has infinite expectation. The derivative of |$\psi$| is defined, however, at |$c>0$|⁠, so that the conditional expectation |$\mathbf {E}(Z(t) \,|\, T \geq t)$| is defined if |$H_{\rm c}(t)>0$|⁠. The conditional Laplace transform of |${\rm d}B(u,v)$|⁠, given |$T \geq t$| is given by  
    \[\mathcal {L}_{{\rm d}B(u,v) \,|\, T \geq t}(c) = \exp \{-\alpha g(u,v) \cdot ((c+ \tilde {H}_{\rm c}(u,v;t))^\beta - \tilde {H}_{\rm c}(u,v;t)^\beta ) \,{\rm d}u \,{\rm d}v \}.\]
    The marginal hazard is given by  
    \[h(t) = h_{\rm c}(t) \int _{A(t)} \alpha \beta g(u,v) \tilde {H}_{\rm c}(u,v;t)^{\beta -1} \,{\rm d}u \,{\rm d}v.\]

Figure 1 shows the marginal hazard |$h(t)$| for the choice |$h_{\rm c}(t) \equiv 1$| and |$g(u,v) = \lambda ^2 \exp (-\lambda (v-u))$|⁠. The frailties in Figure 1(a)–(c) have been chosen such that |$\mathbf {E} Z(t) \equiv 1$|⁠, |${{\rm var}}(Z(t)) \equiv 0.5, 1, 2,$| and |$\rho = \exp (-\lambda ) = 1, 0.6, 0.2$|⁠. For the gamma distribution, shown in Figure 1(a), this is achieved by choosing |$\alpha = \beta = 2, 1, 0.5$|⁠. For the PVF with |$m=3$| (Figure 1(b)), this is achieved by choosing |$\alpha = \beta /3, \beta = 8, 4, 2$|⁠; for the PVF with |$m=-\frac {1}{2}$|⁠, the inverse Gaussian distribution (Figure 1(c)), by choosing |$\alpha = 2\beta , \beta = 1, \frac {1}{2}, \frac {1}{4}$|⁠. Figure 1(d) shows the marginal hazard for the positive stable distribution. A direct comparison of the positive stable distribution with the others is impossible, since the positive stable does not have finite variance. To still make some comparison with the other distributions (here the gamma distribution) possible, the same values of |$\alpha$| were chosen as for the gamma distribution, and for each value of |$\rho$| (⁠|$\lambda$|⁠), values of |$\beta$| for the positive stable distribution were chosen in such a way that |$\mathbf {E}[Z(t) \,|\, T \geq t]$| of (3.3) matches each of the choices of the gamma distribution at |$t=1$| (⁠|$H(t)=1$|⁠). For some combinations of |$\alpha$| and |$\lambda$| no appropriate |$\beta \in (0,1)$| could be found.

Fig. 1.

Marginal hazards for a selection of frailty distributions, for different values of frailty variance and auto-correlation.

Fig. 1.

Marginal hazards for a selection of frailty distributions, for different values of frailty variance and auto-correlation.

Figure 1 illustrates a number of things. First, in (a), (b), and (c), higher variances of the frailties induce stronger selection effects; the marginal hazard is increasingly dragged down from the conditional hazard (⁠|$\equiv 1$|⁠) as the frailty variance increases. This phenomenon is well known from shared frailty models. The value |$\rho =1$| (⁠|$\lambda =0$|⁠) corresponds to the case where the frailty is time-constant, i.e., |$Z(t) \equiv Z$|⁠. In this case, it is well known that the marginal hazard is given by |$h(t) = \alpha h_{\rm c}(t) \psi ^\prime (H_{\rm c}(t);\beta )$| (Aalen and others, 2008, see, e.g., ][Section 6.2.4). In all cases illustrated—in general if |$\psi ^\prime (t;\beta ) \to 0$| as |$t \to \infty$|—we have |$h_{\rm c}(t) \to 0$| as |$t \to \infty$| for |$\rho =1$|⁠. Interestingly, for |$\rho \lt 1$| this is no longer the case; in the cases illustrated in Figure 1, we have |$h(t) \to \int _0^\infty \lambda \,{\rm e}^{-\lambda u} ({\beta }/({\beta +u})) \,{\rm d}u$| and |$h(t) \to \int _0^\infty \lambda \,{\rm e}^{-\lambda u} ({\beta }/({\beta +u}))^{m+1} \,{\rm d}u$|⁠, as |$t \to \infty$|⁠, for the gamma and PVF(⁠|$m$|⁠) case, respectively. For the positive stable case, |$h(t) \to \alpha \lambda ^{1-\beta } \Gamma (1+ \beta )$|⁠; values of |$\beta$| closer to 1 correspond to smaller selection effects. In each of these cases, it can be shown that |$\lim _{t \to \infty } h(t)$|⁠, as a function of |$\lambda$|⁠, increases, and equals 0 only at |$\lambda =0$|⁠.

The selection by the frailties also has an effect on marginal hazard ratios. Consider two conditional hazards |$h_{\rm c}(t)$| and |$r h_{\rm c}(t)$|⁠, for which the hazard ratio equals |$r$|⁠. Figure 2 illustrates the marginal hazard ratio obtained by dividing the marginal hazard corresponding to |$r h_{\rm c}(t)$| (for |$r=2$|⁠) by the marginal hazard corresponding to |$h_{\rm c}(t)$|⁠, again for |$h_{\rm c}(t) \equiv 1$| and |$g(u,v) = \lambda ^2 \exp (-\lambda (v-u))$|⁠. Two distributions are illustrated, with the same choices as Figure 1, the gamma distribution and the inverse Gaussian distribution. In general the pattern will depend on the exact choice of |$h_{\rm c}$| and |$g(u,v)$|⁠. It is straightforward to see that for the positive stable distribution the marginal hazard ratio equals |$r^\beta$|⁠, irrespective of the choice of |$h_{\rm c}$| and |$g(u,v)$|⁠.

Fig. 2.

Marginal hazard ratios when the conditional hazard ratio equals 2, for the gamma distribution (a) and the PVF distribution with |$m=\frac {1}{2}$| (b); different values of frailty variance and auto-correlation are shown.

Fig. 2.

Marginal hazard ratios when the conditional hazard ratio equals 2, for the gamma distribution (a) and the PVF distribution with |$m=\frac {1}{2}$| (b); different values of frailty variance and auto-correlation are shown.

3.2. Marginal bivariate distribution

Supplementary material available at Biostatistics online (Section A) derives the marginal bivariate distribution induced by the dynamic frailty process and properties of the cross-ratio function.

3.3. Frailty distribution given event history

Supplementary material available at Biostatistics online (Section B) derives the conditional distribution of the |${\rm d}B(u,v)$|'s, and hence of |$Z(t+s)$|⁠, given |$T \geq t$|⁠, and, more generally, given observed survival data from a cluster sharing the same dynamic frailty are given. The latter results are useful for fitting the model using an EM-algorithm, which is the topic of the next section.

4. Fitting the model

4.1. Set-up

The situation considered is of |$G$| groups and |$n_g$| observations in group |$g$|⁠. The observations are denoted as |$(t_{gi},d_{gi},\mathbf {x}_{gi})$|⁠, |$i=1,\ldots ,n_g$|⁠, where |$t_{gi}$| and |$d_{gi}$| are the observed times and status variables (⁠|$1={\rm event}$|⁠, |$0={\rm censored}$|⁠) of subject |$i$| in cluster |$g$|⁠, and |$\mathbf {x}_{gi}$| is a |$p$|-dimensional covariate vector of this subject. The frailty processes in the different groups are |$(Z_1(\cdot ),\ldots ,Z_G(\cdot ))$|⁠. Conditionally given the frailty process |$Z_g(\cdot )$| the observations within cluster |$g$| are assumed independent. The conditional model for individual |$i$| in group |$g$| is  
\[h_{gi}(t|\mathbf {x}_g,Z_g(\cdot )) = Z_g(t) h_0(t) \exp (\mathbf {x}_{gi}^\top \boldsymbol {\gamma }),\]
where |$\boldsymbol {\gamma }$| is a |$p$|-dimensional vector of regression coefficients. The conditional likelihood of the data |$(t_{gi},d_{gi})$| leading to the partial likelihood is given by |$L(h_0,\boldsymbol {\gamma } \,|\, Z_1(\cdot ),\ldots ,Z_G(\cdot )) = \prod _g L_g(h_0,\boldsymbol {\gamma } \,|\, Z_g(\cdot ))$|⁠, with the |$g'$|th contribution to the likelihood given by  
\[\prod _i \left [ \left \{Z_g(t_{gi}) h_0(t_{gi})\exp (\mathbf {x}_{gi}^\top \boldsymbol {\gamma }) \right \}^{d_{gi}} \cdot \exp \left \{- \left ( \int _0^{t_{gi}} Z_g(s) h_0(s) \right ) \exp (\mathbf {x}_{gi}^\top \boldsymbol {\gamma }) \right \}\right ] ,\]
(4.1)
the product being over the subjects |$i$| in cluster |$g$|⁠. The nice feature of the Cox model is that the estimated baseline hazard is concentrated in the observed event times |$(t_1^* ,..,t_L^* )$| with |$L$| denoting the number of events in the whole data set. This implies that  
\[\sum _i \left ( \int _0^{t_{gi}} Z_g(s) h_0(s) \right ) \exp (\mathbf {x}_{gi}^\top \boldsymbol {\gamma }) = \sum _{j=1}^L c_{gj} Z_g(t_j^* ),\]
(4.2)
with  
\[c_{gj} = \sum _{i: t_{gi} \geq t_j^* } h_0(t_j^* ) \exp (\mathbf {x}_{gi}^\top \boldsymbol {\gamma }).\]
(4.3)
The marginal likelihood contribution of cluster |$g$| can be written as  
\[\prod _i \{h_0(t_{gi}) \exp (\mathbf {x}_{gi}^\top \boldsymbol {\gamma }) \}^{d_{gi}} \mathbf {E} \left \{\prod _{i: d_{gi}=1} Z_g(t_{gi}) \exp \left ( -\sum _{j=1}^L c_{gj} Z_g(t_j^* )\right ) \right \}.\]
(4.4)

Section 4.2 will develop an EM-algorithm with two versions of the E-step, to be used depending on the size of the data set, in particular on the number of events in the clusters. The first version is feasible only for clusters with a very limited number of events, while the second version, a Monte Carlo EM algorithm, is feasible for clusters with moderate numbers of events.

4.2. EM algorithm

The approach followed is profile maximum likelihood, where the parameters |$h_0(\cdot )$| and |$\boldsymbol {\gamma }$| of the conditional Cox model are estimated by (Monte Carlo) EM for fixed values of the parameters |$\boldsymbol {\theta } = (\alpha , \beta , \lambda )$| of the frailty process |$Z(\cdot )$|⁠. The profile log-likelihood |$\ell _{{\rm prof}}(\boldsymbol {\theta }) = \sup _{h_0,\boldsymbol {\gamma }} \ell (h_0,\boldsymbol {\gamma },\boldsymbol {\theta })$| is obtained and maximized with respect to |$\boldsymbol {\theta }$|⁠. The same approach was used by Nielsen and others (1992) in the context of the shared gamma frailty model. Two versions of the E-step will be developed, one for clusters with small number of events, the other for clusters for larger number of events.

The proposed EM algorithm for fixed values of |$\alpha$|⁠, |$\beta ,$| and |$\lambda$| takes the |$Z_g(t_j^* )$|'s (and the |${\rm d}B(u,v)$|'s in case of the Monte Carlo E-step) as the unobservable variables. We will start from the logarithm of the conditional likelihood in (4.1). The terms with |$Z_g(t_{gi})^{d_{gi}}$|⁠, and also the joint density of the frailties in the complete data log-likelihood, can be ignored because they do not depend on |$h_0$| or |$\boldsymbol {\gamma }$|⁠. This means that the conditional log-likelihood can be written as |$\ell = \sum _g \ell _g$|⁠, with  
\[\ell _g = \sum _i [d_{gi} \{\ln (h_0(t_{gi})) + \mathbf {x}_{gi}^\top \boldsymbol {\gamma } \}] - \sum _{j=1}^L c_{gj} Z_g(t_j^* ).\]
(4.5)
For the E-step, we thus need |$\mathbf {E}(Z_g(t_j^* ) \,|\, {\rm data})$|⁠, for each of the pooled event times |$t_1^* ,\ldots ,t_L^* $|⁠, where “data” stands for the observed |$\{(t_{gi},d_{gi},\mathbf {x}_{gi}) \}$| of cluster |$g$|⁠.

4.2.1. Exact E-step

Using similar arguments as those leading to (4.4), it can be seen that  
\[\mathbf {E}(Z_g(t_j^* ) \,|\, {\rm data}) = \frac {\mathbf {E} \{Z_g(t_j^* ) \prod _{i: d_{gi}=1} Z_g(t_{gi}) \exp (-\sum _{j=1}^L c_{gj} Z_g(t_j^* ))\}} {\mathbf {E} \{\prod _{i: d_{gi}=1} Z_g(t_{gi}) \exp (-\sum _{j=1}^L c_{gj} Z_g(t_j^* ))\}}.\]
(4.6)

Both numerator and denominator can be expressed in terms of derivatives of the Laplace transform |$\mathcal {L}(c_1,\ldots ,c_L)$| of the finite-dimensional distribution of the frailty process from (2.6). Details are given in supplementary material available at Biostatistics online (Section C).

4.2.2. Monte Carlo E-step

Calculation of the conditional expectation of |$Z_g(t_j^* )$| using the method described above becomes prohibitively slow for |$>$|5–10 events in a cluster. Supplementary material available at Biostatistics online (Section D) describes a stochastic E-step that can be used for moderately large number of events per cluster.

4.2.3. Estimation of the frailty parameters

After convergence of the EM algorithm for fixed values of |$\boldsymbol {\theta } = (\alpha , \beta , \lambda )$|⁠, maximum likelihood estimators |$\hat {h}_0(\boldsymbol {\theta })$| and |$\hat {\boldsymbol {\gamma }}(\boldsymbol {\theta })$| are obtained and the profile log-likelihood |$\ell _{{\rm prof}}(\boldsymbol {\theta }) = \ell (\hat {h}_0(\boldsymbol {\theta }),\hat {\boldsymbol {\gamma }}(\boldsymbol {\theta }),\boldsymbol {\theta })$| can be calculated. Since |$\ell _{{\rm prof}}(\boldsymbol {\theta })$| is a function of a limited number of parameters only it can simply be optimized numerically.

Calculation of the profile log-likelihood, or for that matter, of the marginal log-likelihood in general during the EM algorithm, is not directly possible if the Monte Carlo method is used in the E-step. In this case, the MCMC draws may be used to approximate it, see Supplementary material available at Biostatistics online (Section E).

4.3. Standard errors

Supplementary material available at Biostatistics online (Section F) contains details on obtaining the covariance matrix of the estimators.

5. Simulation

A simulation study was performed to evaluate whether it is practically feasible to fit the dynamic frailty models, and in particular to see to what extent it is possible to retrieve information about the value of |$\rho$| for the auto-regressive correlated gamma process. Details about the set-up and the results can be found in Supplementary material available at Biostatistics online (Section G).

6. Application

We consider a litter-matched tumorigenesis experiment, which is discussed in Mantel and others (1977). The data (rats) are part of the survival package in R. It consists of 50 all-female litters of three rats each. In each of the litters, one (random) is given anti-tumor treatment, while the other two serve as controls and are given placebo. The outcome of the experiment is time in weeks to tumor recurrence. A total of 40 tumor recurrence events are observed during follow-up. There were 23 litters without recurrences, 15 with one recurrence, 11 with two recurrences and 1 with three recurrences. A Cox model without frailty gives an estimated regression coefficient of 0.899, with a standard error (SE) of 0.317, for the treatment effect, with a log partial likelihood of |$-208.8452$|⁠. A shared gamma frailty model yields an estimate of 0.906, with an SE of 0.319 for the treatment effect. The estimated frailty variance is 0.474. The (marginal) log partial likelihood equals |$-208.0774$|⁠, which yields a likelihood ratio statistic of 1.5356, and a |$p$|-value of 0.78.

The purpose of the frailty term is to account for unobserved heterogeneity, caused by possible differences between litters that were not controlled by the experiment and not captured by the covariates in the model. Such unobserved differences could either be time-varying (e.g., temporal changes in experimental conditions) or their effect could be time-dependent. Alternatively, a time-varying frailty could capture violations of the proportional hazards assumption of the treatment that are different from those captured by a time-fixed frailty model (Section 3.1). To explore these issues, we fitted a series of auto-correlated frailty processes, all with |$g(u,v) = \lambda ^2 \exp (-\lambda (v-u))$|⁠, yielding |${{\rm cor}}(Z(s),Z(t)) = \exp (-\lambda |s-t|)$|⁠.

First, a Cox model with auto-correlated gamma frailty process was fitted, i.e., the gamma distribution was used (⁠|$\psi (c;\beta ) = \ln (\beta +c) - \ln (\beta )$|⁠). To ensure unit mean |$\mathbf {E} Z(t) \equiv 1$|⁠, |$\alpha$| and |$\beta$| were restricted to |$\alpha =\beta$|⁠, both equal to the inverse of the variance |${{\rm var}}(Z(t))$|⁠. Figure 3 shows a contour plot of the profile log partial likelihood for varying |$\beta$| and |$\lambda$|⁠. For this gamma correlated frailty model, the profile log partial likelihood was maximized at |$\beta$| is 0.834. The estimated frailty variance, 1.199 (=1/|$\hat {\beta }$|⁠), is now much bigger than in the shared frailty model. The estimated value of |$\lambda$| was 0.0388, and hence the estimated correlation was |$\hat {\rho } = 0.962$|⁠, which (since time is in days) amounts to a yearly correlation of 0.13. The model with correlated gamma frailty gives an estimate of 0.910 of the treatment effect, with an SE of 0.334. The marginal log partial likelihood is |$-207.4655$|⁠. The difference in |$-2* $|log-likelihood, compared with the shared frailty model, is 1.2238, which, with one degree of freedom, would amount to a |$p$|-value of 0.73. Note, however, that the shared frailty model corresponds to |$\rho =1$|⁠, which lies on the boundary of the parameter space. Figure 2 in Supplementary material available at Biostatistics online (Section H) shows the posterior mean gamma frailties for each of the 50 L, at the event time points.

Fig. 3.

Contour plot of the profile log partial likelihood for varying |$\beta$| and |$\lambda$|⁠, for the rat data.

Fig. 3.

Contour plot of the profile log partial likelihood for varying |$\beta$| and |$\lambda$|⁠, for the rat data.

In Table 2, we compare the results of using different underlying frailty distributions. The first is the gamma distribution, discussed above. Two examples from the power variance function (PVF) family, for which |$\psi (c;\beta ) = |1 - ({\beta }/({\beta +c}))^m|$|⁠, are included, with |$m=-0.5$| (the inverse Gaussian distribution) and |$m=1$|⁠. The restrictions on |$\alpha$| and |$\beta$| and the relation with |${{\rm var}}(Z(t))$| are now given by |$\beta = ({m+1})/{{{\rm var}}(Z(t))}$|⁠, and |$\alpha = {\beta }/{|m|}$|⁠. The last distribution is the positive stable distribution, with |$\psi (c;\beta ) = c^\beta$|⁠, for |$\beta \in (0,1)$|⁠. Since the positive stable distribution does not have finite expectation or variance, we chose the restriction |$\alpha \beta =1$|⁠. This parametrization was chosen such that the conditional mean |$\mathbf {E}(Z(t) | T \geq t)$| would be |$\alpha \beta =1$| for a shared frailty at |$t$| for which |$H_{\rm c}(t) = \int _0^t h_{\rm c}(s) \,{\rm d}s = 1$|⁠, and its conditional variance |${{\rm var}}(Z(t) | T \geq t)$| would be |$1-\beta$|⁠. For the positive stable distribution, the variance is not defined, but the estimate of |$\beta$| was 0.896, with a 95% confidence interval of 0.305–0.984. In principle, several PVF models with different values of |$m$| could be fitted, and the one with the highest log partial likelihood could be chosen, but there is very little information on |$m$| in the data. Results for estimates of |$\gamma$| and |$\lambda$| are remarkably consistent across different underlying frailty distributions, with the exception of the positive stable distribution.

Table 2.

Parameter estimates and log partial likelihoods of the auto-correlated frailty model with different underlying frailty distributions, for the rat data

Distribution|$\hat {\gamma }$| (SE)Variance (95% CI)|$\hat {\lambda }$| (95% CI)Log partial likelihood
Gamma 0.910 (0.334) 1.199 (0.205–7.029) 0.039 (0.004–0.356) |$-$|207.4655 
|${\rm PVF}(-0.5)$| 0.921 (0.335) 1.464 (0.166–12.92) 0.039 (0.004–0.373) |$-$|207.5718 
PVF(1) 0.905 (0.334) 1.094 (0.223–5.365) 0.039 (0.004–0.342) |$-$|207.3987 
Positive stable 0.965 (0.345) – 0.023 (0.001–0.589) |$-$|208.2296 
Distribution|$\hat {\gamma }$| (SE)Variance (95% CI)|$\hat {\lambda }$| (95% CI)Log partial likelihood
Gamma 0.910 (0.334) 1.199 (0.205–7.029) 0.039 (0.004–0.356) |$-$|207.4655 
|${\rm PVF}(-0.5)$| 0.921 (0.335) 1.464 (0.166–12.92) 0.039 (0.004–0.373) |$-$|207.5718 
PVF(1) 0.905 (0.334) 1.094 (0.223–5.365) 0.039 (0.004–0.342) |$-$|207.3987 
Positive stable 0.965 (0.345) – 0.023 (0.001–0.589) |$-$|208.2296 
Table 2.

Parameter estimates and log partial likelihoods of the auto-correlated frailty model with different underlying frailty distributions, for the rat data

Distribution|$\hat {\gamma }$| (SE)Variance (95% CI)|$\hat {\lambda }$| (95% CI)Log partial likelihood
Gamma 0.910 (0.334) 1.199 (0.205–7.029) 0.039 (0.004–0.356) |$-$|207.4655 
|${\rm PVF}(-0.5)$| 0.921 (0.335) 1.464 (0.166–12.92) 0.039 (0.004–0.373) |$-$|207.5718 
PVF(1) 0.905 (0.334) 1.094 (0.223–5.365) 0.039 (0.004–0.342) |$-$|207.3987 
Positive stable 0.965 (0.345) – 0.023 (0.001–0.589) |$-$|208.2296 
Distribution|$\hat {\gamma }$| (SE)Variance (95% CI)|$\hat {\lambda }$| (95% CI)Log partial likelihood
Gamma 0.910 (0.334) 1.199 (0.205–7.029) 0.039 (0.004–0.356) |$-$|207.4655 
|${\rm PVF}(-0.5)$| 0.921 (0.335) 1.464 (0.166–12.92) 0.039 (0.004–0.373) |$-$|207.5718 
PVF(1) 0.905 (0.334) 1.094 (0.223–5.365) 0.039 (0.004–0.342) |$-$|207.3987 
Positive stable 0.965 (0.345) – 0.023 (0.001–0.589) |$-$|208.2296 

7. Discussion

In this paper, we have extended the auto-correlated gamma frailty models of Henderson and Shimakura (2003) and Fiocco and others (2009), developed for longitudinal count data and discrete survival data, to continuous survival data. The construction in continuous time of these frailty processes developed in this paper is based on compound birth–death processes and bears resemblance to the construction of gamma processes (Cinlar, 1980) and to the Lévy-type processes of Gjessing and others (2003).

The model in Fiocco and others (2009) was developed for longitudinal count data |$\mathbf {Y} = (Y_1,\ldots ,Y_T)$|⁠, and specified that, conditionally on a vector of gamma distributed gamma random variables |$\mathbf {Z} = (Z_1,\ldots ,Z_T)$|⁠, |$Y_t \,|\, Z_t \sim {\rm Po}(Z_t \mu _t)$|⁠, for |$t=1,\ldots ,T$|⁠, where |$\boldsymbol {\mu } = (\mu _1,\ldots ,\mu _T)$| is a vector of means, possibly depending on covariates. The model can be applied to survival data with piecewise constant (baseline) hazards by exploiting the fact that the log-likelihoods of the longitudinal count model and the piecewise constant hazards model coincide, up to a constant (see Fiocco and others (2009) for an application of the model to survival data). The frailty terms satisfy |$\mathbf {E} Z_t = 1$|⁠, |${{\rm var}}(Z_t) = \sigma ^2$|⁠, and |${{\rm cor}}(Z_s,Z_t) = \rho ^{|s-t|}$|⁠, where |$s,t = 1,\ldots ,T$|⁠. The construction of frailty processes in continuous time makes it possible to apply the auto-correlated gamma frailty model directly to continuous survival data with unspecified baseline hazard. Also, in this paper, a method for obtaining maximum likelihood estimates using EM is developed, while Fiocco and others (2009) used composite likelihood methods, as did Henderson and Shimakura (2003).

The construction presented in this paper extends the auto-correlated gamma frailty model of Fiocco and others (2009) in two more ways. First, a wide class of positive infinitely divisible frailty distributions is allowed, containing the gamma distribution, the power variance family (which in turn contains the inverse Gaussian distribution), and the positive stable distribution as special cases. Second, more general correlation structures can be defined by choosing different functions |$g(u,v)$| in the construction of Section 2.

The frailty processes considered in this paper are stationary in the sense that the distribution of |$Z(t)$| does not vary over |$t$|⁠. The assumption of constant variance may not always be realistic in practice—although one should not confuse the variance of |$Z(t)$| with the variance of |$Z(t)$|among the survivors at |$t$|⁠, the relative frailty variance in the terminology of Farrington and others (2012). Models where |${{\rm var}}(Z(t))$| is allowed to vary with |$t$| may be obtained by removing the restriction |$\nu (t) \equiv \nu$| in the construction of Section 2. More research is needed to determine how useful and manageable these more flexible models are.

Formula (3.2) of the marginal hazard, and its special case (3.4) for the gamma distribution, is reminiscent of the Burr model, describing the marginal hazard |$h(t) = {h_0(t) \exp (\mathbf {x}^\top \boldsymbol {\gamma })}/({1 + \theta ^{-1} H_0(t) \exp (\mathbf {x}^\top \boldsymbol {\gamma })})$| of a shared gamma frailty model with frailty variance |$\theta ^{-1}$|⁠. As proposed in Perperoglou and others (2006) (see also van Houwelingen and Putter, 2012, Section 6.2), as an alternative to frailty models, one could use the so-called relaxed Burr model |$h(t) = {h_0(t) \exp (\mathbf {x}^\top \boldsymbol {\gamma })}/({1 +F(t;\theta ) \exp (\mathbf {x}^\top \boldsymbol {\gamma })})$|⁠, with |$F(t;\theta )$| any nonnegative function, as definition of a model. A richer class of relaxed Burr models could be imagined using (3.4) as starting point, and replacing |$\tilde {H}_{\rm c}(u,v;t)$| by a more general nonnegative function |$F(u,v;t)$|⁠.

We presented an EM-algorithm to estimate the parameters in the auto-correlated frailty process. Restricting attention to the gamma distribution, the formulas for the conditional expectation of the frailty at time |$t_j^* $|⁠, |$\mathbf {E}(Z_g(t_j^* ) \,|\, {\rm data})$|⁠, given the data of cluster |$g$|⁠, resemble those in Petersen and others (1996, e.g., (23)–(25)). Petersen and others (1996) studied estimation in a class of correlated gamma frailty models proposed by Yashin and others (1995), using an EM-algorithm, and showed that the posterior distribution of the frailties given the data is a mixture of independent gamma random variables. The methods developed in this paper allow generalization of the correlated gamma frailties to correlated frailties of random variables within the wider class of infinitely divisible distributions considered in this paper.

Supplementary material

Supplementary Material is available at http://biostatistics.oxfordjournals.org.

Acknowledgement

Conflict of Interest: None declared.

References

Aalen
O. O.
Borgan
Ø.
Gjessing
H. K.
(
2008
).
Survival and Event History Analysis
.
Statistics for Biology and Health
.
New York
:
Springer
.

Aalen
O. O.
Gjessing
H. K.
(
2004
).
Survival models based on the Ornstein-Uhlenbeck process
.
Lifetime Data Analysis
10
,
407
423
.

Cinlar
E.
(
1980
).
On a generalization of gamma processes
.
Journal of Applied Probability
17
,
467
480
.

Clayton
D. G.
(
1978
).
A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence
.
Biometrika
65
,
141
151
.

Farrington
C. P.
Unkel
S.
Anaya-Izquierdo
K.
(
2012
).
The relative frailty variance and shared frailty models
.
Journal of the Royal Statistical Society. Series B (Methodological)
74
,
673
696
.

Fiocco
M.
Putter
H.
van Houwelingen
J. C.
(
2009a
).
Meta-analysis of pairs of survival curves under heterogeneity: A Poisson correlated gamma-frailty approach
.
Statistics in Medicine
28
,
3782
3797
.

Fiocco
M.
Putter
H.
van Houwelingen
J. C.
(
2009b
).
A new serially correlated gamma-frailty process for longitudinal count data
.
Biostatistics
10
,
245
257
.

Gjessing
H. K.
Aalen
O. O.
Hjort
N. L.
(
2003
).
Frailty models based on Lévy processes
.
Advances in Applied Probability
35
,
532
550
.

Henderson
R.
Shimakura
S.
(
2003
).
A serially correlated gamma frailty model for longitudinal count data
.
Biometrika
90
,
355
366
.

Hougaard
P.
(
1986
).
A class of multivariate failure time distributions
.
Biometrika
73
,
671
678
.

Hougaard
P.
(
2000
).
Analysis of Multivariate Survival Data
.
Statistics for Biology and Health
.
New York
:
Springer
.

Kingman
J. F. C.
(
1993
).
Poisson Processes
.
Oxford Studies in Probability
.
United Kingdom
:
Oxford Science Publications: Oxford
.

Mantel
N.
Bohidar
N. R.
Ciminera
J. L.
(
1977
).
Mantel-Haenszel analyses of litter-matched time-to-response data, with modifications for recovery of interlitter information
.
Cancer Research
37
,
3863
3868
.

McGilchrist
C. A.
Yau
K. K. W.
(
1996
).
Survival analysis with time dependent frailty using a longitudinal model
.
Australian Journal of Statistics
38
,
53
60
.

Nielsen
G. G.
Gill
R. D.
Andersen
P. K.
Sørensen
T. I. A.
(
1992
).
A counting process approach to maximum likelihood estimation in frailty models
.
Scandinavian Journal of Statistics
19
,
25
43
.

Oakes
D.
(
1989
).
Bivariate survival models induced by frailties
.
Journal of the American Statistical Association
84
,
487
493
.

Paik
M. C.
Tsai
W.-Y.
Ottman
R.
(
1994
).
Multivariate survival analysis using piecewise gamma frailty
.
Biometrics
50
,
975
988
.

Perperoglou
A.
van Houwelingen
H. C.
Henderson
R.
(
2006
).
A relaxation of the gamma frailty (Burr) model
.
Statistics in Medicine
25
,
4253
4266
.

Petersen
J. H.
Andersen
P. K.
Gill
R. D.
(
1996
).
Variance components models for survival data
.
Statistica Neerlandica
50
,
193
211
.

Unkel
E.
Farrington
C. P.
Whitaker
H. J.
Pebody
R.
(
2014
).
Time varying frailty models and the estimation of heterogeneities in transmission of infectious diseases
.
Journal of the Royal Statistical Society. Series C (Applied Statistics)
63
,
141
158
.

van Houwelingen
J. C.
Putter
H.
(
2012
)
Dynamic Prediction in Clinical Survival Analysis
.
Boca Raton
:
Chapman & Hall
.

Vaupel
J. W.
Manton
K. G.
Stallard
E.
(
1979
).
The impact of heterogeneity in individual frailty on the dynamics of mortality
.
Demography
16
,
439
454
.

Wienke
A.
(
2011
)
Frailty Models in Survival Analysis
.
Boca Raton
:
Chapman & Hall
.

Wintrebert
C. M. A.
Putter
H.
Zwinderman
A. H.
van Houwelingen
J. C.
(
2004
).
Centre-effect on survival after bone marrow transplantation: Application of time-dependent frailty models
.
Biometrical Journal
5
,
512
525
.

Woodbury
M. A.
Manton
K. G.
(
1977
).
A random walk model of human mortality and aging
.
Theoretical Population Biology
11
,
37
48
.

Yashin
A. I.
Manton
K. G.
(
1997
).
Effects of unobserved and partially observed covariate processes on system failure: a review of models and estimation strategies
.
Statistical Science
12
,
20
34
.

Yashin
A. I.
Vaupel
J. W.
Iachine
I. A.
(
1995
).
Correlated individual frailty: an advantageous approach to survival analysis of bivariate data
.
Mathematical Population Studies
5
,
145
159
.

Yau
K. K. W.
McGilchrist
C. A.
(
1998
).
ML and REML estimation in survival analysis with time dependent correlated frailty
.
Statistics in Medicine
17
,
1201
1213
.

Supplementary data