Using mixed-effects modeling to estimate decay kinetics of response to SARS-CoV-2 infection

Abstract The duration of natural immunity in response to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a matter of some debate in the literature at present. For example, in a recent publication characterizing SARS-CoV-2 immunity over time, the authors fit pooled longitudinal data, using fitted slopes to infer the duration of SARS-CoV-2 immunity. In fact, such approaches can lead to misleading conclusions as a result of statistical model-fitting artifacts. To exemplify this phenomenon, we reanalyzed one of the markers (pseudovirus neutralizing titer) in the publication, using mixed-effects modeling, a methodology better suited to longitudinal datasets like these. Our findings showed that the half-life was both longer and more variable than reported by the authors. The example selected by us here illustrates the utility of mixed-effects modeling in provide more accurate estimates of the duration and heterogeneity of half-lives of molecular and cellular biomarkers of SARS-CoV-2 immunity.


INTRODUCTION
The calculation of decay rates is a common problem in the natural sciences. Combining direct experimental observation with mathematical modeling of decay kinetics allows for the calculation of half-lives that are many orders of magnitude larger than the observed time period. For example, the element Bismuth, usually thought of as stable, actually has a half-life of (1.9 ± 0.2) × 10 19 years. This finding was made by observing the energetics and rate of alpha-particle decay [1], and mathematically modeling this data, which was gathered over an observation period that was (not surprisingly) far shorter than the published half-life.
A matter of slightly more pressing import at present is the calculation of half-lives of markers of immunity in response to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. A number of reports in the popular press have put forward the idea that it is "too early to tell" how long the immune response to SARS-CoV-2 infection will last [2][3][4], as the observation period for most studies so far has been on the order of several months. However, the decay kinetics of humoral and cellular immunity are Figure 1. Distribution of half-life of pseudovirus neutralizing titer obtained from applying MEM for a single exponential model to the data from [1]. The histogram was generated by combining 1000 simulated MEM parameter realizations per patient in the original dataset. Mean and 95% CI from pooled (blue), "longitudinal" (red) analysis reported in [1] are shown as dot-and-whiskers plots. Green dot and whiskers indicate "population" median and 95% limits of estimated elimination half-lives from MEM (corresponding to the 2.5, 50 and 97.5% quantiles of the histogram).
well-understood processes and tractable to mathematical modeling [5][6][7][8][9]. Thus, careful model-based estimation of the duration of antibody, T-cell and B-cell responses can provide crucial insights into understanding the durability of immunological protection, which in turn will have a significant impact on the course of the pandemic in the coming years.
In this context, we can consider as an example the characterization of the immune response to SARS-CoV-2 infection by Dan et al. [10], published recently in Science. This study provides a comprehensive first look at the kinetics of memory T-and B-cell responses to SARS-CoV-2 and provides a contrast with the kinetics of the antibody response. The authors observe inter-individual heterogeneity in each immunological compartment examined, and comment that " . . . this heterogeneity means that long-term longitudinal studies will be required to precisely define antibody kinetics to SARS-CoV-2." To estimate half-lives, Dan et al. [10] used a parsimonious approach, fitting either a linear or second-order polynomial to the log-transformed data pooled across patients, an approach often referred to as "naïve pooled fitting." These pooled cross-sectional model fits provide a poor description of the data. The reported correlations correspond to average R 2 values of 0.12, 0.18 and 0.06 for antibodies, memory B-cells and memory T-cells, respectively (0.12 for the entire dataset), indicating that on average 12% of the total variability in each dataset is described by the model fit.
Similarly, the p-values (while impressive) are misleading as they are only a test of the hypothesis that the independent variable changes in response to the dependent variable-all they tell us is that immunity changes over the observed time period.
Although parsimony is always desirable in modeling, the approach used by the authors here leads to skewed estimates of both the population mean and the heterogeneity in the duration and half-life of immune protection. Notably, the naïve-pooled approach cannot distinguish population heterogeneity from measurement error. Thus, the uncertainty in half-life estimation by this method cannot be used to draw inferences about population-level variability in protection against SARS-CoV-2 infection. In some cases, the authors did perform a longitudinal analysis, which they described as follows: "simple linear regression was performed, with t 1/2 calculated from log2-transformed data for each pair." However, linear "regression" to two data points is simply a line through those points (with zero residuals between data and model), and as such cannot distinguish between population heterogeneity and measurement error.
Mixed-effects modeling (MEM) is a more suitable methodology for dealing with longitudinal datasets such as these, where multiple correlated measurements are taken from each subject. Such an approach allows for more accurate and precise estimates of population heterogeneity and allows for it to be distinguished from the uncertainty in the population mean [5]. This methodology is standard in drug development and clinical pharmacology for estimating the variability of pharmacokinetic and pharmacodynamic rate constants in longitudinal clinical datasets [5]. Notably, MEM has been used to provide precise estimates of the long-term durability of antibody response after vaccination [6,7] or immunotherapy [8], as well as to describe the interrelation between different immunological markers [9].

METHODS
To isolate the effect of using MEM versus naïve-pooled parameter estimation (rather than embarking on a modelbuilding activity), we used the default Monolix (2020R1, Lixoft) settings and the same single-exponential model used by Dan et al. [1] to fit pseudovirus (PSV) observations (y obs (t)): The parameter y 0 is the estimated PSV titer at symptom onset, e is Euler's number, k el is the PSV titer exponential elimination rate (1/day), t is time in days post-symptom onset, b is the estimated magnitude of proportional error between the model prediction and noisy observations and ǫ is a standard normally distributed random variable.
We performed nonlinear MEM using Monolix Suite 2020R1 (Lixoft). The MEM-formatted dataset were derived from the dataset provided in Supplementary Material S1 from [10]; the Monolix project and model files, input dataset and R code for generating the figures are included in the Supplementary Material S1.
To estimate the loss of sensitivity (LOS) across the population over time, we simply calculated the time for each patient to decay to the limit of sensitivity LOS reported in Dan  Inc.) to generate the empirical distribution of times as well as 90% bootstrapped confidence intervals and overlaid it on the model predictions (Fig. 2).

RESULTS
In the illustrative example here, we have demonstrated the effect of using MEM rather than naïve-pooled parameter estimation to fit a longitudinal dataset from the paper (pseudovirus neutralizing titer of antibody, abbreviated as PSV, after [10]). MEM estimates a median halflife of 100 days (95% population interval: 33-320d), compared with reported means of 27 days using naïvepooled estimation (95% confidence interval (CI): 11-157d) and 90 days using longitudinal analysis (95% CI: 70-125d), as shown in Figures 1E and F in [10]. Thus, the kinetics of PSV response is different than reported, as MEM reveals both a longer median half-life and a greater degree of heterogeneity. The latter difference is to be expected, as the 95% CI in the authors' estimates reflects the uncertainty in the estimation of the mean halflife, whereas the 95% population interval for MEM reflects the heterogeneity present within the patient population (Fig. 1). We used the resulting fit from the MEM to estimate time taken for PSV to return to baseline (defined here as the same limit of sensitivity, or LOS = 20, used in [10]), finding that the median duration of return to LOS for pseudovirus neutralizing titer (PSV) was 332 days, with 95% of the population returning to LOS between 39 and 1237 days, and 56% of the population predicted to return to LOS at 1 year (Fig. 2). Further details (including data and code) may be found in the Supplementary Material S1 or on Github (https://github.com/deanbot1/immunomem).

DISCUSSION
The durability of the immune response to SARS-CoV-2 infection is a key determinant of the trajectory of the current pandemic. In the past few months, numerous reports have provided detailed longitudinal datasets describing this response (summarized in [11]), but aside from one notable exception [11], where linear mixed-effects models were used, most studies characterized their data using inappropriate analysis methodologies such as linear regression [10,[12][13][14][15] and spline fitting [16]. The use of inappropriate analysis methodologies can preclude a precise estimation of duration of immunity, while at the same time obscuring the inter-individual variability by confounding it with experimental error.
As an example, in this report, we have selected one recent paper by Dan et al. [10], which has been the subject of substantial commentary in recent weeks. In the selected paper, the authors conclude that "immune memory in at least three immunological compartments was measurable in ∼95% of subjects 5 to 8 months post-symptom onset, indicating that durable immunity against secondary COVID-19 disease is a possibility in most individuals." The popular press has picked up on this particular scientific report, and it received extensive coverage, with numerous articles inferring from the author's results that "the immune response to SARS-CoV-2 could last for years" [17][18][19] (we note that the conclusion in the popular press is stronger than the language used by the authors to describe their results, but the gist is the same in both cases).
For one marker (PSV) shown here as an example, our MEM analysis suggests a somewhat different interpretation. Although some individuals can expect durable coverage against SARS-CoV-2 reinfection by this marker, PSV for approximately half of those infected by the virus will return to LOS by 12 months. MEM thus allows for a more precise estimate of the durability of the immune response and also highlights the inter-individual variability, partitioning it away from experimental noise. We note that MEM relies on the assumption that the individual parameters follow a prespecified statistical distribution (in this case, lognormal for both parameters).
Using a modeling approach that is better suited to the datasets at hand has practical public health implications. A growing number of cases of reinfection have been directly demonstrated using molecular phylogenetic analyses [20,21], and this functional data is consistent with population heterogeneity in the durability of the immune response. The existence of a subset of individuals who potentially lose immunological protection (and are vulnerable to reinfection) within 12 months is relevant both to the prospects for herd immunity and the public health strategy for this disease.

SUPPLEMENTARY DATA
Supplementary data are available at ABT Online.

AUTHORS' CONTRIBUTIONS
D.B. performed mixed-effects modeling, co-wrote and edited the note. A.C. conceived the analysis, wrote the first draft and edited the note. G.H. and L.F.W. provided input into the methodology and edited the note. M.S. and L.Y. provided input into the methodology and conducted parallel but independent modeling of the data as a cross-check. All authors have read and approved the final manuscript.

FUNDING
None declared.

CONFLICT OF INTEREST
D.B. and G.H. are employees of Takeda Pharmaceuticals. A.C., M.S. and L.Y. are employees and shareholders of Fractal Therapeutics. Neither Fractal Therapeutics nor Takeda Pharmaceuticals has a financial interest in this study or the results described here.

DATA AND MATERIALS AVAILABILITY
Models and code are available in the supplementary materials.