The role of C-reactive protein as a prognostic marker in COVID-19

Abstract Background C-reactive protein (CRP) is a non-specific acute phase reactant elevated in infection or inflammation. Higher levels indicate more severe infection and have been used as an indicator of COVID-19 disease severity. However, the evidence for CRP as a prognostic marker is yet to be determined. The aim of this study is to examine the CRP response in patients hospitalized with COVID-19 and to determine the utility of CRP on admission for predicting inpatient mortality. Methods Data were collected between 27 February and 10 June 2020, incorporating two cohorts: the COPE (COVID-19 in Older People) study of 1564 adult patients with a diagnosis of COVID-19 admitted to 11 hospital sites (test cohort) and a later validation cohort of 271 patients. Admission CRP was investigated, and finite mixture models were fit to assess the likely underlying distribution. Further, different prognostic thresholds of CRP were analysed in a time-to-mortality Cox regression to determine a cut-off. Bootstrapping was used to compare model performance [Harrell’s C statistic and Akaike information criterion (AIC)]. Results The test and validation cohort distribution of CRP was not affected by age, and mixture models indicated a bimodal distribution. A threshold cut-off of CRP ≥40 mg/L performed well to predict mortality (and performed similarly to treating CRP as a linear variable). Conclusions The distributional characteristics of CRP indicated an optimal cut-off of ≥40 mg/L was associated with mortality. This threshold may assist clinicians in using CRP as an early trigger for enhanced observation, treatment decisions and advanced care planning.


Introduction
Elevated levels of serum C-reactive protein (CRP) have been observed in patients with COVID-19 and used to assist with triage, diagnostics and prognostication. 1,2 CRP is a nonspecific acute phase protein that is produced by hepatocytes and elevated in acute infection or inflammation. 3 Secretion begins 4-10 h after an inflammatory insult and peaks at 48 h, with a short half-life of 19 h. Crucially, it may be elevated before a patient's vital signs are affected or leukocytes are raised. 3 The profile of this biomarker has made CRP useful and routinely available in clinical medicine for diagnostics.
CRP can be used to assist with differentiation between viral and bacterial infections; for example, influenza produces a mean CRP level of 25.65 mg/L [95% confidence interval (CI) 18.88 to 32.41 mg/L] versus bacterial pneumonia, which produces a mean CRP level of 135.96 mg/L (95% CI 99.38 to 172.54 mg/L). 4 In COVID-19, a CRP level of ≥4 mg/L has been shown to be useful for triaging suspected cases when comparing polymerase chain reaction (PCR)-positive patients versus negative controls who have presented to a fever clinic with respiratory symptoms or a high temperature [odds ratio (OR) 4.75; 95% CI 3.28 to 6.88]. 5 However, debate remains over the utility of CRP as a prognostic marker for patients admitted to hospital with COVID-19. In a recent systematic review, 10 of the 22 included COVID-19 prognostic models treated CRP either as a factor or covariate. 6 Most of these studies used CRP with a binary threshold; proposed values to predict inpatient mortality varied from ≥10 mg/L to ≥76 mg/L. In addition to a binary threshold, CRP has been examined in a trichotomized model with the two thresholds at ≥40 mg/L and ≥100 mg/L. 7 A lower cut-off of ≥20.44 mg/L was used as a threshold for related lung injury, 8 and >32.5 mg/L was found to offer 80% predictive power for a person needing mechanical ventilation. 9 Studies that adjusted for admission CRP as a covariate to account for baseline disease severity have assumed a linear or natural logarithm transformation [Ln(CRP)] relationship with outcome. 10,11 Although using CRP in a continuous manner may offer an improved understanding of the contribution of CRP within each analysis, it does not allow CRP to be used by clinical teams to guide management of patients with COVID-19. Whilst CRP has been argued to be an important marker of disease progression in COVID-19, 6 its distribution has never been explored to understand whether distinct patterns exist in a heterogeneous population.
The use of CRP as a biomarker in COVID-19 may present a quick and accessible tool in clinical management, trigger longer periods of enhanced observation, provide information around likely disease progression and assist with early therapeutic, ventilation and palliative care discussions.
The aim of this study is to examine the distribution of CRP at hospital admission. The objectives are to: (i) assess whether CRP, as a prognostic marker, follows a bimodal or trimodal distribution; (ii) propose a categorization of CRP as a prognostic marker and compare it with either a linear or a log-linear measure of CRP.

Methods
Policlinico (369/2020/OSS/AOUMO). Written consent was not required from participants as per ethical review.

Study design
This observational study used two cohorts at different time points to examine the contribution of CRP to clinical outcomes. This study has been reported in accordance with the STROBE statement. 12

Settings
Thirteen hospital sites participated, 12 from the UK and one from Italy. All were acute hospitals directly admitting patients with suspected or confirmed COVID-19.

Participants
Original cohort (cohort 1) Participants in Cohort 1 were included as part of the COPE study (COVID in Older People study) as reported in the paper by Hewitt et al. 13,14 Briefly, this was a European multicentre observational study recruiting 1564 hospitalized adults between 27 February and 28 April 2020 with either SARS-CoV-2 viral polymerase chain reaction (PCR) confirmed disease (95.9%) or clinically diagnosed (4.1%) COVID-19. Any patient aged 18 years or older admitted to the participating hospitals with a diagnosis of COVID-19 was included. The study found frailty was associated with longer hospital stay, and a better predictor of mortality as an inpatient, and at Day 7, than age or comorbidity alone.
Validation cohort (cohort 2) Cohort 2 consisted of an additional 271 patients recruited between 29 April and 10 June 2020 from a combination of six of Cohort 1's hospitals plus two additional recruiting hospitals. All patients were SARS-CoV-2 viral PCR-positive.

Variables
A prognostic threshold for CRP was needed within the COPE protocol (March 2020). The limited literature available early in the pandemic included a case series of 73 patients with COVID-19 presenting with a mean CRP level of 51.4 mg/L [standard deviation (SD) 41.8]. 1 Based on this paper, and on the clinical experience of the authors who delivered acute care, a dichotomous threshold was chosen with <40 mg/L (lower admission CRP) and ≥40 mg/L (CRP-elevated, indicating increased disease severity). 14

Data sources
CRP was measured at hospital admission and transcribed from patients' medical records. There was no attempt to standardize the CRP assay between sites. A standardized case reporting form was used for all hospital sites. Data were transferred to King's College London in anonymous format for statistical analysis.

Graphical data analysis
Using the test cohort, the distribution of CRP was examined graphically and stratified by age. Finite two-component and three-component Gaussian mixture models were fit to CRP, representing two and three latent classes, respectively. The theoretical distribution from these models was compared with the empirical data, and the thresholds between classes in the two-class and three-class models were examined. The normality assumptions were assessed visually.

Statistical analysis
Primary analysis: mixture modelling analysis The empirical data from the test cohort were fit to a Gaussian mixture model with one, two or three components using an expectation-maximization algorithm (to refine the starting values) then maximum likelihood estimation (Stata routine 'fmm'). The models were compared using the Akaike information criterion (AIC) and the thresholds were determined by the posterior probability of belonging to the two or three class models.
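The mixture-modelling step can be illustrated with a short, self-contained sketch. It is not the Stata 'fmm' routine used in the study: it is a plain expectation-maximization fit of a two-class Gaussian mixture written with the Python standard library, run on simulated Ln(CRP) values. The simulated class means, spreads and proportions, and the helper names `fit_two_class` and `class_threshold`, are assumptions made purely for illustration.

```python
import math
import random
import statistics


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_class(xs, n_iter=150):
    """Fit a two-component univariate Gaussian mixture by EM.

    Returns (weights, means, sds, log_likelihood).
    """
    n = len(xs)
    ordered = sorted(xs)
    means = [ordered[n // 4], ordered[(3 * n) // 4]]  # quartiles as crude starting values
    spread = statistics.pstdev(xs)
    sds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability of each class for every observation
        resp = []
        for x in xs:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in (0, 1)]
            total = p[0] + p[1]
            resp.append((p[0] / total, p[1] / total))
        # M-step: update the class weights, means and standard deviations
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sds[k] = max(math.sqrt(var), 1e-3)
    loglik = sum(
        math.log(sum(weights[k] * normal_pdf(x, means[k], sds[k]) for k in (0, 1)))
        for x in xs
    )
    return weights, means, sds, loglik


def class_threshold(weights, means, sds, steps=2000):
    """First grid point between the class means where the upper class is more likely."""
    width = (means[1] - means[0]) / steps
    for i in range(steps + 1):
        x = means[0] + i * width
        lower = weights[0] * normal_pdf(x, means[0], sds[0])
        upper = weights[1] * normal_pdf(x, means[1], sds[1])
        if upper / (lower + upper) >= 0.5:
            return x
    return None


# Simulated Ln(CRP) values (hypothetical parameters, not study data)
random.seed(2020)
log_crp = [
    random.gauss(math.log(10), 0.6) if random.random() < 0.35
    else random.gauss(math.log(110), 0.7)
    for _ in range(1000)
]

# One-class (null) model: a single Gaussian with 2 parameters
mu_one, sd_one = statistics.fmean(log_crp), statistics.pstdev(log_crp)
loglik_one = sum(math.log(normal_pdf(x, mu_one, sd_one)) for x in log_crp)
aic_one = 2 * 2 - 2 * loglik_one

# Two-class model: 2 means + 2 standard deviations + 1 free weight = 5 parameters
weights, means, sds, loglik_two = fit_two_class(log_crp)
aic_two = 2 * 5 - 2 * loglik_two

threshold_log = class_threshold(weights, means, sds)
threshold_mg_l = math.exp(threshold_log)
print(f"AIC one-class {aic_one:.1f}, two-class {aic_two:.1f}")
print(f"Class threshold: {threshold_mg_l:.1f} mg/L")
```

In this sketch the threshold is the point at which the posterior probability of the higher class first reaches 0.5; a three-class model would add a second such crossing point.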

Secondary analysis: prognostic modelling analysis
To assess differing thresholds for CRP as a prognostic factor of outcome, a series of mixed-effects multivariable Cox proportional hazards models for time to mortality were fit, in a method consistent with the COPE study primary analysis. 13 The model was adjusted for elevated CRP using a level of ≥40 mg/L, in addition to: patient age group (<65, 65-79, ≥80 years old), sex, diabetes (yes/no), hypertension (yes/no), coronary artery disease (yes/no) and kidney disease [estimated glomerular filtration rate (eGFR) <60 ml/min/1.73 m²]. Dichotomized thresholds of CRP were compared within a range of 10 mg/L to 100 mg/L in 5-mg/L intervals (≥10 mg/L, ≥15 mg/L, etc). Model performance was evaluated and compared using Harrell's C and the AIC. 15 We compared the dichotomized thresholds against linear CRP and Ln(CRP) (as CRP is known to be skewed) as benchmarks of performance. This method was chosen as dichotomizing results can lead to a loss of information, resulting in a lower predictive power compared with using a continuous measure. 16 Bootstrapping was used to construct 95% confidence intervals for differences in model performance between the best-fitting models. Bootstrapping was stratified by site with 1000 replications for each comparison. A complete case analysis was used in all cases due to negligible missing data (<4%).
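As a rough illustration of how candidate cut-offs can be compared on Harrell's C, the sketch below computes the concordance statistic directly from its pairwise definition. It is a deliberate simplification and not the authors' analysis: the CRP indicator alone is used as the risk score instead of the linear predictor from a mixed-effects multivariable Cox model, no bootstrapping is performed, and the survival data are simulated. The function name `harrell_c` and all simulation parameters are assumptions.

```python
import math
import random


def harrell_c(times, events, scores):
    """Harrell's C: share of usable pairs in which the patient who died earlier
    has the higher risk score (tied scores count as half-concordant)."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is usable only if the earlier time is an observed death
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / usable


# Simulated cohort (hypothetical parameters, not study data)
random.seed(1)
n_patients = 250
crp = [math.exp(random.gauss(4.0, 1.0)) for _ in range(n_patients)]
times, events = [], []
for value in crp:
    hazard = 0.03 * math.exp(0.8 * (math.log(value) - 4.0))  # risk rises with Ln(CRP)
    death_time = random.expovariate(hazard)
    discharge_time = random.uniform(3.0, 30.0)  # censoring at discharge
    times.append(min(death_time, discharge_time))
    events.append(death_time <= discharge_time)

# Dichotomized thresholds from 10 to 100 mg/L in 5-mg/L steps
c_by_threshold = {
    cut: harrell_c(times, events, [1 if value >= cut else 0 for value in crp])
    for cut in range(10, 105, 5)
}
# Continuous benchmark: Ln(CRP) used directly as the risk score
c_log = harrell_c(times, events, [math.log(value) for value in crp])

best_cut = max(c_by_threshold, key=c_by_threshold.get)
print(f"Best dichotomized cut-off in this simulation: >= {best_cut} mg/L")
print(f"Harrell's C for Ln(CRP): {c_log:.3f}")
```

Because a binary score produces many tied pairs, its C statistic is pulled towards 0.5 relative to a continuous score; this is one way of seeing the information loss from dichotomization that the benchmarks are meant to quantify.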

Validation cohort (cohort 2)
To provide an indication of whether the original results from Cohort 1 were likely to be generalizable to a wider group of patients with COVID-19, the analysis was repeated on an independent validation sample (Cohort 2). Using the validation cohort, two-class and three-class mixture models were estimated using the empirical data without restriction. On evidence of overfitting, to assess the additional benefit of a very elevated category for CRP, the validation cohort was fitted using a three-class mixture model, with the class-two mean fixed using the validation cohort two-class mixture model mean.

Comparison of the prognostic effect of CRP
Using a mixed-effects multivariable Cox regression, the effect of elevated CRP was reported as an adjusted hazard ratio (aHR) with its 95% confidence interval (95% CI), alongside the corresponding estimates for linear CRP and Ln(CRP).

Results
The study included 1835 patients across Cohorts 1 and 2, who were drawn from 12 hospitals in the UK and one in Italy. Of the total study participants, 26.4% (n = 484) died in hospital, varying between sites from 13.3% to 42.9%. Table 1 compares patients who died in hospital with those who survived, split into Cohort 1 (n = 1564) and Cohort 2 (n = 271). In Cohort 1, 27.2% died and the median CRP level for those who died was 115 mg/L (interquartile range: 63 mg/L-191 mg/L) compared with 69 mg/L (29 mg/L-140 mg/L) among those who survived. For patients with CRP ≥40 mg/L, mortality was 31.9% compared with 15.0% for patients with CRP <40 mg/L. Median follow-up time (time to mortality or discharge) was 13 days (6-22 days).
Cohort 2 experienced 21.8% mortality. Among those who died, median CRP level was 86 mg/L (48 mg/L-173.5 mg/L) compared with 53 mg/L (16 mg/L-109 mg/L) among those who survived. For patients with CRP ≥40 mg/L, mortality was 28.6% compared with 10.4% for patients with CRP <40 mg/L. The median follow-up time (time to death or discharge) was 10 days (5-18 days).

Results of cohort 1 (n = 1564)
Distribution of CRP On graphical examination, the distribution of Ln(CRP) exhibited negative skew, with two 'peaks' suggestive of a bimodal distribution, see Figure 1, Plot (i), and Figure S1, available as Supplementary data at IJE online, Plots (i, ii). The distribution of Ln(CRP) was observed in age-stratified groups of <65, 65-79, and ≥80 years. On inspection, the age-stratified distributions did not differ from that of the complete dataset.

Primary analysis: mixture modelling analysis
Following the two suggested peaks in the examination of the Ln(CRP) distribution, a two-latent class finite mixture model was fitted. It appeared to graphically fit the data when examined against the empirical distribution in Figure 1, Plot (i). This was supported by a comparison with the one-class (or null) model, which displayed a higher AIC (4739 compared with 4524). The simple threshold at which the predicted probability of belonging to class two became greater than that of belonging to class one was 38 mg/L. This will be implemented as ≥40 mg/L herein, to account for the imprecision of the measurement of CRP and also for ease of recall in a busy clinical setting.
The three-class finite mixture model fit slightly better than the two-class finite mixture model (AIC of 4484), with probability of class-one membership highest in the range 0-14 mg/L, class-two in the range 15-120 mg/L and class-three for values of CRP ≥120 mg/L, see Figure 1, Plot (iii).
The primary analysis proposed a single optimal threshold of CRP ≥40 mg/L to indicate elevated CRP.
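For readers less familiar with finite mixture models, the quantities used in this primary analysis can be written out explicitly. The notation (x, K, π_k, μ_k, σ_k, p, x*) is introduced here for illustration only and assumes the models are fitted to x = Ln(CRP), as the plots suggest; it is a sketch of the standard formulation, not a transcription of the fitted Stata models.

```latex
\begin{align}
f(x) &= \sum_{k=1}^{K} \pi_k \,\phi\!\left(x;\mu_k,\sigma_k^2\right),
      \qquad \sum_{k=1}^{K}\pi_k = 1, \\
P(k \mid x) &= \frac{\pi_k\,\phi\!\left(x;\mu_k,\sigma_k^2\right)}{f(x)}, \\
\mathrm{AIC} &= 2p - 2\ln\hat{L}, \qquad p = 3K - 1 .
\end{align}
% For K = 2, the simple threshold x^{*} is where the two posterior
% probabilities are equal:
\begin{equation}
\pi_1\,\phi\!\left(x^{*};\mu_1,\sigma_1^2\right)
  = \pi_2\,\phi\!\left(x^{*};\mu_2,\sigma_2^2\right),
\qquad \mathrm{CRP}^{*} = e^{x^{*}} .
\end{equation}
```

Here φ is the normal density, π_k the class proportions and p the number of free parameters (K means, K variances and K − 1 proportions), so a lower AIC for K = 2 than for K = 1 indicates that the extra class is supported by the data.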

Secondary analysis: prognostic modelling
The time-to-mortality analysis included 1502 participants (96%) in the complete case population. A cut-off of ≥65 mg/L appeared to fit best in the sample on all measures (Harrell's C statistic of 0.7068, AIC of 5124) (Table 2) after fitting different binary categorizations of CRP in a Cox model for time to mortality. Differences in measures of goodness of fit were small, especially between cut-offs in the range of ≥40 mg/L to ≥90 mg/L. CRP as a continuous Ln(CRP) measure performed considerably better (Harrell's C statistic of 0.7157, AIC of 5001), with little improvement on this using a linear scale (Harrell's C statistic of 0.7040, AIC of 5024). Regarding bootstrapped differences in the measures of goodness of fit between a cut-off of ≥40 mg/L and the marginally better performing cut-off of ≥65 mg/L, no difference in performance was found (Table 3). There was evidence that cut-offs of both ≥40 mg/L and ≥65 mg/L outperformed a cut-off of ≥10 mg/L, the upper limit of the normal range for CRP. 17 It should be noted that Ln(CRP) was the optimal parameterization compared with either ≥40 mg/L (-135.1 AIC, bootstrapped 95% CI -210.4 to -65.1) or ≥65 mg/L (-123.5 AIC, bootstrapped 95% CI -197.6 to -55.8).

Results of cohort 2 (n = 271)
Distribution of CRP Cohort 2 included 271 new patients from eight hospital sites: 85 (31.4%) were fully independent, recruited from two new hospital sites; 186 were pseudo-independent, being newly recruited patients from original hospital sites in Cohort 1. There was no difference in the demographics, comorbidities and distribution of CRP seen in Cohort 2 and Cohort 1 ( Table 1).

Fitting finite mixture models
The empirical distribution of the Cohort 2 Ln(CRP) appeared, graphically, to have a reasonably similar pattern to Cohort 1, see Figure 1, Plot (ii). The two-class finite mixture model gave a consistent threshold (CRP ≥41 mg/L). The unrestricted three-class finite mixture model exhibited likely overfitting to the data on examination of the distributions. Inconclusive evidence for the additional second cut-off was found with the class three distribution entirely contained within class two, with a large variance. There was no additional benefit for fixing the central distribution mean and allowing the mixture proportion to vary, but this can be seen graphically in Figure 1, Plot (iv). The simple threshold between class one and class two was ≥41 mg/L. The time-to-mortality analysis included 208 of the participants (77%) with complete data. Fitting different binary categorizations of CRP in a Cox model for time to mortality gave a CRP cut-off of ≥40 mg/L as the best fitting model (Harrell's C statistic of 0.7187, AIC of 424), outperforming the Ln(CRP) model (Harrell's C statistic of 0.7014, AIC of 427), see Table 2. There was no evidence of difference in performance between cut-offs of ≥65 mg/L and ≥40 mg/L, nor between ≥40 mg/L and Ln(CRP) on examination of bootstrapped 95% CI in Supplementary

Key results
CRP reasonably followed a bimodal distribution using data from two independent cohorts. There was inconclusive evidence of a trimodal distribution; although the AIC metric suggested it fit better, on graphical examination there appeared to be overfitting. In an analysis of 1835 patients across 13 hospital sites, a binary cut-off for CRP as a prognostic factor for inpatient death from COVID-19 appeared to have similar predictive power compared with treating it as a linear or Ln(CRP) measure. In addition, a cut-off value to indicate disease severity is simpler to use in a clinical setting than a linear predictor. These findings support the use of a simple binary threshold for CRP in daily clinical medicine. These results are well aligned with many published analyses in COVID-19 which have already employed a binary cut-off. 4,18-20 The bimodal distribution of CRP may reflect the presence of a latent class influence. Candidate variables for this latent class may include confounders that were not fully controlled for: chronic inflammatory conditions, genomic variation of the virus, genetic susceptibility of populations or other binary exposures such as Bacillus Calmette-Guérin (BCG) vaccination status. 21-23 The association of higher CRP with worse outcomes may be due to the severity of the disease, consistent with the 'cytokine storm' theory of COVID-19, where the innate immune system is activated, releasing TNF-alpha, IL-6 and IL-1. Elshazli et al. found CRP to be a valid biomarker of death from COVID-19 when examining a range of haematological and immunological markers. IL-6 was found to be most predictive (OR = 13.87) of death, and CRP the next best marker (OR = 7.09). 24 However, IL-6 is not routinely available to clinicians; because IL-6 triggers the transcription of CRP, CRP is a better candidate tool for front-line hospital usage. 25
In the same Elshazli paper, a threshold level of 38.2 mg/L was demonstrated to have the best sensitivity and specificity, which fits well with our findings; this was also found within a recent Cochrane Diagnostic Test Accuracy review. 26 In addition, an elevated CRP may not be attributable to COVID-19 alone and may represent concomitant pathology such as secondary bacterial pneumonia. Although co-infection is well known in other viral respiratory illnesses, the rate in COVID-19 has been found to be far lower, being present in around 5.9% of the general COVID-19 hospital population and 8.1% of those with critical illness. 18 The data presented here support a single threshold, and whilst there was argument for competing cut-offs of ≥40 mg/L, ≥65 mg/L or greater, the single cut-off is consistent with other studies. 8,24 In addition, it would be clearer and safer to offer a conservative approach using the lower value of CRP, as a higher threshold may falsely reassure clinicians.
There is a need for simple tests to aid clinical management, and the behaviour of CRP in COVID-19 may provide useful immediate risk stratification as to who may have a poor outcome. The threshold of CRP ≥40 mg/L offered a high negative predictive value, so patients presenting with a low CRP are unlikely to exhibit disease progression, and a high sensitivity, which might lead to opening discussions with patients and their carers about the possible course of the disease. This may assist with early resource planning around the potential for critical care support, and may help guide rapid safe discharge from acute hospitals. 5 Although the results within this paper give a population-based cut-off, any interpretation and management plan must be made on an individual patient basis, with clinicians using CRP in context of clinical history, examination and investigation and noting that the threshold offered a low positive predictive value. Beyond clinical predictive value, this model may be useful for monitoring the outcomes of treatments; for example, in a trial of tocilizumab, CRP monitoring was used as a marker of efficacy. 26
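The relationship between sensitivity and the predictive values mentioned above can be made concrete with a small worked example. The counts in the sketch below are invented for illustration (a notional 1000 admissions) and are not taken from the study data; the function name `two_by_two_metrics` is likewise an assumption.

```python
def two_by_two_metrics(tp, fp, fn, tn):
    """Diagnostic accuracy measures from a 2x2 table.

    tp: CRP >= 40 mg/L and died      fp: CRP >= 40 mg/L and survived
    fn: CRP <  40 mg/L and died      tn: CRP <  40 mg/L and survived
    """
    return {
        "sensitivity": tp / (tp + fn),  # deaths correctly flagged by the threshold
        "specificity": tn / (tn + fp),  # survivors correctly below the threshold
        "ppv": tp / (tp + fp),          # risk of death when CRP is elevated
        "npv": tn / (tn + fn),          # chance of survival when CRP is low
    }


# Hypothetical counts: 700 patients with elevated CRP, 300 without
metrics = two_by_two_metrics(tp=224, fp=476, fn=45, tn=255)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With counts of this shape, most deaths fall in the elevated group (high sensitivity) and most patients with low CRP survive (high negative predictive value), yet most patients with elevated CRP also survive (low positive predictive value), which is why the threshold is better suited to prompting closer observation than to predicting death in an individual.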

Strengths and limitations
This was a large study that included participants admitted to 13 hospital sites. The demographics, case mix and mortality are similar to other larger studies reported within the UK, increasing the findings' generalizability. 20 We have also shown good replication between the two UK-wide cohorts. However, caution should be applied to the threshold reported for CRP, as studies identifying optimal cut-offs may be subject to selection bias and may not be replicable. 27 Using a threshold of ≥40 mg/L offered a high sensitivity and negative predictive value but low positive predictive value. A limitation of this study is that, due to the urgent nature of research data collection in a pandemic, disease severity on admission was only assessed using CRP without collection of circulating lymphocytes, interleukin-6, procalcitonin, serum lactate and viral load, all of which may also contribute to disease severity. 28

Interpretation
A simple threshold of ≥40 mg/L should be used within clinical practice to guide disease severity and likely disease progression. Future studies should analyse using this simple threshold.

Generalizability
These findings support the routine assessment of serum CRP as an adjunct in the early diagnosis and assessment of illness severity of hospitalized patients with COVID-19. We suggest that CRP ≥40 mg/L on admission may indicate an increased risk of disease progression and death, warranting an enhanced level of discussion and clinical support.

Conclusions
We have demonstrated that CRP follows a bimodal distribution in hospitalized patients with COVID-19. This requires further exploration to discover the latent class effect of unobserved factors influencing the distribution of CRP. A CRP of ≥40 mg/L on admission to hospital should be seen as a reliable indicator of disease severity and increased risk of death. We recommend clinicians use this cut-off as a prognostic indicator only, in conjunction with an individualized clinical assessment, frailty assessment and incorporating a person's wishes and values, to make early decisions about enhanced observation, critical care support and advanced care planning.

Supplementary data
Supplementary data are available at IJE online.

Data availability
Data are available on request from the corresponding author after submission of a statistical analysis plan, after approval from the COPE Study Investigators.

Funding
This study received no specific funding. The study was partially supported through the NIHR Maudsley Biomedical Research Centre at the South London and Maudsley NHS Foundation Trust in partnership with King's College London (B.C.).