Relationship Between Bacterial Strain Type, Host Biomarkers, and Mortality in Clostridium difficile Infection

Clostridium difficile genotype predicts 14-day mortality in 1893 enzyme immunoassay–positive/culture-positive adults. Excess mortality correlates with genotype-specific changes in biomarkers, strongly implicating inflammatory pathways as a major influence on poor outcome. Polymerase chain reaction ribotype 078/ST 11 (clade 5) is associated with high mortality; ongoing surveillance remains essential.

We aimed therefore to investigate whether the genotype of C. difficile clinical isolates from multilocus sequence typing (MLST) was associated with mortality and severity biomarkers using a large population-based database of CDI cases and to explore associations between strain-specific effects on host biomarkers and mortality to provide insights into infection pathogenesis.

METHODS
Oxford University Hospitals (OUH) NHS Trust provides >90% of hospital care and all acute services in Oxfordshire (approximately 600 000 people). It includes 2 large acute teaching hospitals and 1 specialist orthopedic hospital in Oxford and 1 district hospital 35 miles north. The OUH microbiology laboratory tests all stool samples from the county, including those from other healthcare facilities/primary care. From 12 September 2006 to 21 May 2011, all unformed stools submitted for C. difficile toxin testing, positive by enzyme immunoassay (EIA) and with sufficient sample remaining, were routinely cultured and MLST typed [1]. During this period, infection control policy required all inpatients with diarrhea (≥3 unformed stools within 24 hours) to have samples sent for EIA testing and to initiate vancomycin treatment empirically, continuing for 14 days if CDI was confirmed. Additionally, from May 2007, all unformed samples from those aged ≥65 years were routinely EIA tested following UK policy.
C. difficile MLST data were anonymously linked to OUH hospital admissions/discharges, mortality, and laboratory test results from the Infections in Oxfordshire Research Database (IORD) through 21 August 2011 [17]. Admissions to other much smaller regional (including psychiatric/community) hospitals were not included, although samples taken at these locations were identifiable. Rates were calculated using overnight stays defined by the UK KH03 occupancy statistic. IORD has Research Ethics Committee (09/H0606/85) and UK National Information Governance Board (5-07(a)/2009) approval as an anonymized database without individual informed consent.
The primary outcome was 14-day mortality after EIA-based CDI detection in adults aged ≥18 years (excluding repeat EIA-positive cases within 14 days; censoring follow-up at 14 days). EIA-negative samples were included as controls (excluding repeat negatives within 14 days and any sample taken after or within 21 days before the first EIA positive). See Supplementary Material for details.
The primary exposure was type of CDI, categorized by EIA/culture status or C. difficile phylogenetic clade from MLST [1]. CDI-associated MLST STs correlate reasonably closely with ribotype [18] and can be grouped by evolutionary relationships into clades [10]. These clades persist despite homologous recombination and have the same phylogenetic structure with MLST or whole-genome sequences [19], suggesting that strains within a clade may behave more similarly in humans. Adjusted mortality risks in each clade and in STs with >20 cases were estimated using Cox models, with robust variance adjustment for multiple episodes per patient [20]. EIA-negative controls comprised the reference category so that risks reflected CDI-attributable mortality. Independent predictors were identified using backward selection with the Akaike information criterion [21], allowing nonlinear effects of continuous factors [22]. Exposures considered were demographics, sample characteristics, previous hospital exposure, and previous healthcare-associated infections (Table 1) (antibiotic exposure not available). The impact of clade on the 15 biomarkers available for >50% of cases within −3 to +1 days of sample collection was estimated using normal regression on Box-Cox-transformed values. Associations between biomarkers and 14-day mortality were estimated using Cox models with multiple imputation (see Supplementary Material).
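As a rough illustration of the biomarker analysis described above, the following Python sketch applies a Box-Cox transformation and compares cases with controls on the transformed scale. It is a minimal sketch under stated assumptions, not the authors' implementation: the data are simulated, all function names are hypothetical, lambda is chosen by a simple grid search over the normal profile log-likelihood, and no covariate adjustment is made.

```python
import math
import random
import statistics


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def profile_log_likelihood(values, lam):
    """Normal profile log-likelihood of lambda, up to an additive constant."""
    n = len(values)
    transformed = box_cox(values, lam)
    mean = sum(transformed) / n
    variance = sum((t - mean) ** 2 for t in transformed) / n
    jacobian = (lam - 1.0) * sum(math.log(v) for v in values)
    return -0.5 * n * math.log(variance) + jacobian


def choose_lambda(values):
    """Pick lambda on a coarse grid (-2.0 to 2.0 in steps of 0.1)."""
    grid = [i / 10.0 for i in range(-20, 21)]
    return max(grid, key=lambda lam: profile_log_likelihood(values, lam))


def standardized_difference(cases, controls, lam):
    """Mean difference (cases minus controls) on the transformed scale,
    expressed in units of the pooled standard deviation."""
    t_cases = box_cox(cases, lam)
    t_controls = box_cox(controls, lam)
    n1, n2 = len(t_cases), len(t_controls)
    pooled_var = (
        (n1 - 1) * statistics.variance(t_cases)
        + (n2 - 1) * statistics.variance(t_controls)
    ) / (n1 + n2 - 2)
    diff = statistics.mean(t_cases) - statistics.mean(t_controls)
    return diff / math.sqrt(pooled_var)


# Simulated, right-skewed "biomarker" values (NOT data from the study).
random.seed(1)
controls = [random.lognormvariate(3.0, 1.0) for _ in range(500)]
cases = [random.lognormvariate(3.5, 1.0) for _ in range(500)]

lam = choose_lambda(controls + cases)
diff = standardized_difference(cases, controls, lam)
print(f"lambda = {lam:.1f}, standardized difference = {diff:.2f}")
```

Standardizing the difference puts biomarkers measured in different units on a common scale, which is what allows biomarker excesses to be compared with one another and plotted against mortality hazard ratios.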
RESULTS
Serum sodium was slightly but significantly lower in EIA-positive cases vs EIA-negative controls (P = .006) and in clade 2 (Figure 4F). Although clades 2 and 5 had the highest mortality, if anything, sodium was increased in clade 5 CDI (P = .08 vs clade 2), leading to no overall association between differences in sodium and excess mortality risks across the different clades (rho = 0.02). Hemoglobin was significantly lower in EIA-positive cases vs EIA-negative controls (P < .0001; Figure 4G), but clade-specific variation was restricted to higher hemoglobin in clade 4 (P = .05), with little association with excess mortality (rho = 0.22). Qualitatively, variation across clades in alanine aminotransferase (ALT), creatinine, estimated glomerular filtration rate [23,24], and serum potassium was similar to hemoglobin (Supplementary Figure 1, I-L). No clear associations were evident for urea or alkaline phosphatase (Supplementary Figure 1N and 1O).
Comparing associations individually for clade 1 STs (Figure 5) supported the partial surrogacy of differences in neutrophils/WBC (rho = 0.48), CRP (rho = 0.43), and eosinophils (rho = −0.45) for excess mortality risk but suggested a stronger relationship with albumin (rho = −0.47). Lack of association for other biomarker changes remained (eg, sodium rho = 0.06; Figure 5D). ST 44 was an outlier, with significantly lower albumin but similar neutrophils/CRP and mortality risk to EIA-negative controls.
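The correlations quoted above were estimated by the authors using multivariable random effects meta-analysis (see Supplementary Methods), which accounts for the uncertainty in each estimate. The Python sketch below substitutes a much simpler technique, an unweighted Pearson correlation between standardized biomarker differences and log hazard ratios, purely to show the quantity being summarized. The strain labels and all numbers are hypothetical and are not taken from the study.

```python
import math


def pearson(xs, ys):
    """Unweighted Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-strain summaries (NOT values from the paper):
# (standardized biomarker difference vs controls, hazard ratio vs controls)
strain_summaries = {
    "strain_a": (0.10, 1.2),
    "strain_b": (0.25, 1.6),
    "strain_c": (0.05, 1.5),
    "strain_d": (0.40, 2.1),
    "strain_e": (0.30, 1.3),
}

biomarker_excess = [v[0] for v in strain_summaries.values()]
# Hazard ratios are multiplicative, so correlate on the log scale.
log_hazard_ratio = [math.log(v[1]) for v in strain_summaries.values()]

rho = pearson(biomarker_excess, log_hazard_ratio)
print(f"rho = {rho:.2f}")
```

Unlike this sketch, a random effects meta-analysis weights each strain by the precision of its estimates, so sparsely sampled strains contribute less to the correlation.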
Lastly, we estimated how much of the variation in C. difficile clade-associated mortality risk was related to observed biomarker differences. As expected given large numbers, all biomarkers except ALT independently predicted 14-day mortality in addition to Table 1 factors (Supplementary Table 2). However, association strength varied substantially, with albumin, urea, eosinophils, sodium, and CRP most strongly (and creatinine/estimated glomerular filtration rate most weakly) related to mortality. Adjusting for baseline biomarkers explained 41%, 32%, and 37% of the increased mortality due to clades 1, 2, and 5, respectively (Figure 3). However, even after adjusting for these biomarker differences across C. difficile clades (Figure 4), significant mortality risk variation by clade remained (P = .03), with significantly higher mortality persisting in clade 2 (PCR ribotype 027) vs clade 1 (P = .01) CDIs.
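The text does not state how the percentage of excess mortality explained by biomarkers was calculated. One common definition, assumed here for illustration only, compares the log hazard ratio before and after adjustment. The Python sketch below uses a hypothetical function name and hypothetical hazard ratios, and may differ from the authors' calculation.

```python
import math


def proportion_explained(hr_unadjusted, hr_adjusted):
    """Share of the excess log hazard removed by adjustment.

    Assumed definition: 1 - log(HR_adjusted) / log(HR_unadjusted).
    Only meaningful when the unadjusted hazard ratio exceeds 1.
    """
    if hr_unadjusted <= 1.0:
        raise ValueError("no excess risk to explain (unadjusted HR <= 1)")
    return 1.0 - math.log(hr_adjusted) / math.log(hr_unadjusted)


# Hypothetical hazard ratios vs controls (NOT values from the paper).
hr_before = 2.0  # adjusted for baseline factors only
hr_after = 1.5   # additionally adjusted for biomarkers

share = proportion_explained(hr_before, hr_after)
print(f"{100 * share:.0f}% of the excess log hazard is explained")
```

An alternative convention works on the excess hazard scale, (HR − 1), and gives somewhat different percentages, so the scale should always be reported alongside the figure.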

DISCUSSION
In the largest population-based study of genotype and CDI severity to date, we have exhaustively investigated the relationships between strain types, biomarkers, other risk factors, and mortality. We have demonstrated unequivocally that PCR ribotype 027/NAP1/BI/ST 1 (clade 2) strains have been, and continue to be, associated with greater attributable mortality. This excess risk persists even after adjusting for large differences in severity biomarkers. Further, PCR ribotype 078 (clade 5) CDI has attributable mortality at least as great as PCR ribotype 027/ST 1, in agreement with 1 previous study [13] but in contrast with another [6]. Although PCR ribotype 078/clade 5 strains are currently present at low frequency, prospective surveillance demonstrates their continued expansion [25]; ongoing monitoring therefore remains essential.

Figure 4. Variation in 7 biomarkers at diagnosis according to Clostridium difficile clade and association with mortality. A, Neutrophils (×10^9/L). B, White cell count (×10^9/L). C, C-reactive protein (mg/L). D, Eosinophils (×10^9/L). E, Albumin (g/dL). F, Sodium (mmol/L). G, Hemoglobin (g/dL). For each biomarker, left-hand panels show mean (95% confidence interval) values at sample collection for enzyme immunoassay (EIA)-negative controls vs EIA-positive cases; then subdividing EIA-positive cases into culture-negative, not cultured, and culture-positive cases; then subdividing culture-positive cases by clade. For each clade and EIA-positive/culture-negative cases, the right-hand panels plot the standardized adjusted mean difference vs EIA-negative controls from the left-hand panel (on the Box-Cox-transformed scale, ± standard error) against the adjusted hazard ratio for mortality vs EIA-negative controls from Table 1. The correlation, ρ, between biomarker and mortality risk excesses was estimated using multivariable random effects meta-analysis (see Supplementary Methods). Diagonal lines show the line of best fit (ie, the best prediction of excess mortality for any given excess in biomarkers compared with EIA-negative controls). If differences in biomarkers across clades completely explained mortality differences (ie, the biomarker was a perfect surrogate for mortality), all the points would lie on the diagonal line. The closer the points are to the diagonal line, the stronger the relationship between biomarker differences and excess mortality risks. Points lying far from the diagonal line indicate a mismatch, either high excess mortality with little difference in biomarkers from EIA-negative controls or vice versa. Abbreviations: CRP, C-reactive protein; cult, culture; EIA, enzyme immunoassay; OUH, Oxford University Hospitals; SE, standard error.

Figure 5. Impact of Clostridium difficile clade and individual sequence type (ST) on biomarkers compared with mortality. A, Neutrophils (×10^9/L). B, C-reactive protein (mg/L). C, Albumin (g/dL). D, Sodium (mmol/L). For clades 2-5 (labelled C2, C3, C4, C5) and each clade 1 ST with >20 isolates, the panels plot the standardized adjusted mean difference vs enzyme immunoassay (EIA)-negative controls (on the Box-Cox-transformed scale, ± standard error) against the hazard ratio for mortality vs EIA-negative controls, adjusted as in Table 1. The correlation, ρ, between biomarker and mortality risk excesses across STs/clades was estimated using multivariable random effects meta-analysis (see Supplementary Methods). Diagonal lines show the line of best fit (ie, the best prediction of excess mortality for any given excess in biomarkers compared with EIA-negative controls), together with a 95% credibility region indicated by the shaded region. If a biomarker was a perfect surrogate for mortality (ie, differences in biomarkers across STs/clades completely explained mortality differences), all the points would lie on the diagonal line. The closer the points are to the diagonal line, the stronger the relationship between biomarker differences and excess mortality risks. Points lying far from the diagonal line indicate a mismatch, either high excess mortality with little difference in biomarkers from EIA-negative controls or vice versa. All clade 1 STs lying outside the 95% credibility region on any of the 4 panels are labelled on each panel; ST 58, which had high mortality in [6], is also labelled. Abbreviations: CRP, C-reactive protein; cult, culture; EIA, enzyme immunoassay; HR, hazard ratio; SE, standard error; ST, sequence type.
Comprehensive simultaneous characterization of the impact of different C. difficile strains on biomarkers and mortality, not previously described to our knowledge, has enabled us to show that strain-type-specific excess mortality risk correlates most closely with strain-type-specific changes in inflammatory biomarkers. Conceptually, the framework behind these analyses is similar to that for assessing surrogacy of intermediate outcomes for clinical outcomes (eg, blood pressure for cerebrovascular disease) [26]. Some biomarkers, notably renal-related biomarkers (creatinine, estimated glomerular filtration rate [eGFR]), were prognostic for mortality but did not vary significantly across CDI cases/controls or clades (ie, were acting independently of CDI). Others were prognostic and differed significantly between CDI cases and EIA-negative controls but not across clades. The most prognostic marker, albumin, fell into this category, possibly because of large variability. Biomarkers in the most interesting group, particularly neutrophils/WBC, CRP, and eosinophils, were prognostic and demonstrated evidence of partial surrogacy (ie, greater differences in baseline biomarkers between clades translated into greater differences in 14-day mortality). This has 2 consequences. First, quantitative traits like these biomarkers may provide greater power than time-to-event outcomes to detect effects of polymorphisms in genome-wide association studies. Second, surrogate markers indicate causal mechanisms of bacterial pathogenesis and may identify future therapeutic areas for investigation. Our results implicate inflammatory pathways as the major influence on poor outcome after CDI.
Although we found strong associations between strain-specific biomarkers and mortality overall, we also discovered intriguing exceptions that, as exploratory findings, may indicate important areas for future investigation. Specific genotypes within the large, heterogeneous clade 1, notably ST 44, had particularly low 14-day mortality in post hoc analyses. Although ST 44 differs by only 1 nucleotide on MLST from ST 10, respective 14-day mortality was 3% and 11%, the latter typical of clade 1 overall (12%). However, both STs are consistently identified as PCR ribotype 015 [10]. They differ by >1500 single nucleotide polymorphisms across the genome [19] and may also differ in their accessory genomes, suggesting possible areas for future study. In contrast, our data suggest ST 49 (PCR ribotype 014) could be a more severe clade 1 genotype; this is an emergent clone in the United Kingdom [25] and should be monitored closely. Another intriguing finding is the major disconnect between the impact of clade 3 CDI on neutrophils/WBC/CRP and mortality. Similarities between clades 3 and 5 in severity biomarkers might be expected, as the receptor-binding domain of their pathogenicity locus tcdB gene (encoding one of the major known clostridial toxins) is highly genetically similar and their tcdC sequences share the same protein-truncating nucleotide substitution [10]. The latter is phenotypically equivalent to the single nucleotide deletion in the clade 2/PCR ribotype 027 tcdC, which causes a protein-truncating frameshift [10] and possibly leads to hypervirulence through increased toxin expression [27,28] (although recent studies have questioned this [29]). Clades 2, 3, and 5 are also binary toxin positive [10] (in contrast with clades 1 and 4).
However, the substantially lower mortality in clade 3 vs clade 5 highlights the importance of other, as yet undetermined, virulence or host factors to clinical outcomes [30] and suggests that increased toxin production alone in PCR ribotype 078 cannot account for its virulence.
Overall, we found 30%-40% of differences in mortality risk between strains were due to differences in biomarkers at diagnosis. However, in contrast with a recent much smaller study [31], even after adjusting for biomarker differences (and other factors) significant mortality differences remained across clades; this suggests that further microbial virulence determinants remain to be identified. Of note, the biomarker-adjusted effects of strain (reported in [31]) adjust away any effect of strain on outcome mediated through biomarkers, effects that we show to be substantial (Figure 4).
Our study has some limitations. The EIA assay used for case ascertainment has suboptimal sensitivity (91.7% in [32]), similar to other toxin EIAs [32,33]. However, because of widespread concerns about sensitivity, for most of the study (through December 2009), multiple diarrheal samples were submitted from each patient, simultaneously or serially (500-1100 EIA tests performed monthly), reducing the chance of completely missing symptomatic CDI. One consequence is that we almost certainly identified false positives, perhaps explaining some EIA-positive/culture-negative cases [34]. To reduce the impact of false negatives, our controls only included EIA-negative tests >21 days before the first EIA-positive result. During the study, there were 9.2 EIA-positive CDIs/10 000 overnight stays in inpatients, compatible with the 3.8-9.5 EIA-positive CDIs/10 000 overnight stays typical in endemic settings [35]. Overall, 14-day mortality attributable to EIA-positive CDI was 7.7%, similar to the 8% in a meta-analysis of 10 975 cases from 27 studies after 2000 [36] and 11% in another large study [37], also suggesting generalizability. By necessity, analyses were limited to available electronic data, which did not include previous/concomitant antibiotics, specific comorbid conditions, or causes of death. Although antibiotics are undoubtedly critical for developing CDI, given the lack of impact of adjusting for other important risk factors on strain-mortality associations, it is plausible that further adjustments would have had little further effect. Although theoretically C. difficile-related deaths should provide a more accurate measure of attributable mortality, practically attributing causes is subjective and usually unaudited. In contrast, all-cause mortality is objective, and differences in early mortality between EIA-positive cases vs EIA-negative diarrhea controls should be directly or indirectly CDI related.
Although previous studies have considered 30-day mortality [5], reasonable reinfection rates between 14 and 30 days [38] influenced our prespecified choice of primary outcome. However, strain differences were similar at 30 days, and survival curves were parallel subsequently (Figure 2).
Our study also has important strengths. First is its comprehensive scope, including cases from an entire region over almost 5 years, including 3 hospitals providing acute services and numerous secondary/primary care providers. Second, it included 1893 EIA-positive/culture-positive strain-typed cases, approximately double the largest previous studies (n = 1008 [5]; n = 715 [13]). Study size becomes increasingly important when exploring differences between strains; 700-800 cases are needed to detect an 8% absolute mortality increase (as observed between clade 1 vs clade 2) with 80% power. Inadequate power therefore likely explains why smaller studies failed to identify associations between PCR ribotype 027 and severe outcomes (eg, n = 123 [7]; n = 128 [39]; n = 236 [40]). We were also able to compare strains at the clade/ST level, whereas most previous studies have only compared 027 vs non-027 strains [5], pooling 4 heterogeneous clades. We were unable to confirm previous reports [6] of poorer outcomes with PCR ribotypes 018 (ST 17 [10]) and 056 (ST 34/58 [10]), although longer-term mortality was similar in clade 4 (PCR ribotype 017/ST 37) and clade 2 (PCR ribotype 027/ST 1) as previously reported [11]. Our data confirm that the lack of the large clostridial toxin A (tcdA) in these clade 4 cases does not lead to less severe outcomes. We did not find any evidence of greater year-on-year mortality reductions in PCR ribotype 027/ST 1 (clade 2) compared with other clades [39], suggesting overall improvements in outcome are more likely due to better patient management than strain effects. The other mortality risk factors we identified broadly agree with previous studies [16], mostly reflecting disease severity or subsequent management; however, unlike previous studies, we have adjusted for the potential confounding due to bacterial type.
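The sample size statement above can be illustrated with a standard two-proportion calculation. The Python sketch below is a hedged approximation, not the authors' calculation: it assumes a baseline 14-day mortality of 12% (the clade 1 figure quoted earlier), an 8% absolute increase, a 2-sided alpha of .05, equal group sizes, and the normal approximation. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def cases_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate cases needed in each of two equal-sized groups to
    detect a difference between proportions p1 and p2 (2-sided test)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


baseline = 0.12             # assumed clade 1 mortality, from the text
increased = baseline + 0.08  # 8% absolute increase, from the text

n_per_group = cases_per_group(baseline, increased)
print(f"{n_per_group} per group, {2 * n_per_group} in total")
```

Under these assumptions the total is on the order of 650 cases, slightly below the 700-800 quoted; unequal group sizes or a different test would raise the requirement, which may account for the difference.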
In summary, MLST demonstrates that strain predicts mortality and severity biomarkers at both clade and individual sequence-type level. For patient monitoring, neutrophils/WBC, CRP, and albumin are the key C. difficile-associated biomarkers that are highly prognostic for short-term mortality and also partial surrogates (with the possible exception of clade 3). For surveillance, PCR ribotype 078/ST 11 (clade 5) is associated with severe CDI, and its prevalence provides an important context for hospital mortality data [25]. Lastly, our study demonstrates the power from integrating large electronic databases with molecular sequence-based typing. Using whole-genome sequencing, approximately 85% of an approximately 4.3-Mb reference C. difficile genome can be called using standard mapping [19], providing unparalleled resolution to investigate severity determinants compared with the 7.4-kb MLST sequence used here. Unexpected differences in strains appearing highly similar by MLST and in biomarker vs mortality relationships hint at the advances that pathogen whole-genome association studies will provide in our understanding of bacterial pathogenesis over the next decade.

Supplementary Data
Supplementary materials are available at Clinical Infectious Diseases online (http://cid.oxfordjournals.org/). Supplementary materials consist of data provided by the author that are published to benefit the reader. The posted materials are not copyedited. The contents of all supplementary data are the sole responsibility of the authors. Questions or messages regarding errors should be addressed to the author.