A microbiological and genomic perspective of globally collected Escherichia coli from adults hospitalized with invasive E. coli disease

Abstract Objectives Escherichia coli can cause infections in the urinary tract and in normally sterile body sites leading to invasive E. coli disease (IED), including bacteraemia and sepsis, with older populations at increased risk. We aimed to estimate the theoretical coverage rate by the ExPEC4V and 9V vaccine candidates. In addition, we aimed at better understanding the diversity of E. coli isolates, including their genetic and phenotypic antimicrobial resistance (AMR), sequence types (STs), O-serotypes and the bacterial population structure. Methods Blood and urine culture E. coli isolates (n = 304) were collected from hospitalized patients ≥60 years (n = 238) with IED during a multicentric, observational study across three continents. All isolates were tested for antimicrobial susceptibility, O-serotyped, whole-genome sequenced and bioinformatically analysed. Results A large diversity of STs and of O-serotypes were identified across all centres, with O25b-ST131, O6-ST73 and O1-ST95 being the most prevalent types. A total of 45.4% and 64.7% of all isolates were found to have an O-serotype covered by the ExPEC4V and ExPEC9V vaccine candidates, respectively. The overall frequency of MDR was 37.4% and ST131 was predominant among MDR isolates. Low in-patient genetic variability was observed in cases where multiple isolates were collected from the same patient. Conclusions Our results highlight the predominance of MDR O25b-ST131 E. coli isolates across diverse geographic areas. These findings provide further baseline data on the theoretical coverage of novel vaccines targeting E. coli associated with IED in older adults and their associated AMR levels.


Introduction
Escherichia coli is the most common cause of Gram-negative infections in humans, possessing the ability to infect extraintestinal sites such as the urinary tract as well as normally sterile body sites resulting in invasive E. coli disease (IED) that includes acute infections such as bacteraemia, severe pyelonephritis and sepsis.Although IED affects all age categories, the incidence increases by age with adults aged 60 years or older having an increased risk of developing the disease.2][3] The increase in antimicrobial resistance (AMR) among E. coli strains, such as E. coli O-serotype O25b:H4-ST131 represents a major challenge for the treatment and management of E. coli infections. 4Global morbidity and hospitalization due to IED are substantial and are expected to further increase driven by ageing populations and high overall AMR prevalence. 1 The O-antigen, a component of the surface polysaccharide is the target of the serum's opsonophagocytic killing activity that elicits an immunogenic response in humans.The O-antigen provides a protective mechanism also exploited in other polysaccharidebased bacterial vaccines 5,6 and has been since many years a promising target for vaccine exploration in E. coli.
A four-valent (ExPEC4V) 7 and nine-valent (ExPEC9V) 8 bioconjugate vaccine containing four O-antigen polysaccharides (O-serotypes O1, O2, O6 and O25) and nine O-antigen polysaccharides (O-serotypes O1, O2, O4, O6, O15, O16, O18, O25 and O75), respectively, are assessed in clinical studies towards development of a vaccine for prevention of IED. 9 To estimate the O-serotype distribution of extra-intestinal pathogenic E. coli in patients with IED aged ≥60 years, a multicentre, prospective, observational study EXPECT-2 (NCT04117113) was conducted.Clinical results within this study reported by Doua et al. 10 found that out of patients with IED aged ≥60 years in eight participating hospitals from seven countries, 80.4% had bacteraemic IED and half of infections was community-acquired, with the urinary tract the most common source of infection.Sepsis and septic shock were reported in 72.1% and 10.0% of patients, respectively.Kidney dysfunction was the most common complication (12.9%) and overall in-hospital mortality was 4.6%.
This study is part of the EXPECT-2 research project, complementing the clinical results reported by Doua et al. 10 with a microbiological analysis of the available isolates from 238 patients with IED.The major objective was prediction of the theoretical coverage of vaccine candidates ExPEC4V and ExPEC9V based on E. coli isolates obtained from hospitalized patients aged ≥60 years with a diagnosis of IED.The O-serotype, ST, AMR profile and genomic diversity of the isolates were obtained.

Ethics
This study was conducted in accordance with the Declaration of Helsinki, Good Clinical Practices and applicable regulatory requirements.In countries where no waiver was obtained, participants or their legally acceptable representatives provided their written consent to participate in the study after having been informed about the nature/purpose of the study, and the participation and termination conditions.The protocol, Informed Consent Form, and other relevant documents were approved by an Institutional Review Board/Independent Ethics Committee (IRB/IEC) before the study was initiated.

Trial design and patient inclusion
EXPECT-2 (NCT04117113) was a prospective, observational, multicentre, hospital-based study conducted in eight participating hospitals, including two sites in Japan and one site each in the USA, Canada, France, Germany, Italy and Spain.The study took place between October 2019 and January 2021.The diagnosis of IED was determined by the microbiological confirmation of E. coli from blood, urine or an otherwise sterile body site in the presence of requisite criteria of systemic inflammatory response syndrome (SIRS), SOFA or quick SOFA (qSOFA). 1 Patients were eligible for data collection if they were aged ≥60 years and hospitalized with a clinical diagnosis of IED.For countries where no waiver for informed consent had been obtained before data collection, a signed participation agreement/ICF/IAF allowing data collection and source data verification was obtained in accordance with local requirements.
All isolates from this E. coli dataset were stored in microbanks at −80°C at the local sites until shipment to the central laboratory at the University of Antwerp.

Bacterial identification
Control of viability of the stored isolates was performed by culture at the central laboratory.Identification at species level was confirmed by MALDI-TOF MS (Bruker Daltonics, Bremen, Germany).

Antimicrobial susceptibility testing (AST)
The MIC of 24 antimicrobial agents (Table S2, available as Supplementary data at JAC Online), covering 10 different drug classes, were determined for the full dataset by microdilution method using a commercial panel (Sensititre Gram-negative GN6F AST plate, ThermoFisher Scientific).Reading of MIC values was performed with a semi-automated digital MIC viewing system (Vizion ™ , ThermoFisher) and results were interpreted according to the 2023 (v.13.0)EUCAST criteria. 11MDR was defined as nonsusceptibility to at least one agent in three or more antimicrobial categories. 12

Determination of O-serotypes
The O-serotype was determined as a combination of in silico O-genotyping and agglutination O-serotyping.O-genotyping was conducted based on WGS with identification of unique O-serotype specific sequences of wzy, wzx, wzt and wzm genes following guidelines and using reference sequences 13 and were determined with the O-serotyper v.0.1 (Janssen Vaccines & Prevention, Leiden, the Netherlands). 14In case of negative O-genotyping results or in instances where the in silico result corresponded to an O-serotype belonging to one of the nine following serotypes included in the EXPEC9V vaccine candidate: O1, O2, O4, O6, O15, O16, O18, O25, O75, an O-serotyping agglutination assay (SSI Diagnostica, Hillerød, Denmark) was performed to confirm the O-genotyping result.

Whole-genome sequencing
For Illumina short-read sequencing, genomic DNA was isolated from all isolates using the Lucigen MasterPure Complete DNA and RNA Purification kit, according to the manufacturer's protocol.Sample and library preparation was done using Nextera XT kits (Illumina Inc, San Diego, CA, USA).Libraries per isolate were pooled and adjusted to equal molar quantities.Sequencing was done using MiSeq v.2, 500 cycle, PE 2 × 250 bp (Illumina Inc., USA).Raw sequencing data were quality-assessed using FastQC, cleaned using Trimmomatic v.0.4.2 with default parameters for adapter removal and quality trimming.Contamination (<5%) and completeness (>95%) were confirmed with CheckM v.1.1.2(taxonomy_wf species 'Escherichia coli').
For PacBio long-read sequencing, genomic DNA was isolated from two isolates collected from the same patient (EXP200133 and EXP200135) and selected for long-read sequencing due to major differences in AMR genes and AST, using the Qiagen ® MagAttract ® HMW kit (Qiagen) according to the manufacturer's protocol.Isolated DNA was sheared using Covaris G-tubes to obtain fragments with size distributions around 8-12 kb.Barcoded SMRT bell ™ libraries were prepared with the SMRTbell Template kit.Sequencing was performed using the PacBio Sequel (Pacific Biosciences, CA, USA) with 2 h pre-extension and 10 h movie Invasive E. coli: microbiological/genomic aspects time in V3 SMRTCell (1M).PacBio sequencing data were processed, demultiplexed and assembled using SMRTLink v.10.2.

Availability of WGS data
All sequence data from this study have been deposited under BioProject ID PRJNA1021534.

MLST typing, identification of phylogroups and fimbria typing
In silico MLST based on seven conserved housekeeping genes and assignment to allelic numbers and STs was performed using the Achtman MLST locus/sequence definitions database. 17In silico ClermonTyping method and its associated web-interface, the ClermonTyper v.1.4.0 was used for E. coli phylogroup determination. 18,19In silico Fimbria Typing was done using FimTyper v.1.1. 20

Phylogenetic analysis
Parsnp v.1.7.4 21 was used to create a core genome alignment of the nucleotide sequences from the unique isolate collection.SNPs within the alignment were filtered with Gubbins v.2.4.1 22 and used to generate a phylogenetic tree with RAxML v.8.2.12, 23 using the GTRGAMMA nucleotide substitution model.To obtain statistical support for the phylogeny, 1000 bootstraps were performed.The phylogenetic tree and isolate metadata were visualized with iTOL. 24

IED patients and collection of E. coli isolates
In the study, 304 E. coli isolates were included from a total of 238 unique IED patients, with 195 isolates from blood (64.1%), 106 from urine (35.0%) and three from other sterile body sites (0.9%) (Table S1).A subset of the isolates, comprising one representative E. coli isolate per IED patient, hereafter referred to as the unique isolate collection, included a total of 238 individual isolates with 193 isolated from blood and 45 from urine (Figure 1).For the dedicated analysis of paired isolates, another sub-selection was made consisting of multiple isolates per patient (n = 128 from 62 patients), hereafter called the paired isolate collection.
The E. coli population is genetically diverse, with the different STs presenting separate branches across the overall phylogeny (Figure 3).The dominant STs were found to be globally dispersed and there is no country-specific signature of particular STs.

Prevalence of antimicrobial resistance
Across the unique isolate collection, the highest resistance rates were reported for trimethoprim/sulfonamides (cotrimoxazole) and tetracyclines (31%, each).Substantial rates of resistance were also observed for fluoroquinolones (27.7%), third-generation cephalosporins (17.6%) and aminoglycosides (16.0%).No isolates were resistant to colistin or to carbapenems.At a geographic level, the highest proportion of resistance to third-generation cephalosporins, fluoroquinolones and aminoglycosides was observed in isolates from North America and Japan (Table 1).Eighty-nine isolates were characterized as MDR (37.4%), which were found at all eight participating sites (Table S2).
Forty of the 89 MDR E. coli belong to ST131 (n = 39) or related ST2279 (n = 1, EXP201013).These isolates are also phylogenetically related and all part of the clonal complex (CC) 131 (Figure 3).One MDR isolate from a blood specimen of a patient in Japan belonged to ST2179 (EXP200661), phylogroup B1 and was phylogenetically unrelated to the other MDR isolates.It carried a bla CTX-M-65 gene (an allelic variant belonging to CTX-M group 9) as well as a large variety of other AMR genes (16 genes, nine classes of antimicrobial agents).The ST2179 lineage has been recently reported from E. coli recovered from horses in Brazil, 25 retail meat in Portugal as well as pigs and pet animals in Switzerland.While CTX-M-65-producing ST2179 E. coli had already been reported to occur in humans as commensal organisms, to our knowledge it had not yet been reported in the setting of an invasive bacterial infection in humans.
With regards to ESBL genes, apart from this particular ST2179 strain, all but one E. coli isolates of ST131/ST2279 carried a bla CTX-M-15 gene in association with bla OXA-1 , aac(6')-Ib-cr, catB3 as well as three mutations in the QRDR domain of gyrA/parC (GyrA S83L, GyrA D87N and ParC S80I) that are typically known to be associated with resistance to fluoroquinolones in E. coli.One single ST131 isolate harboured bla CTX-M-14 .
Out of the 89 E. coli isolates that displayed a MDR pattern, 48 (53.9%) were covered by EXPEC4V and 59 (66.3%) were covered by EXPEC9V.By far the most prevalent serotype associated with MDR E. coli was O25b always in association with ST131 that accounted for 40% of all MDR isolates (Tables S4 and S5).

Analysis of paired isolates collected from blood and urine
From 62 patients, more than one isolate (n = 128/304) was collected, which we jointly called the paired isolate collection.Most of these isolates (113 isolates from 55 patients) harbour identical ST, O-serotype, AMR and AST (Table S3).For 37 isolate pairs (59.7%),SNP diversity was limited to ≤10 SNPs.Within-host diversity exceeded 10 SNPs in 25 pairs.One patient carried two O25b-ST131 isolates (EXP200133 and EXP200135) originating from a urine and blood sample, collected on the same day.In the blood isolate, six resistance genes (aac(3)-IIa, bla OXA-1 , bla CTX-M-15 , catB3, dfrA17, aac(6')-Ib-cr) were detected that were not found in the urine isolate.The blood isolate also showed phenotypic resistance to nine additional antibiotics (ampicillin, ampicillin-sulbactam, aztreonam, cefazolin, cefepime, ceftazidime, ceftriaxone, gentamicin, tobramycin) as compared to the urine isolate.Long-read sequencing of both isolates from this patient revealed a plasmid carrying these resistance genes in-between transposases (Tn3 family transposase and Invasive E. coli: microbiological/genomic aspects IS6 family transposase).The full plasmid sequence (GenBank accession no.OR611935) was resolved and found to be 116 336 bp in length (Figure S1).

Discussion
Globally, the increasing burden of invasive disease due to E. coli and the continued emergence of AMR among E. coli strains is of great concern.There is a knowledge gap regarding the global distribution of E. coli ST and O-serotypes associated with IED (including sepsis and bacteraemia), with the exception of two specific studies in the UK 28 and France 29 describing local findings focused on bacteraemia, as well as a collection of retrospective surveillance studies describing global O-serotype epidemiology between 2011 and 2017 30 and genomic data from human commensal E. coli in France between 1980 and 2010. 31We present a recent, global prospective study on E. coli isolates from hospitalized IED patients combining O-typing results with phylogeny, and with a detailed comparison of AMR genotypes as determined with WGS and phenotypes with AST.
E. coli strains belonging to O25b-ST131 present a growing challenge for clinicians due to their frequent association with Arconada Nuin et al.
MDR, their increasing prevalence worldwide and ability to cause localized outbreaks of IED. 30 The proportion of MDR isolates of 37% in our study of which 40% are O25, is comparable to the one reported in Oxfordshire, UK, where 44% (1434 of 3278) of the invasive E. coli isolates were found to be MDR.However, this proportion is much higher than reported by Weerdenburg et al. 30 in a larger global study (with more than 3200 invasive E. coli), where approximately 10% of the isolates were classified as MDR and 60% of MDR isolates were O25.These differences might reflect regional differences, variability in MDR definition and/or antibiotic classes that were tested.Epidemiological data about extra-intestinal E. coli in sub-Saharan Africa is still relatively scarce, but studies in Malawi, 32,33 Tanzania, 34 Uganda, 35 South Africa 36 and the Democratic Republic of Congo 37 show a highly diverse E. coli population ST131 being the predominant ST.
The increasing levels of AMR among Gram-negative bacteria, including E. coli, is a global threat and is one of the main drivers for prophylactic approaches such as vaccine development.A nine-valent vaccine (EXPEC9V) for the prevention of IED is currently being tested in a pivotal vaccine efficacy trial (E.mbrace study; ClinicalTrials.govIdentifier NCT04899336).Our data support that ExPEC9V, as well as the earlier tested EXPEC4V, candidate vaccines cover a substantial number of the E. coli strains collected in this study, with a predicted vaccine coverage of approximately 45% for EXPEC4V and 65% for EXPEC9V.This finding is in line with previous findings from patients of all ages in the UK (46% and 72%; 2008-2018) 28 and results from patients aged >60 years across Europe, North America, Asia-Pacific and South America (47% and 68%; 2011-2017). 30In commensal E. coli isolates from healthy individuals in France, the four O-serotypes covered by EXPEC4V represented nearly one-quarter (24%; 1980-2010) of the isolate collection. 31Previous implementation of the serotype-based pneumococcal vaccine led to an increased incidence of the serotypes that were not targeted by the vaccine. 38,39Pneumococcal vaccines induce immunity that impacts Invasive E. coli: microbiological/genomic aspects pneumococcal nasopharyngeal carriage, thereby applying a selective pressure to switch serotypes to escape the immune response.Unlike Streptococcus pneumoniae, E. coli normally inhabits the intestinal tract, representing a distinct immunological compartment.It remains unknown whether ExPEC4V and ExPEC9V apply the same selective pressure in the gut as pneumococcal vaccines do in the nasopharynx.Stool sample analysis including O-serotype prevalence and general metagenomic profiling revealed no indication of vaccine impact on E. coli prevalence, countering the notion of vaccine-induced serotype replacement. 40Close monitoring via surveillance and additional microbiome studies will be crucial to assess the risk of serotype replacement.Altogether, our findings support the further development and theoretical coverage of O-antigen based vaccines targeting E. coli.However, ongoing and future studies will need to provide evidence whether such vaccines show protective efficacy against the vaccine serotypes and assess the cost-benefit ratio of a vaccine-based prophylactic approach.
One study strength is that in a subset of patients more than one E. coli isolate was obtained from different body sites, allowing for a comparison of paired isolates from the same patient and thus assessment of the within-host diversity of the E. coli isolates both at the genomic and phenotypic levels.These data allowed us to confirm that in most of the cases the collected E. coli strains from blood and urine were genetically identical.However, in a small number of cases (four patients) different strains were observed, indicating a co-infection of multiple strains and highlighting the potential impact of within-host diversity.Alternatively, faecal contamination of the urine sample could lead to detection of an isolate that was not responsible for IED.One blood isolate harboured a plasmid with six resistance genes.Loss of this plasmid in the urine isolate compared to the blood isolate could have occurred during re-isolation and/or storage of the isolates.
Our study is part of the EXPECT-2 research project (ClinicalTrials.gov #NCT04117113) and complements the paper by Doua et al. 10 in which the clinical presentation, characterization of IED by infection acquisition setting and outcome was reported in detail.Overall, the proportion of patients with community-acquired (50%) and healthcare-associated (30%) IED reflects the proportion reported in other similar studies carried out in elderly adult patients in different countries and continents. 41,42ne of the limitations of our study is the potential bias of this collection by including only one enrolling site per country, except for Japan, which might not be representative of the distribution of E. coli isolates at the country level.In addition, the participating centres used their own criteria and protocols for obtaining blood culture collection, sample transport and storage of isolates.Further, a selection bias at the patient level cannot be excluded as the proportion of community-versus healthcare-acquired infection, comorbidities and severity of infection at time of enrolment could possibly have varied at the different sites.Moreover, this study was a prospective study of relatively short duration (between 2019 and 2021).][45] In conclusion, the E. coli isolates causing IED are diverse, with the most frequent being O25b-ST131, a strain with high rates of MDR.Our data showing the distribution of the phylogeny, STs,

Figure 1 .
Figure 1.Overview of isolate collections used for analyses.The full data collection contains 304 E. coli isolates.The unique isolate collection contains 238 isolates whereas the paired isolate collection comprises 128 isolates.

Figure 2 .
Figure 2. Distribution of ST (present in ≥5 isolates) and O-serotypes (present in ≥7 isolates).STs and O-serotypes present in, respectively, five or fewer and seven or fewer isolates are summarized as 'other'.(a) The proportion of STs across hospital sites.(b) The pairwise correlation of STs with Clermont phylogroups.The magnitude of the circles indicates the number of isolates following the legend.(c) The proportion of O-serotypes across hospital sites.D The pairwise correlation of O-serotypes and ST.The magnitude of the circles indicates the number of isolates following the legend.

Figure 3 .
Figure 3.The population structure of extra-intestinal pathogenic E. coli is represented by a midpoint-rooted maximum-likelihood phylogenetic tree of the unique isolate collection (238 isolates), based on 1000 bootstraps.Core genomes of the isolates in this study and the reference genome E. coli K-12 were aligned.The tree is based on 139 932 SNPs.Coloured ranges represent predominant STs found in more than five isolates.Rings 1, 2 and 3 denote the hospital site, vaccine coverage by ExPEC4V and ExPEC9V, and multidrug resistance (MDR).Rings 4, 5 and 6 show resistance to cephalosporins, fluoroquinolones and aminoglycosides.The scale bar indicates the number of SNPs per site in the core genome alignment.The tree was visualized using iTOL.
resistant defined as non-susceptibility to at least one agent in three of more antimicrobial categories.bThe degree of drug resistance was based on susceptibility to representative antibiotics in the following six classes of antimicrobial drugs: aminoglycosides, extended-spectrum penicillins (piperacillin-tazobactam) β-lactam/β-lactamase inhibitors, expanded-spectrum cephalosporins, folate synthesis inhibitor and tetracyclines.Arconada Nuin et al.O-serotypes and AMR profiles of E. coli isolates may provide additional epidemiological and microbiological information and inform the development of multivalent conjugate vaccines that target E. coli O-antigens.

Table 1 .
Prevalence of antibiotic resistance following EUCAST breakpoints