Genome Sequence of an Emerging Salmonella enterica Serovar Infantis and Genomic Comparison with Other S. Infantis Strains

Abstract
Salmonella enterica serovar Infantis (S. Infantis) is one of the dominant serovars of the bacterial pathogen S. enterica. In recent years, the number of human infections caused by S. Infantis has been increasing in many countries, and the emerging population often harbors a unique virulence-resistance megaplasmid called plasmid of emerging S. Infantis (pESI). Here, we report the complete gap-free genome sequence of the S. Infantis Israeli emerging clone and compare its chromosome and pESI sequences with other complete S. Infantis genomes. We show a conserved presence of the Salmonella pathogenicity islands 1–6, 9, 11, 12, and CS54 and a common integration of five bacteriophages in the S. Infantis chromosome. In contrast, we found variable presence of three additional chromosomally integrated phages and eight modular regions in pESI, which contribute to the genetic and phenotypic diversity (including antimicrobial resistance) of this ubiquitous foodborne pathogen.


Introduction
The abundant foodborne pathogen Salmonella enterica (S. enterica) is a Gram-negative, highly diverse bacterium that can infect and colonize a broad array of animal and human hosts. This single bacterial species comprises >2,600 antigenically distinct serovars that can be classified according to their host specificity and the disease they cause (Gal-Mor 2019).
Non-typhoidal Salmonella (NTS) serovars such as Salmonella enterica serovar Typhimurium (S. Typhimurium) or S. enterica serovar Infantis (S. Infantis) are known to have a wide host specificity and are capable of infecting various animal species, including reptiles, birds, and mammals. In immunocompetent humans, infection with NTS serovars normally provokes a self-limiting localized inflammation of the terminal ileum and colon, called gastroenteritis. The estimated annual global burden of gastroenteritis caused by NTS infections is 78.7 million cases, resulting in 59,000 deaths (Havelaar et al. 2015).
Among the >2,600 S. enterica serovars known to date, S. Infantis is one of the most prevalent worldwide. In the United States, S. Infantis ranked sixth in frequency (Crim et al. 2015), and in the European Union it ranked third in prevalence, following serovars Enteritidis and Typhimurium (ECDC 2014). Moreover, in recent years, S. Infantis has been the most frequently reported S. enterica serovar from food-producing animals (mainly from the poultry production chain) and various food products in Europe (EFSA 2017).
Recently, we demonstrated that serovar Infantis is largely associated with infections of infants younger than two years old and adheres better to host cells than serovar Typhimurium. Nevertheless, in comparison to S. Typhimurium, S. Infantis was shown to be less invasive in humans and to cause less inflammation in the colitis mouse model. These differences were attributed to lower expression levels of the Salmonella pathogenicity island (SPI) 1 genes in S. Infantis compared with S. Typhimurium (Aviv et al. 2019).
In Israel, a rapid and clonal emergence of S. Infantis was reported in 2010, and from 2008 to 2015 S. Infantis was the most dominant serovar isolated from both human and poultry sources (Gal-Mor et al. 2010; Aviv et al. 2014). Notably, the emergence of S. Infantis has also been reported in multiple countries around the world, including Germany (Hauser et al. 2012), France, Belgium (Cloeckaert et al. 2007), Hungary (Nogrady et al. 2008), Russia (Bogomazova et al. 2020), Honduras (Liebana et al. 2004), Japan (Shahada et al. 2010), and Australia (Ross and Heuzenroeder 2008), indicating that S. Infantis is a globally emerging serovar and a primary source of poultry infection and human salmonellosis. A recent study addressing the genetic structure of the global S. Infantis population has shown that S. Infantis is a polyphyletic serovar that has evolved in three separate lineages, one of which is a dominant emerging lineage (Gymoese et al. 2019).
Previously, we reported that the fast and clonal S. Infantis emergence was facilitated by lateral acquisition of a novel virulence-resistance megaplasmid, designated pESI (plasmid of emerging S. Infantis), that contributes to multidrug resistance and enhanced pathogenicity of pESI-positive strains (Gal-Mor et al. 2010; Aviv et al. 2014). We specifically showed that pESI encodes several virulence factors, including the yersiniabactin iron acquisition system, as well as the Klf and Ipf chaperone-usher fimbriae. Furthermore, this plasmid carries various mobile elements encoding antibiotic and mercury resistance genes and at least three independent toxin/antitoxin systems (MazEF/PemKI, CcdAB, and VagCD) (Aviv et al. 2014, 2016, 2017). Subsequently, genetically related pESI-like plasmids were also found in additional emergent S. Infantis strains in Spain (Iriarte et al. 2017), Switzerland (Hindermann et al. 2017), Italy (Franco et al. 2015), Hungary (Szmolka et al. 2018), Japan (Yokoyama et al. 2015), the USA (Tate et al. 2017), and Russia (Bogomazova et al. 2020). These findings indicate worldwide dissemination of S. Infantis strains harboring pESI-like megaplasmids that play an important role in the evolution and epidemiology of globally emerging S. Infantis lineages.
Here, we report the complete and gap-free genome sequence of the emerging S. Infantis clone, represented by strain 119944 isolated in Israel, and present a genomic analysis and comparison with other complete genomes of this serovar. Our results demonstrate a conserved distribution of 10 SPIs and five chromosomal prophages integrated into the genome of S. Infantis. Furthermore, we define core and variable regions in pESI and highlight the circulation of pESI-like plasmids among globally emerging S. Infantis strains.

Whole-Genome Sequencing
Genomic DNA from S. Infantis strain 119944, a representative isolate of the emergent S. Infantis population in Israel (Gal-Mor et al. 2010), was isolated using the GenElute Bacterial Genomic DNA Kit (Sigma-Aldrich). Whole-genome sequencing, performed at the Technion Genomic Center of the Israeli Institute of Technology (Haifa, Israel), generated 16 × 10^6 paired-end short reads on an Illumina Genome Analyzer IIx platform (Illumina, Inc.) and 291,510 long reads on a MinION sequencer (Oxford Nanopore Technologies). The average MinION read length was 11,538 bp and the N50 was 28,095 bp. The quality of the short Illumina and the long MinION reads (fastq files) was evaluated using FastQC (version 0.11.5) and NanoPlot, respectively.
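The mean read length and N50 quoted above are standard summary statistics for long-read data. As a minimal sketch of how they are conventionally defined (this is not the NanoPlot implementation, and the function names and toy read lengths below are hypothetical), they can be computed from a list of read lengths as follows:

```python
def n50(lengths):
    """Return the N50: the length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    if not lengths:
        raise ValueError("no read lengths given")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest read down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


def mean_length(lengths):
    """Arithmetic mean of the read lengths."""
    return sum(lengths) / len(lengths)


# Hypothetical toy data, not the reads from this study.
toy_reads = [2, 3, 4, 5, 6]
print(n50(toy_reads), mean_length(toy_reads))
```

Because a few very long reads carry a large share of the bases, the N50 of a long-read run is typically well above its mean read length, as is the case for the MinION data described here.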

Genome Assembly
The short (Illumina) and long (MinION) reads were combined for hybrid de novo assembly using the Unicycler (version 0.4.8-beta) pipeline (Wick et al. 2017a, 2017b). Unicycler employed SPAdes (version 3.13) with error correction and automatic selection of k-mer length to produce a short-read assembly graph (contigs). In the next step, Unicycler's miniasm and Racon modules were used to assemble the long reads together with the contigs. The resulting assembly was then polished with Pilon (version 1.22). The hybrid assembly of the S. Infantis 119944 genome resulted in two closed scaffolds corresponding to the chromosome (4,725,957 bp) and the pESI plasmid (285,081 bp), with a genome coverage of 895×. The complete S. Infantis 119944 chromosome (accession number CP047881) and pESI (accession number CP047882) assemblies were deposited in the NCBI database.
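Sequencing depth is conventionally estimated as the total number of sequenced bases divided by the assembly length. The sketch below applies that formula to the MinION figures reported here only; the Illumina read length is not stated in the text, so the short-read contribution is left out and the result is not expected to reproduce the 895× total. Approximating total long-read bases as read count times mean read length is an assumption of this example, and the variable names are illustrative.

```python
def mean_depth(total_bases, assembly_length):
    """Average depth of coverage: sequenced bases per assembled base."""
    return total_bases / assembly_length


# Values taken from the text.
CHROMOSOME_BP = 4_725_957
PESI_BP = 285_081
LONG_READS = 291_510
MEAN_LONG_READ_BP = 11_538

assembly_bp = CHROMOSOME_BP + PESI_BP
# Assumption: total long-read bases ~ number of reads x mean read length.
long_read_bases = LONG_READS * MEAN_LONG_READ_BP
long_read_depth = mean_depth(long_read_bases, assembly_bp)

print(f"{assembly_bp:,} bp assembled; long-read depth ~{long_read_depth:.0f}x")
```

With a short-read length supplied, the same function could be called on the Illumina data and the two depths summed to approximate the overall coverage.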

Results and Discussion
To advance the understanding of the global epidemiology and genomics of S. Infantis, we applied a hybrid assembly that combined short reads from Illumina sequencing with long reads from the MinION platform. This approach allowed us to determine a complete gap-free genome sequence of S. Infantis isolate 119944 at 895× coverage. The complete genome of S. Infantis 119944 has a 53.2% GC content and is composed of one circular 4,725,957 bp chromosome and a 285,081 bp plasmid, which we previously named pESI. Using the NCBI prokaryotic genome annotation pipeline (PGAP), we found that the S. Infantis 119944 genome encodes 4,853 genes, 4,612 proteins, and 84 tRNA genes, and harbors 122 pseudogenes.
Fig. 1B. The DNA sequences of plasmids from six S. Infantis strains that were found to harbor megaplasmids (pFSIS1502916, pFARPER-219, pFSIS1502169, pN55391, pCVM44454, and pCFSAN003307) were compared to the sequence of pESI in 119944 using BRIG. Eight variable regions (R1-R8) that were found to be uniquely present in pESI in 119944 are indicated by the gray boxes.
To account for conserved and unique regions in the 119944 genome, the chromosome and the pESI plasmid were compared with eight complete S. Infantis genomes available in the NCBI database. Supplementary table 1, Supplementary Material online shows the main features and relevant metadata of the compared S. Infantis genomes. The genome size of these nine completely sequenced strains varied between 4,630,342 bp (strain NCTC6703) and 5,089,781 bp (FARPER-219), and the strains harbored between zero and two plasmids.
The distribution of SPIs was constant across S. Infantis 119944 and the other compared genomes, and all of them were found to carry intact SPI-1 to SPI-6, SPI-9, SPI-12, and CS54 (fig. 1A and table 1). In addition, all of these S. Infantis genomes harbor a 9 kb shorter version of SPI-11, instead of the 15.7 kb island known in S. Choleraesuis SC-B67 (Jacobsen et al. 2011). Nonetheless, all of the SPI-11-associated virulence genes are present in the S. Infantis SPI-11, including pagC, pagD, msgA, envF, and the T3SS effector gene sopF.
The Salmonella enterica serovar Infantis 119944 genome was found to possess eight bacteriophages in its chromosome (fig. 1A and table 1). Five of these (Burkholderia cenocepacia phage BcepMu; Salmonella phage 103203_sal5; Cronobacter phage vB_CsaM_GAP32; Gifsy-1; and Enterobacteria phage P4) were present in all of the compared S. Infantis genomes. In contrast, a 59.2 kb Enterobacteria SfV phage (accession number NC_003444.1, spanning positions 529849-589113) was found only in the 119944 genome. Similarly, the 21.7 kb Escherichia phage pro483 (NC_028943, covering positions 6016-27791) and the 48.4 kb Salmonella phage vB_SosS_Oslo (NC_001609, integrated between positions 3002249 and 3050660) were found in only a subset of the S. Infantis genomes. These results show that while the distribution of SPIs is conserved among the tested S. Infantis genomes, the bacteriophage repertoire is diverse and contributes significantly to the genetic diversification of S. Infantis strains.
Next, we compared the genetic similarity between the corresponding S. Infantis plasmids. Six of the eight compared S. Infantis strains were found to harbor megaplasmids ranging in size from 178.2 to 316.1 kb, whereas two S. Infantis strains (1326/28 and NCTC6703) did not carry any plasmid. Among the six S. Infantis megaplasmids found, only five were actually pESI-related (fig. 1B), and these were carried by S. Infantis strains isolated between 2014 and 2017. These results are consistent with the notion that pESI plasmids are associated with emerging (recent) S. Infantis strains and thus far have not been identified in older isolates (Aviv et al. 2014).
Despite very high sequence similarity between pESI-related plasmids isolated from geographically different regions, the pESI of strain 119944 was found to contain eight unique regions ranging in size between 166 and 3,079 bp (indicated as unique regions R1-R8 in fig. 1B), which were not present in any of the other pESI-like plasmids included in this cohort. These are modular regions comprising insertion sequence elements, transposases, or hypothetical proteins found in various plasmids. Interestingly, instead of region 8 (R8), all other pESI-like plasmids contain a different mobile element (possibly a transposon that carries transposases and IS6-like insertion sequences) encoding an arsenic resistance gene cluster as well as the blaCTX-M-65 gene (coding for an extended-spectrum β-lactamase), both of which are lacking in the 119944 pESI. Another recent study (Gymoese et al. 2019) that used incomplete genomes of 105 S. Infantis isolates identified 16 strains harboring conserved pESI-like plasmids of ~280-283 kb. None of these plasmids contains the blaCTX-M-1 or blaCTX-M-65 genes reported in pESI-like ESBL-positive plasmids (Franco et al. 2015; Hindermann et al. 2017; Tate et al. 2017). These differences highlight the modular nature of pESI and its genetic plasticity, facilitated by its ability to "mix and match" mobile genetic elements and integrate them into a conserved pESI backbone. Moreover, because the ESBL-positive pESI-like plasmids were isolated from S. Infantis strains in the USA (Tate et al. 2017), Peru (Vallejos-Sanchez et al. 2019), Switzerland (Hindermann et al. 2017), and Italy (Franco et al. 2015), it is very likely that these pESI derivatives are globally disseminated.
In summary, we applied a state-of-the-art hybrid assembly approach and determined a gap-free complete sequence of an S. Infantis emerging clone that harbors the virulence-resistance megaplasmid pESI. By a conservative genomic comparison with other complete S. Infantis genomes, we defined a core presence of ten SPIs and five prophages and identified conserved and variable regions in the pESI plasmid. We showed that the genetic and phenotypic diversity (especially antimicrobial resistance) of emerging S. Infantis strains is shaped by a varying repertoire of chromosomal prophages and by the integration of different mobile genetic elements into a conserved pESI backbone.
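The idea of a region being "unique" to the 119944 pESI can be illustrated with simple interval arithmetic: pool the reference coordinates matched by every other plasmid, then report the reference stretches that no plasmid covers. The sketch below is only an illustration of that logic, not the BRIG/BLAST procedure used for fig. 1B; the function name, the `min_len` cutoff, and the toy coordinates are all hypothetical.

```python
def uncovered_regions(ref_length, hits, min_len=1):
    """Return reference intervals (1-based, inclusive) not covered by any hit.

    hits: (start, end) intervals on the reference, pooled from all
          query plasmids; intervals may overlap and be in any order.
    min_len: discard gaps shorter than this many bases.
    """
    gaps = []
    next_uncovered = 1  # first reference position not yet covered
    for start, end in sorted(hits):
        if start > next_uncovered:
            gaps.append((next_uncovered, start - 1))
        next_uncovered = max(next_uncovered, end + 1)
    if next_uncovered <= ref_length:
        gaps.append((next_uncovered, ref_length))
    return [(s, e) for s, e in gaps if e - s + 1 >= min_len]


# Hypothetical toy example: a 100 bp reference and three pooled hits.
toy_hits = [(60, 90), (1, 20), (15, 40)]
print(uncovered_regions(100, toy_hits))
print(uncovered_regions(100, toy_hits, min_len=15))
```

Pooling the hits before looking for gaps matters: a stretch counts as unique only if it is absent from every compared plasmid, which matches how R1-R8 are described above.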

Supplementary Material
Supplementary data are available at Genome Biology and Evolution online.