Dropen Sheka, Nikolay Alabi, Paul M K Gordon, Oxford nanopore sequencing in clinical microbiology and infection diagnostics, Briefings in Bioinformatics, Volume 22, Issue 5, September 2021, bbaa403, https://doi.org/10.1093/bib/bbaa403
Abstract
Extended turnaround times and large economic costs hinder the usage of currently applied screening methods for bacterial pathogen identification (ID) and antimicrobial susceptibility testing. This review provides an overview of current detection methods and their usage in a clinical setting. Issues of timeliness and cost could soon be circumvented, however, with the emergence of detection methods involving single molecule sequencing technology. In the context of bringing diagnostics closer to the point of care, we examine the current state of Oxford Nanopore Technologies (ONT) products and their interaction with third-party software/databases to assess their capabilities for ID and antimicrobial resistance (AMR) prediction. We outline and discuss a potential diagnostic workflow, enumerating (1) rapid sample prep kits, (2) ONT hardware/software and (3) third-party software and databases to improve the cost, accuracy and turnaround times for ID and AMR. Multiple studies across a range of infection types support that the speed and accuracy of ONT sequencing is now such that established ID and AMR prediction tools can be used on its outputs, and so it can be harnessed for near real time, close to the point-of-care diagnostics in common clinical circumstances.
Introduction
Current state of practice
A number of reasons portend the need for improvement in the accuracy, sensitivity and speed of diagnosing microbial infections in order to improve clinical outcomes. Primarily, a speedy diagnosis is needed when treating bacterial infections, where time is of the essence for positive patient outcomes. For instance, every hour of delay in appropriate treatment of bloodstream infection-related sepsis decreases survival by 7.6% [1]. Similarly, the long wait times and inaccuracies of tests for infections such as tuberculosis and Lyme disease often leave patients to endure their symptoms for extended periods of time [2, 3]. Prompt recognition is critical to starting a treatment plan sooner. A consequence of misdiagnosis is the prescription of antibiotics that are ultimately wasted and cause adverse side effects for patients. The economic impact of misdiagnosing bacterial infections has been studied in the context of cellulitis [4]. That study found that, for a single institution, misdiagnosis of cellulitis costs $195 million to $515 million annually in avoidable health-care spending, exclusive of the costs of the antibiotics themselves and the complications resulting from inappropriate treatment [4]. Another study evaluated the accuracy of diagnosis and the appropriateness of prescribed antimicrobial courses at the Minneapolis Medical Center [5]. It determined that 42% of patients were incorrectly diagnosed, of which 95% were prescribed inappropriate antibiotics. Inaccurate diagnosis can thus lead to potentially harmful antibiotic use. A recent case study of misdiagnosed melioidosis reported two patients who had to be treated with multiple courses of antibiotics as a result of inaccurate identification (ID) of the infecting bacteria [6]. Subsequently, one of the patients developed drug-induced hepatitis from an antitubercular medication prescribed after the misdiagnosis, adding to their existing symptoms.
The appropriateness of treatment a patient receives has a profound impact on their quality of life, and thus an accurate diagnosis is necessary.
Technique | Summary | ID capability | Time of fastest technique available for AST (h) | Associated cost | Automatic (A) or manual (M) |
---|---|---|---|---|---|
Culture-based techniques | Bacteria inoculated on agar plates with different concentrations of antibiotics | Limited to metabolic activity of pathogen on substrate | 4.5–18 (MicroScan WalkAway) | Very expensive, exact figures are NA [9] | M |
Molecular-based methods | Utilizes conventional PCR or quantitative real-time PCR to amplify specific sequences of nucleic acids, allowing simultaneous pathogen ID and AST | Limited to recognition of specific predetermined sequences of AMR genes | 1.0 (BioFire) | >$49 000 for BioFire assay; >$100 per test [107] | A |
Spectrometry MALDI-TOF MS | MALDI-TOF MS is based on the rapid ionization of the bacteria/yeast ribosomal proteins using a laser pulse. The calculated mass of the ions is the specific sample fingerprint of the bacterial/yeast species | Yes, but protein profile of each species must be predetermined; poor for similar species | <5.0 (Bruker Daltonics) | $0.50/sample and $150 000 for machinery [108] | A |
Spectrometry approaches combined with molecular tools | The PCR/electrospray-mass spectrometry (PCR/ESI-MS) is a fairly recent technology that couples a molecular method to a spectrometry approach | Yes, protein profile of each species must be predetermined | <6.0 (Ibis Biosciences) | $50–100 per sample and $450 000 for machinery [109] | A |
Review of current techniques
Table 1 provides a summarized review of the most common bacterial infection diagnostic techniques. Current standard diagnostic techniques are typically based on serology and involve extensive multistep processes to detect microbial growth, followed by the isolation and ID of samples and then antimicrobial susceptibility testing (AST), which makes blood cultures laborious and time consuming [7, 8]. Additionally, the accuracy and sensitivity of the microbial profiles are poor [9]. However, their automation and high degree of standardization across the field mean these techniques remain valuable [10].
The main advantage of most molecular-based methods over culture-based methods in diagnostics is the ability to perform tests without culture—saving significant time and labor—directly on the patient samples. Most of the contemporary technologies require a deoxyribonucleic acid (DNA) extraction step in which low volumes of clinical sample lead to lower volumes of DNA. This can lead to false-negative results, masking an infection and minimizing the clinical relevance of such technologies [11].
Molecular techniques can detect resistance traits in the pathogens, providing valuable information on the type of antimicrobial treatment to prescribe as well as on the identity of the pathogen [9]. A polymerase chain reaction (PCR) step is required to identify the resistance genes. However, this step requires previous knowledge of the sequences to amplify, and the presence of a resistance gene does not always correspond with the phenotypic resistance [12].
Relatively recent spectrometry methods offer alternative technologies for AST. These include matrix-assisted laser desorption/ionization time of flight mass spectrometry (MALDI-TOF MS). MALDI-TOF MS is useful because, unlike other methods, it does not require a unique test for each organism and can be used for gram-positive and gram-negative bacteria as well as yeast [13, 14]. It has difficulty distinguishing between species with similar ribosomal protein sequences, although a study by Fernández-Álvarez et al. [15] differentiated closely related Flavobacterium species (Flavobacterium flevense, Flavobacterium succinicans, Flavobacterium columnare, Flavobacterium branchiophilum and Flavobacterium johnsoniae) from Flavobacterium psychrophilum using ribosomal protein biomarkers and MALDI-TOF MS technology. In cases where ID is infeasible, supplementing with classical approaches is required. Other challenges for this technique include the high amounts of host protein found in most clinical samples and sample preparation that is time consuming, though still marginally faster than culture. This method also tends to under-evaluate infections with low bacterial counts, because of the low protein number, and can lead to false-negatives [9]. While the machinery for this technique is often very expensive, each test is inexpensive.
Spectrometry approaches combined with molecular tools, namely the IRIDICA BAC Bacterial Sepsis Infection (BSI) assay, overcome weaknesses in the analysis of complex samples and in performing culture-independent analysis [16]. As with the MALDI-TOF MS approach, this method requires expensive machinery, but the cost for each test is minimal. A point of significance is the rapid turnaround time, with results in under 6 h, plus the sample transport logistics time required for centralized testing.
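The trade-off between capital cost and per-test cost described above can be made concrete with a small amortization calculation. The sketch below is illustrative only: the capital and per-sample figures are the ones listed in Table 1, while the function name and the test volume of 10 000 are assumptions introduced here, not values from the cited sources.

```python
def amortized_cost_per_test(capital_cost, consumable_cost, n_tests):
    """Spread the instrument price over n_tests and add the per-test consumable cost."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return capital_cost / n_tests + consumable_cost


# Figures from Table 1; the test volume is a hypothetical assumption.
ASSUMED_TESTS = 10_000

maldi_tof = amortized_cost_per_test(150_000, 0.50, ASSUMED_TESTS)
# Lower bound of the $50-100 per-sample range for PCR/ESI-MS.
pcr_esi_ms = amortized_cost_per_test(450_000, 50.0, ASSUMED_TESTS)

print(f"MALDI-TOF MS: ${maldi_tof:.2f} per test")
print(f"PCR/ESI-MS:   ${pcr_esi_ms:.2f} per test")
```

Under this assumed volume the instrument price dominates the MALDI-TOF MS figure, which is why throughput matters when comparing platforms on cost.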
Review of emerging techniques
In contrast to most current AST methods, technologies such as the oCelloScope—an image-based assay—can be used to perform AST in real time without the need to attach bacterial cells to an inert surface [9, 17]. Other technologies such as the Accelerate Pheno System can perform both ID and AST with high resolution, allowing the diagnosis of both mono- and poly-microbial infections. However, this technique still requires time-consuming blood cultures before its rapid assay results can be generated. Additionally, the limited support in species for bacterial ID and expensive machinery may prevent this technology from being applicable in a real-time clinical setting.
The emerging biochemical methods share many common features, including straightforward manipulation, low production costs and rapid turnaround time. Despite these favorable features, these methods mainly focus on AST, with very limited capability for pathogen species ID. However, a notable biochemical method, the electronic nose (e-nose), does detect the chemical fingerprints of volatile organic compounds (VOCs) for pathogen ID [9]. The e-nose method can be done with minimal to no invasion and has a high specificity rate with a 10-min turnaround time. A drawback of the technology is its inability to identify and quantify individual chemical species in the usually complex VOC mixtures. In terms of clinical practicality, the e-nose has a high price and is complex to operate, requiring expert operators, which most likely limits its implementation to hospital settings.
While enhancements in the accuracy of diagnostic tests for bacterial infections have improved therapies and avoided the unnecessary use of antibiotics in recent years, limitations still remain. The spectrum of pathogens and their varied clinical presentations and effects is wide; therefore, the current standards beg reexamination. Devising an effective method that can optimize timeliness and accuracy, while minimizing the cost and complexity, is still needed despite the recent advancements.
Table 2 shows that emerging trends such as imaging-based technologies and biochemical methods primarily focus on rapid AST, with limited capabilities for bacterial species/gene ID. In contrast, while e-nose technology is tailored for bacterial ID, its clinical rollout in hospital settings remains years away. An advantage of rapid bacterial antimicrobial resistance (AMR) gene ID is the accompanying information that informs a health-care practitioner of the appropriate antibiotic to prescribe to a patient, eliminating the need for AST. A growing number of continually updated databases offer drug options; examples are listed in Table 3, which includes the integrated data and analysis tools required to support biomedical research on bacterial infectious diseases [18]. Thus, a rapid ID method for bacterial infections in combination with a growing AMR knowledge base could greatly improve the clinical treatment of bacterial infections.
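As a rough illustration of how detected AMR genes could inform drug choice, the following sketch maps gene names to the drug classes they are commonly associated with. The lookup table, function names and formulary are hypothetical and were written for this example; they are not exported from CARD, ResFinder or any other database in Table 3. As noted earlier, the presence of a resistance gene does not always correspond to phenotypic resistance, so such a lookup would only be a first-pass filter.

```python
# Toy lookup table for illustration only; NOT taken from any real AMR database.
TOY_AMR_TABLE = {
    "mecA": {"beta-lactams"},
    "vanA": {"glycopeptides"},
    "blaCTX-M-15": {"cephalosporins"},
}


def predicted_resistance(detected_genes):
    """Return (sorted drug classes with predicted resistance, genes not in the table)."""
    classes = set()
    unknown = []
    for gene in detected_genes:
        if gene in TOY_AMR_TABLE:
            classes |= TOY_AMR_TABLE[gene]
        else:
            unknown.append(gene)
    return sorted(classes), unknown


def candidate_classes(formulary, detected_genes):
    """Drug classes from a (hypothetical) formulary with no predicted resistance."""
    resistant, _ = predicted_resistance(detected_genes)
    return [drug_class for drug_class in formulary if drug_class not in resistant]


formulary = ["beta-lactams", "glycopeptides", "cephalosporins"]
print(candidate_classes(formulary, ["mecA", "geneX"]))
```

Genes missing from the table are returned separately rather than silently ignored, since an unrecognized gene is not evidence of susceptibility.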
Review of emerging technologies and future trends in microbial infection diagnostic techniques
Technique | Summary | ID capability | AST capability |
---|---|---|---|
Imaging-based technologies | Usage of microscopy techniques and image-based assays that can provide pathogen ID and AST | Limited range of species and poor resolution of ID | High AST performance |
Biochemical methods | Usage of biosensors to identify and detect bacterial growth through biochemical flags from cells | Limited to the interaction of bacteriophages with bacteria | High AST performance |
Electronic nose (e-nose) | Recognizes single chemical fingerprint patterns through an array of semi-selective sensors for VOCs | Yes, but VOC fingerprint must be predetermined. High resolution | AST is not a feature of these devices |
Database | Summary | Suitable species for database [90] |
---|---|---|
PATRIC | Provides integrated data and analysis tools to support biomedical research on bacterial infectious diseases [18] | Mycobacterium tuberculosis, S. pneumoniae, S. aureus, Acinetobacter baumannii, non-serovar Typhi S. enterica |
CARD | A bioinformatic database of resistance genes and proteins along with their corresponding phenotypes [110] | Salmonella enterica serovar Typhi, Shigella sonnei, non-serovar Typhi S. enterica |
ARG-ANNOT | Detects existing and putative new AR genes in bacterial genomes. ARG-ANNOT uses a local alignment program in Bio-Edit software that allows the user to analyze sequences without a web interface [111] | NA |
ResFinder | ResFinder identifies provided AMR genes and/or chromosomal mutations in total or partial sequenced isolates of bacteria [112] | Escherichia coli, Enterococcus faecalis, E. faecium |
PlasmidFinder | The service identifies plasmids in total or partial sequenced isolates of bacteria [113] | NA |
VFDB | VFDB is a repository to store, search, retrieve and update information about virulence factors from an assortment of bacterial pathogens [114] | NA |
Mykrobe | Analyzes the entire genome of a bacterial sample and predicts which drugs the bacteria in question are resistant to, all within a short turnaround period [115] | M. tuberculosis, S. aureus |
PhyResSE | Determines strain lineage and AR of bacterial sample from NGS data [116] | M. tuberculosis |
Note. NA, not available.
DNA sequencing as a diagnostic tool
PCR-based testing has proven dominant among the molecular technologies for pathogen ID due to its speed, low complexity and cost. PCR can, however, be suboptimal in terms of resolving the strain or AMR [19], and it is sensitive to the a priori assumption of disease agents implied by the PCR panel selected [8]. Strain resolution can be improved by Sanger sequencing of the PCR products, but Sanger sequencing remains constrained by the pathogen hypotheses for which testing is performed.
Next-generation sequencing
Next-generation sequencing (NGS) is an existing methodology for bacterial ID not yet mentioned. While there are a variety of NGS technologies, here we use NGS as shorthand for second-generation sequencing technologies. NGS provides several possible applications and advantages to clinical microbiology. For instance, NGS can bypass many limitations of the current diagnostic approaches by enabling clinicians to assess for multiple pathogens during the initial diagnostic evaluation, thereby avoiding many rounds of testing to search for progressively less-common pathogens [20–22]. Furthermore, NGS has the potential to bypass the multitude of tests that are being conducted on clinical specimens [23, 24]. Using NGS to research the human microbiome in relation to health and disease has been critical for clinical applications such as fecal transplant therapy [25]. Another benefit of the NGS approach in clinical microbiology is the ability to develop new and enhanced diagnostic assays by using sequence information derived from NGS data to improve the specific DNA targets and primers used in multiplex assays [23, 26]. NGS can impact clinical microbiology in a variety of other ways, detailed in Applications of Clinical Microbial Next-Generation Sequencing: Report on an American Academy of Microbiology Colloquium, held in Washington, DC, in April 2015, but for the purposes of this review, we focus on AMR detection. A variety of high-throughput sequencing technologies are available as commercial platforms, with Illumina’s sequencing-by-synthesis as the dominant player. Its ability to rapidly sequence entire bacterial genomes and microbiomes with high resolution, and to analyze these sequences using improving bioinformatic tools, lends credibility to the idea of NGS’s clinical utility [27].
However, the lack of practical adoption of NGS in clinical microbiology to date can be attributed to cost analysis in health economics, complex workflows, the need for quality controls and interfering contamination events [28, 29]. High-throughput sequencing to guide clinical decision making could still be valuable if these downsides of NGS are addressed. Use of NGS in clinical microbiology is primarily limited to priority cases, such as outbreaks of food-borne illness [30–34] or highly pathogenic/drug-resistant diseases [35, 36], and to post hoc molecular epidemiology [29–32].
Nanopore sequencing
So-called third-generation sequencing (TGS) includes various technology platforms with the commonality that they sense the bases of individual DNA or ribonucleic acid (RNA) molecules, as opposed to NGS, where the signal is based on the amplification and synthesis of template DNA molecules. Raw TGS data are therefore more error prone, but methods to generate high-quality consensus from TGS signals approach the quality of individual NGS reads by using template molecule circularization and concatenation for repeated measurement [37, 38]. The leading platforms in the TGS space are those of Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT). PacBio is similar to ONT in terms of long-read capability, but it is closer to core sequencing facility NGS platforms in terms of turnaround time for individual reads, capital cost and physical requirements (cooling, power, etc.). While we acknowledge there are other nanopore technologies that remain in the research and development stage (e.g. Stoddart et al. [39] and solid-state nanopore technology recently reviewed in Manrao et al. [40]), henceforth we will refer to the existing commercially available nanopore technology, ONT, whose use has been documented in a clinical setting [41]. The remainder of this review will therefore focus on the tools that are common to NGS and TGS infection analysis and on specific clinic-oriented aspects unique to the existing nanopore sequencing capabilities.
Recently, a variety of studies have demonstrated nanopore sequencing’s capability for bacterial ID and AMR gene detection in a clinical setting. Table 4 shows studies that utilized nanopore sequencing technologies for bacterial ID. Notable innovations within each study include improved accuracy and shortened turnaround times. In Matsumoto and colleagues’ [42] 2019 study, a hybrid assembly using sequencers from Illumina and ONT and the development of a novel multilocus sequence typing (MLST) database provided the most accurate detection of a variety of nontuberculous mycobacteria (NTM) among all currently used clinical methods of bacterial ID. In terms of improving turnaround times from sampling to results, Leggett and colleagues’ [43] 2019 study demonstrated that an optimized workflow could shorten turnaround times for bacterial and AMR ID from infant fecal samples to less than 5 h.
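The idea that repeated measurement of one template can offset a high raw error rate can be sketched with a per-position majority vote. This is a deliberately simplified toy, not the method used by any TGS platform or by the cited consensus approaches: it assumes the repeated passes are already aligned to equal length and contain only substitution errors, and the function name and example sequences are hypothetical.

```python
from collections import Counter


def majority_consensus(passes):
    """Per-position majority vote over pre-aligned, equal-length repeated reads."""
    if not passes:
        raise ValueError("at least one pass is required")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("passes must be pre-aligned to equal length")
    consensus = []
    for column in zip(*passes):
        # most_common(1) yields the most frequent base in this column
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Five hypothetical passes over the same template, two with a single substitution error.
passes = ["ACGTAC", "ACGAAC", "ACGTAC", "TCGTAC", "ACGTAC"]
print(majority_consensus(passes))
```

Because the errors in the toy passes fall at different positions, each is outvoted by the other four reads; real consensus methods must additionally handle insertions, deletions and systematic errors that voting alone does not remove.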
Studies involving bacterial species and AMR gene ID using nanopore (ONT) technologies
Date of publication . | Study . | Description of method . | Bacterial species detected . | Reported time . | Reported nanopore accuracy . |
---|---|---|---|---|---|
23 July 2016 | Shin et al. 2016 [117] | 16S rRNA amplification and sequencing of entire mouse gut microbiota | 16 phylogenetically distinct species distributed in 13 genera | <3 h for library construction, real-time detection | 79.6% coverage |
26 September 2016 | Schmidt et al. [118] | WGS of bacterial DNA enrichment from urine samples using MolYsis and MagNA Pure Compact Nucleic Acid Isolation Kit or NEBNext Microbiome DNA Enrichment Kit | Klebsiella pneumoniae E. coli Escherichia cloacae | 4 h from sample to result | 70–85% coverage. 100% correct bacterial ID |
15 September 2017 | Kerkhof et al. 2017 [44] | 16S rRNA amplification and sequencing of soil samples mixed with bioreactor DNA | Acidovorax wautersii, Comomonas nitrativorans and Stenotrophomo-nas rhizophilia | 6 h MinION runs | 79–100% coverage |
31 October 2017 | Xia et al. 2017 [119] | WGS of coliform bacteria with hybrid assembly of nanopore and Illumina sequences. DNA from bacterial isolates were extracted through Fast DNA® Spin Kit for soil usage | Coliform bacteria | 30 h from sample to result | 84.6% accuracy in alignment with Illumina assembly |
1 November 2017 | Moon et al. 2017 [120] | 16S rRNA amplicon sequencing performed by MinION with PCR amplification of 16S rRNA genes. Full-length 16S rRNA genes were confirmed by Sanger sequencing | Campylobacter fetus | Unspecified—51 min to generate 43 044 reads | 99.9% correct alignment to Campylobacter genus, 89% and 60% correct alignment to correct species and subspecies, respectively |
18 June 2018 | Tanaka et al. 2018 [121] | WGS using nanopore, Illumina or hybrid assembly. DNA extracted with the Wizard® Genomic DNA Purification Kit and NucleoSpin® Tissue, following protocols for gram-negative bacteria | Vibrio aphrogenes, Vibrio algivorus, Vibrio casei, Vibrio litoralis and Vibrio rumoiensis | Unspecified | 100% coverage from Nanopore–MinION presents 99% average nucleotide identity (ANI) between Ponticus clade strains and 95–97% ANI between Splendidus clade strains |
19 October 2018 | Chalupowicz et al. 2018 [122] | NucleoSpin Plant II Midi Kit was used to extract DNA from leaf tissue, and the MasterPure Complete DNA Purification Kit was used to isolate total DNA from seeds, stems and fruits. The WIMP workflow was utilized for bacterial ID | Clavibacter michiganensis, Pseudomonas corrugata, Acidovorax citrulli, Pantoea agglomerans and Candidatus Phytoplasma aurantifolia | Pathogen ID within 1–2 h of nanopore sequencing | Mean coverage for bacteria was typically <1—with mean coverage = aligned bases divided by genome size |
October 2018 | Leggett et al. 2020 [123] | WGS of genomic DNA from a microbial mock community used in the Human Microbiome Project | 20 bacterial strain sequences were correctly identified | 4–5 h for pathogen and AMR detection from sample to result | 82–89% for both bacterial identity and sequence coverage |
4 December 2018 | Golparian et al. 2018 [52] | Gonococcal and clinical isolates were cultured, and the Wizard Genomic DNA Purification Kit was used for DNA isolation prior to WGS | Neisseria gonorrhoeae and the de novo assembly of gonococcal genomes | <24 h for sequencing | 95.2–98.3% coverage |
12 January 2019 | Kai et al. 2019 [124] | 16S rRNA gene amplicon nanopore sequencing—16S rRNA genes were amplified by PCR and bacterial DNA was purified by DNeasy Blood and Tissue Kit | E. coli and S. aureus | <2 h of analysis for bacterial ID | >90% correct read alignment |
18 March 2019 | McManus et al. 2019 [125] | Hybrid assemblies based on Illumina MiSeq short-read and Oxford Nanopore MinION long-read whole genome sequences of oral rinse samples. DNA was extracted from bacterial isolates using S. aureus Genotyping Kit 2 DNA microarray kit and the DNeasy Blood and Tissue kit | S. aureus | Unspecified | – |
27 March 2019 | Hamner et al. 2019 [126] | WGS of water samples after DNA isolation using PowerWater DNA isolation kit | Acidovorax and Aeromonas salmonicida among several others, including eukaryotic protists and fungi | Unspecified | 93–96% correct read alignment to reference genomes |
5 March 2019 | Lim et al. 2019 [127] | Genomic DNA from bacterial isolates was extracted using the OMEGA Bio-tek EZNA® bacterial DNA Kit. WGS by nanopore sequencing was applied to characterize AMR through functional annotative databases | Mannheimia haemolytica | <48 h for the genotyping of 12 microbial genomes | 91% correct assignment to species, 93% to correct genus and >99% to the correct family |
30 May 2019 | Kamathewatta et al. 2019 [47] | WGS of DNA from samples isolated using the Wizard Genomic DNA Purification Kit. Followed by 16S rRNA amplification | Genus level taxonomic classification to 83% and 94% of reads | <16 h for total sequence output, including human DNA | 80–90% sequence coverage |
24 June 2019 | Sakai et al. 2019 [128] | Bacterial DNA was extracted through bead beating and chemical lysis protocols. WGS was performed with the EPI2ME workflow—gram-positive bacterial ID indicated poor accuracy | ESBL-producing gram-negative bacteria | Approximately 30 min | 100% ID for gram-negative bacteria, but 60% accuracy for gram-positive strains |
9 July 2019 | Matsumoto et al. 2019 [42] | MLST-based ID in conjunction with hybrid assemblies from Illumina and nanopore sequencers. Genomic DNA from bacterial colonies were extracted through the DNeasy PowerSoil Kit | Mycobacterium avium complex, Mycobacterium kansasii and the Mycobacteroides abscessus complex | Detection of NTM within 10 min of nanopore sequencing | 16S rRNA gene sequence homology only correctly identified 11 of 175 NTM species with 99% identity—this does not include MLST scoring |
24 July 2019 | Bialasiewicz et al. 2019 [129] | WGS of blood culture sample | Capnocytophaga canimorsus | <18 h for total sequence output, including human DNA | Unspecified |
Note. ESBL: extended spectrum beta-lactamase.
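Several rows of Table 4 report coverage, and the Chalupowicz et al. row defines mean coverage as aligned bases divided by genome size. As a minimal sketch of that ratio, the Python snippet below computes it directly; the function name and the example numbers are illustrative assumptions and are not taken from any of the cited studies.

```python
def mean_coverage(aligned_bases: int, genome_size: int) -> float:
    """Mean coverage: total aligned bases divided by genome size (both in bp)."""
    if genome_size <= 0:
        raise ValueError("genome_size must be a positive number of base pairs")
    if aligned_bases < 0:
        raise ValueError("aligned_bases cannot be negative")
    return aligned_bases / genome_size


# Hypothetical example: 1.5 Mb of aligned reads against a 3.0 Mb genome
# gives a mean coverage below 1, the situation described for that row.
example = mean_coverage(1_500_000, 3_000_000)
print(f"mean coverage = {example:.2f}x")
```

A value below 1 means that, on average, each genome position was covered by less than one aligned base, which can still be enough for read-level taxonomic assignment even though it is far too shallow for assembly.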
ONT sequencing has also proven versatile, identifying bacterial species via both whole genome approaches and PCR amplification of 16S ribosomal RNA (rRNA) sequences. While many of the studies used targeted rDNA or rRNA approaches to identify species in environmental samples [44–46], others, such as Kamathewatta et al. [47], sequenced clinical samples. Metagenomic approaches to bacterial ID with ONT sequencing are becoming increasingly popular because of their rapid turnaround time and the convenience of a handheld device. Implementation in the hospital setting could markedly increase the utility of ONT in real-time point-of-care clinical bacterial pathogen diagnostics.
In terms of AMR detection, some studies in Table 4, such as Lim et al. [127], showed the potential for AMR detection either through PCR amplification of the target gene or through whole genome sequencing and ID of the gene. ONT sequencing allows more rapid ID of these genes than other NGS approaches.
ONT sequencing can overcome several challenges faced by NGS as a diagnostic tool. Generalizing and extending the workflows described in [42–47], we present and discuss the practicality of a pipeline that has the potential to streamline nanopore sequencing for clinical microbiological use. The ONT Flongle is a novel TGS instrument that uses cost-effective, real-time long-read sequencing technology to routinely generate individual reads tens of thousands of nucleotides long. Uniquely, ONT sequencers provide read-by-read data availability [48, 49], meaning there is no fixed minimum run time before interpretation of the biological sample can commence. The suitability of the Flongle for the clinical laboratory is also supported by its track record of sequencing bacterial and viral genomes to the accuracy needed for identifying AMR genes in bacteria [48, 50]. The Flongle is also an attractive option for the clinic because of its short turnaround times, low cost and small footprint. Additionally, the recently released VolTRAX V2 can help make TGS viable for microbiological applications. The VolTRAX is a small device designed to perform all of the molecular biology manipulations required to convert a raw biological sample into a form ready for analysis on a nanopore sensing device (e.g. MinION) without the need for human intervention.
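To illustrate what read-by-read availability makes possible, the following Python sketch consumes per-read taxon labels as they arrive and stops as soon as one taxon dominates. It is a toy illustration under stated assumptions, not a method from the cited studies: the function `call_when_confident`, the thresholds `min_reads` and `min_fraction`, and the simulated label stream are all hypothetical, and the per-read labels are assumed to come from some upstream classifier that is not shown.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def call_when_confident(
    read_taxa: Iterable[str],
    min_reads: int = 50,         # hypothetical threshold, not clinically validated
    min_fraction: float = 0.8,   # hypothetical threshold, not clinically validated
) -> Tuple[Optional[str], int]:
    """Consume per-read taxon labels in arrival order and stop at the first confident call.

    Returns (taxon, reads_consumed); taxon is None if the stream ends
    before any taxon satisfies both thresholds.
    """
    counts: Counter = Counter()
    total = 0
    for taxon in read_taxa:
        counts[taxon] += 1
        total += 1
        top_taxon, top_count = counts.most_common(1)[0]
        if top_count >= min_reads and top_count / total >= min_fraction:
            return top_taxon, total
    return None, total


# Simulated stream: nine reads of one organism for every unclassified read.
simulated = (["Escherichia coli"] * 9 + ["unclassified"]) * 20
taxon, reads_used = call_when_confident(simulated)
print(taxon, reads_used)
```

Because the loop returns as soon as the thresholds are met, the remaining reads in the stream are never examined, which is the practical consequence of having no fixed minimum run time.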
Figure 1 illustrates how the aforementioned ONT sequencing devices can help automate and inform treatment options in a clinical setting, using ONT’s EPI2ME software and existing NGS bioinformatic databases/tools for microbiology (PATRIC [18] and the antimicrobial combinations network (AMCN) [51]) as representative steps in a pipeline.

Figure 1. End-to-end steps for rapid, largely automated clinical infection diagnostics, with steps in italics and example hardware/tools/databases in regular font.
Patient sample collection and preparation
While the VolTRAX can be used for PCR amplification of the 16S–23S rRNA region for bacterial ID, the entire genome sequence of a potential pathogen is far more valuable for a number of reasons. Firstly, the 16S–23S rRNA region can only be used to identify the species, whereas the entire genome of a pathogen can be used both to identify it and to characterize antibiotic resistance (AR) genes [17, 42]. Furthermore, the genomic sequence of a pathogen is valuable for further needs such as molecular epidemiology [52, 53], or for tracking the evolution of the pathogen within the host if it gains AMR. For hard-to-culture bacteria such as tuberculosis-causing mycobacteria, genome-based AMR testing can be especially valuable [54].
The primary aim of this step is to isolate reliable bacterial DNA from the host (human) sample for nanopore genomic sequencing while keeping cost, labor and time consumption low. To isolate the microbiome and deplete any human DNA contaminants in a reasonable time span and with minimal complexity, the use of the VolTRAX V2 with an affordable isolation kit and workflow may be warranted (Table 5).
Isolation kit | Description | Cost (USD) | Time (h) | Preps | Cost per sample (USD) |
---|---|---|---|---|---|
PSP Spin Stool DNA Plus Kit (Stratec Biomedical) | Requiring approximately 400 mg of human or animal stool samples, the protocol has optimized bacterial DNA yield and purity found in human gut microbiota. Produces a considerably higher yield of bacterial DNA when isolated from stool than QIAamp DNA Stool Mini Kit [55] | 345 | 1 | 50 | 6.90 |
PureLink Microbiome DNA Purification Kit (ThermoFisher Scientific) | Kit utilizes PureLink Column technology with optimal recovery of highly purified DNA. Also utilizes beads in conjunction with heat and mechanical disruption to lyse a variety of microorganisms. DNA is compatible with NGS for WGS | 311 | 1.0 | 50 | 6.22 |
Optimized workflow described by Ackerman et al. [55] | The optimized pipeline described in Ackerman et al. [55] produced microbial community DNA yields greater than those of any individual isolation kit. Yields were maximized by combining modified DNA extraction methods such as lyticase/lysozyme digestion, bead beating, boil/freeze cycles, proteinase K and carrier DNA use. The downside of this methodology is its considerably longer preparation time compared with commercially available kits | +1200 | 2.5–3 | NA | NA |
QIAamp DNA Microbiome Kit | Novel protocol that aims to enzymatically degrade host/human DNA contaminants prior to bacterial cell lysis and attenuates biases attributed to variable susceptibility to lysis procedures in different bacterial species. Results in optimal representation of microbiome | 538 | 1.0 | 50 | 10.76 |
Microbiome DNA Isolation Kit (Norgen Biotek) | The process provides a simple and convenient DNA isolation protocol for a variety of microbiome samples preserved using Norgen’s Swab Collection and DNA Preservation Kit | 233 | 0.5 | 50 | 4.66 |
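The cost-per-sample column in the table above is the kit price divided by the number of preps. The short Python sketch below reproduces that arithmetic for the four commercial kits; the prices and prep counts are copied from the table, while the function name `cost_per_sample` and the shortened kit labels are illustrative choices. The Ackerman et al. workflow is omitted because the table gives no prep count for it.

```python
# (kit name, kit price in USD, number of preps), as listed in the table above.
kits = [
    ("PSP Spin Stool DNA Plus Kit", 345, 50),
    ("PureLink Microbiome DNA Purification Kit", 311, 50),
    ("QIAamp DNA Microbiome Kit", 538, 50),
    ("Microbiome DNA Isolation Kit (Norgen Biotek)", 233, 50),
]


def cost_per_sample(kit_cost_usd: float, preps: int) -> float:
    """Kit price divided by the number of preps, rounded to cents."""
    if preps <= 0:
        raise ValueError("preps must be a positive integer")
    return round(kit_cost_usd / preps, 2)


for name, cost, preps in kits:
    print(f"{name}: {cost_per_sample(cost, preps):.2f} USD per sample")
```

These figures cover the isolation kit only; they do not include flow cells, library preparation reagents or labor.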
Isolation kit . | Description . | Cost (USD) . | Time (h) . | Preps . | Cost per sample (USD) . |
---|---|---|---|---|---|
PSP Spin Stool DNA Plus Kit (Stratec Biomedical) | Requiring approximately 400 mg of human or animal stool samples, the protocol has optimized bacterial DNA yield and purity found in human gut microbiota. Produces a considerably higher yield of bacterial DNA when isolated from stool than QIAamp DNA Stool Mini Kit [55] | 345 | 1 | 50 | 6.90 |
PureLink Microbiome DNA Purification Kit (ThermoFisher Scientific) | Kit utilizes PureLink Column technology with optimal recovery of highly purified DNA. Also utilizes beads in conjunction with heat and mechanical disruption to lyse a variety of microorganisms. DNA is compatible with NGS for WGS | 311 | 1.0 | 50 | 6.22 |
Optimized workflow described by Ackerman et al. [55] | Through an optimized pipeline described in Ackerman et al. [55], isolated microbial community DNA yields were maximized greater than any individual isolation kit yield. DNA yields were maximized by combining modified DNA extraction methods such as lyticase/lysozyme digestion, bead beating, boil/freeze cycles, proteinase K and carrier DNA use. The downside to this methodology is its considerably longer preparation periods in comparison to commercially viable kits | +1200 | 2.5–3 | NA | NA |
QIAamp DNA Microbiome Kit | Novel protocol that aims to enzymatically deteriorate host/human DNA contaminants prior to bacterial cell lysis and attenuates biases attributed to variable susceptibility to lysis procedures in different bacterial species. Results in optimal representation of microbiome | 538 | 1.0 | 50 | 10.76 |
Microbiome DNA Isolation Kit (Norgen Biotek) | The process provides a simple and convenient DNA isolation protocol for a variety of microbiome samples preserved using Norgen’s Swab Collection and DNA Preservation Kit | 233 | 0.5 | 50 | 4.66 |
The VolTRAX is capable of DNA isolation and PCR amplification on bacterial cultures, with read length distributions equivalent to those obtained by performing the prep protocol manually with pipettes. Additionally, since there are several reagent ports on the device, the same sample can go through different protocols using different reagents simultaneously, so that the output with the best DNA quantity can be selected. If the DNA outputs are still very low, whole genome amplification can be applied to the resulting DNA to increase its concentration.
A review of current bacterial DNA isolation kits shows that most kits follow a general workflow that can be adapted to a variety of sample types, including blood, stool and urine. The procedure adapted to the VolTRAX is shown in Figure 1, where a variety of reagents, buffers and lysis solutions are present at each step. The VolTRAX applies a charge to the sample and reagents and moves them along a path programmed by the software. The device can also complete library preparation in tandem with the separation process, essentially automating the preparation process and reducing complexity and associated contamination risks. A plausible clinical upside of using the VolTRAX is the automation of the entire preparation process. Potential users could load the output of their DNA isolation kits into the VolTRAX with a specified protocol, and the rest of the preparation process would be completed by the machine. Depending on the selected protocol, preparation processes could be as short as approximately 1 h.
Some infections present particularly low abundances of microbial populations, and when DNA quantities are barely detectable, small variations in sample quantity or quality may cause misleading results [55]. Choosing kits that produce enriched microbiome DNA from samples, while balancing DNA yield, time and cost, is therefore essential. Table 5 presents some of the commercially available DNA isolation kits and protocols, with their costs, turnaround times and number of preps.
For example, when the costs of commercially available isolation kits are compared against purified DNA yields, the PSP Spin Stool DNA Plus Kit and the PureLink Microbiome DNA Purification Kit are both economical and clinically applicable (Table 5), and they have defined protocols for a wide variety of tissues. Although the optimized workflow described in Ackerman et al. [55] is promising, the complexity and economic burden of the preparation and its longer wait time may pose an obstacle to widespread clinical adoption. In conjunction with VolTRAX, the kits could prepare clinical samples for Oxford nanopore sequencing with minimal human intervention and in a time-efficient manner. A recent study also optimized the isolation process to eliminate 99.99% of the host nucleic acid contaminants before downstream nanopore sequencing of bacteria causing lower respiratory infection [56]. The optimized methodology of the study provides turnaround times as short as 6 h from sample collection to result, but it could be simplified and hastened if partially automated with VolTRAX [56]. Because VolTRAX can automate incubation, PCR and library preparation, it could reduce both the complexity and the duration of this novel pipeline, lending credibility to its future clinical application.
Regarding library preparation, the VolTRAX Sequencing Kit generates sequencing libraries from extracted gDNA in 5–10 min of hands-on time plus 45 min of automation. The kit is optimized for simplicity and speed rather than for maximum throughput. One limitation of this step is its cost of $50 per reaction. Ultimately, less than 2 h are required from isolated DNA to sequencing initiation. Additionally, a flow cell is required for the sequencing reaction to occur on the Flongle. Each flow cell is listed at $90 USD.
Assuming the cost for each sample extraction is approximately $10 (QIAamp DNA Microbiome Kit), an additional $50 is required for the library preparation and $90 for the flow cell. The resulting total of approximately $150 per sample is costly compared to the aforementioned techniques. However, as the VolTRAX technology gains momentum and advances are made, the cost of the library preparation kits and flow cells will likely decrease, and additional cost-reducing capabilities, such as sample multiplexing when volumes are higher, could be implemented.
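The per-sample arithmetic above can be reproduced with a short sketch. The three prices are the ones quoted in the text and Table 5. The `samples_per_flow_cell` parameter is a hypothetical illustration of multiplexing; it assumes only the flow cell is shared and ignores barcoding reagents and labor.

```python
# Per-sample consumable costs (USD) quoted in the text and Table 5.
EXTRACTION_USD = 10.76    # QIAamp DNA Microbiome Kit, cost per sample
LIBRARY_PREP_USD = 50.00  # VolTRAX Sequencing Kit, per reaction
FLOW_CELL_USD = 90.00     # Flongle flow cell


def cost_per_sample(samples_per_flow_cell: int = 1) -> float:
    """Consumable cost per sample when a flow cell is shared between samples.

    Sharing the flow cell is a hypothetical scenario: extraction and library
    preparation are still paid once per sample, barcoding costs are ignored.
    """
    if samples_per_flow_cell < 1:
        raise ValueError("samples_per_flow_cell must be at least 1")
    return EXTRACTION_USD + LIBRARY_PREP_USD + FLOW_CELL_USD / samples_per_flow_cell


single_sample = cost_per_sample(1)
four_samples = cost_per_sample(4)
print(f"one sample per flow cell:   ${single_sample:.2f}")
print(f"four samples per flow cell: ${four_samples:.2f}")
```

Under these assumptions the flow cell is the largest single cost, so it is the term that multiplexing would reduce most.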
Oxford nanopore sequencing
Through the combination of speed and thorough coverage of all microorganisms present, metagenomic sequencing-based approaches have the ability to overcome the limitations of both culture and PCR methods. NGS technologies such as Ion Torrent and Illumina are widely used for metagenomic sequencing, but they generally require the run to be complete before analysis can begin. However, LiveKraken, a recent method, enables analysis of raw Illumina data prior to the completion of sequencing [57]. The highly parallel throughput of Illumina devices makes them more economically suited to a centralized, batch-processing setup. ONT sequencing brings the advantages of built-in real-time data analysis, small or no-batch sample processing and low capital cost, which are more suited to the constraints of the clinical setting.
Nanopore MinION sequencing technology is approaching comparable performance to NGS at the consensus level, while showing advantages in terms of the size of the device, simple library preparation workflow, real-time sequencing data generation and analysis, and most importantly, long read lengths that provide higher accuracy for complex repeat and/or rearrangement structures (including plasmids) related to AMR. Its main limitation has been per-sample cost. However, Flongle is designed to be the quickest, most accessible and cost-efficient sequencing system for smaller tests and experiments, making it ideal for point-of-care clinical use. It utilizes the same core nanopore technology as MinION, GridION and PromethION, offering direct DNA or RNA analysis, simple preparation, and real-time data and analysis. The Flongle is ideal for small samples (such as a microbiome as opposed to the human genome), single runs (which remove the issues of sample cross-contamination due to multiplexing and barcoding), rapid quality checks and species ID.
This hypothesis-free approach is a form of metagenomics. While metagenomics projects with a high diversity of bacterial populations may require larger, less cost-effective sequencing platforms such as the MinION, we used the working assumption that, in an acute infection, the etiological agent is a bacterium in relatively high abundance. With this assumption, the lower throughput of Flongle will suffice for accurate detection. We realize that, depending on the type of infection, this requires a careful selection of the sampling site by the clinician [29].
Species ID
Currently, NGS as a tool for microbial ID utilizes the sequencing of the 16S–23S rRNA region, as it is unique and is found in all bacteria [58]. This method has been reported to be more accurate than other commonly used ID methods that may identify bacteria on the basis of phenotype rather than sequence [19]. NGS allows for the simultaneous ID of several pathogens; however, while this method has very high resolution, some bacterial species can be too genetically similar to differentiate [58]. This issue was recently overcome in a method devised by Sabat et al. [58]. While the NGS of the 16S–23S rRNA region is culture-independent and more discriminative at the species and genus level than popular approaches such as MALDI-TOF MS, its main limitations are cost and speed: costs per sample, not including labor, amount to approximately $75 USD, and, more importantly, turnaround times can reach 4 days, which makes the method impractical in clinical microbiology. Another considerable limitation is the lack of easily implementable and utilizable reference databases for 16S–23S rRNA sequences and complementary software for automated, easy and reliable species ID. While reference 16S–23S rRNA databases do exist, such as Greengenes, RDP and SILVA, they require software such as Kraken, QIIME and MOTHUR for use [23, 59–64]. These analyses are computationally intensive and primarily intended for card-carrying bioinformaticians comfortable with the command line as opposed to regular clinicians [23, 59].
A relatively novel pipeline that could be utilized for real-time ID of bacterial species with nanopore technology is the WIMP workflow. Its underlying Centrifuge classifier allows rapid and accurate ID of potentially pathogenic bacteria through quantification and labeling of reads [56]. The tool utilizes k-mer-based indexing methods centered around the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index to classify reads into species by referencing RefSeq in real time [56, 65]. Ultimately, the WIMP workflow classifies and identifies species in real time: as soon as a strand of DNA passes through the pore, it can be base-called and analyzed. WIMP is a part of an EPI2ME AMR pipeline (ONT) that can be used to identify bacterial, viral and human reads.
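To make the BWT and FM index idea concrete, the following is a minimal toy sketch of exact-match counting by backward search. It is not the Centrifuge or WIMP implementation: the `FMIndex` class, the naive suffix sorting and the short example reference are all illustrative assumptions, and real tools use compressed indexes built over whole reference databases.

```python
from collections import Counter


class FMIndex:
    """Toy FM index over a single reference string (illustrative only)."""

    def __init__(self, reference: str):
        text = reference + "$"  # sentinel that sorts before A, C, G and T
        # Suffix array by naive sorting: fine for a demo, too slow for genomes.
        suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
        # BWT: the character preceding each sorted suffix (wraps at i == 0).
        self.bwt = "".join(text[i - 1] for i in suffix_array)
        self.size = len(text)
        # first[c]: number of characters in the text that sort before c.
        counts = Counter(text)
        self.first, running = {}, 0
        for char in sorted(counts):
            self.first[char] = running
            running += counts[char]
        # occ[c][i]: number of occurrences of c in bwt[:i].
        self.occ = {char: [0] for char in counts}
        for bwt_char in self.bwt:
            for char, column in self.occ.items():
                column.append(column[-1] + (char == bwt_char))

    def count(self, pattern: str) -> int:
        """Count exact occurrences of pattern using backward search."""
        low, high = 0, self.size  # half-open interval of matching suffixes
        for char in reversed(pattern):
            if char not in self.first:
                return 0
            low = self.first[char] + self.occ[char][low]
            high = self.first[char] + self.occ[char][high]
            if low >= high:
                return 0
        return high - low


index = FMIndex("ACGTACGTTACG")
print(index.count("ACG"), index.count("GTT"), index.count("TTT"))
```

The appeal of this structure for read classification is that each query costs time proportional to the query length rather than to the size of the reference.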
This pipeline was used in the 2019 study by Charalampous et al. [56] to identify bacteria causing lower respiratory infections. A drawback to the usage of WIMP is its relatively high false-positive detection rate, as evidenced in the study. High misidentification rates can be attributed to the inefficacy of k-mer-based read ID methods in differentiating the highly similar genomes of closely related organisms (at the species level). False-positives attributable to poor k-mer classification could be mitigated by lightweight scripting that highlights those reads that are assigned to the same species by multiple classification programs. Since each tool has slight algorithmic variations and different underlying database indices, commonly classified reads are more likely to be true positives, with some chance of losing true positives that are called by only one tool. This combine-existing-predictions approach is widely used in bioinformatics when false-positive predictions are troublesome, for example, in de novo transcriptome assembly, gene splicing and miRNA targeting [66–68].
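The lightweight scripting described above could look like the following minimal sketch. It assumes each classifier's output has already been parsed into a mapping from read identifier to species name; the function name, read identifiers and species assignments are hypothetical example data, not output from any specific tool.

```python
def consensus_calls(*classifier_calls: dict[str, str]) -> dict[str, str]:
    """Keep only reads that every classifier assigned to the same species."""
    if not classifier_calls:
        return {}
    first = classifier_calls[0]
    # Reads that were classified by all tools.
    shared_reads = set(first)
    for calls in classifier_calls[1:]:
        shared_reads &= set(calls)
    # Of those, keep the reads on which all tools agree.
    return {
        read_id: first[read_id]
        for read_id in sorted(shared_reads)
        if all(calls[read_id] == first[read_id] for calls in classifier_calls)
    }


# Hypothetical per-read species calls from two classification programs.
tool_a = {
    "read1": "Klebsiella pneumoniae",
    "read2": "Escherichia coli",
    "read3": "Shigella flexneri",
}
tool_b = {
    "read1": "Klebsiella pneumoniae",
    "read2": "Escherichia coli",
    "read3": "Escherichia coli",
    "read4": "Staphylococcus aureus",
}

agreed = consensus_calls(tool_a, tool_b)
for read_id, species in agreed.items():
    print(read_id, species)
```

Here `read3` is dropped because the tools disagree, and `read4` is dropped because only one tool classified it, which is the trade-off between fewer false positives and some lost true positives noted above.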
Despite WIMP’s limitations, its potential for point-of-care clinical utility to identify species in real time in conjunction with Oxford Nanopore sequencing is valuable. As the program is updated, the number of bacterial reference genomes it covers will also grow. The output from the program can be filtered against a database of bacterial pathogens so that only species that are harmful to the patient are selected.
The repertoire of extant species and AMR genes that are relevant to defining the etiology of disease is both incomplete in the databases and ever expanding (e.g. emerging pathogens). This highlights both the importance of maintaining up-to-date versions of AMR gene databases (discussed below) and the opportunity for using clinical sequencing to supplement existing pathogen sentinel programs (the complexity of which is beyond the scope of this work; see [69, 70] for clinical and veterinary reviews).
AMR gene detection
The implementation of whole genome sequencing (WGS) for AMR detection in clinical microbiology has been contentious until recently. Like many other techniques used to determine AMR in bacterial strains (PCR, microarray, etc.), WGS has the ability to detect genetic factors that confer AMR, such as genes and mutations [71]. Genomics-based prediction of antimicrobial susceptibility has been shown to be up to 98% concordant with minimum inhibitory concentration (MIC) testing by dilution [72], which gives results equivalent to standard AST disk tests [73]. However, WGS has the advantage over PCR of being more efficient in bacterial AMR detection where resistance is predominantly induced by mutational factors [11]. Furthermore, the strengths of WGS include its ability to detect multiple different targets concurrently, to rapidly record novel target sequences into databases and to curate gene variants of target sequences [71]. Setbacks in the forms of economic feasibility and clinical inefficiency hinder WGS application to clinical microbiology, but newer Oxford Nanopore sequencing technologies mitigate these [71, 73].
In recent years, WGS has been carried out with complex sequencing technology that has a relatively low per-base cost, primarily machines manufactured by Illumina and to a lesser extent by Thermo Fisher (Ion Torrent). These ‘NGS’ technology platforms were a large step up from Sanger sequencing in terms of parallel data generation capacity and hence shortened turnaround times for genome-scale projects [71]. NGS machines produce relatively short sequence reads whose accuracy is comparable to that of conventional Sanger sequencing. These highly accurate but very short reads may not unambiguously resolve an AMR gene, even after assembly, especially if repeated sequences are present [71, 74], as AMR genes often reside in genomic islands containing nearly identical genetic mobility genes and insertion sequences [75]. As a consequence of the short read lengths, assemblies of bacterial genomes that are enriched for these repeats are often profoundly fragmented and usually incomplete, hindering the potential for clinical usage [48]. Nevertheless, the value of the NGS approach for AMR detection in clinical microbiology was shown by a study in Texas, where WGS with Illumina technology accurately predicted resistance to four β-lactams commonly used in the treatment of neutropenic fever in major gram-negative bacterial pathogens [76]. The study demonstrated the merit of WGS specifically in regard to detecting insertions, deletions and single-nucleotide polymorphisms (SNPs) that widely used PCR assays fail to detect in certain AMR bacterial species. Although the study utilized manual sequence inspection, which was time consuming and not ideal for clinical microbiology, further advancements in the field could circumvent these obstacles [76]. Another noteworthy consideration when clinically applying NGS is the risk of full genome analysis of off-target human DNA in the samples being sequenced, owing to the technique’s sensitivity. The aforementioned approaches to bacterial enrichment will hopefully mitigate this problem [71].
To optimize the clinical usage of WGS in AMR detection, some have proposed using Oxford Nanopore sequencing in conjunction with NGS technology [74, 77] to reduce the error susceptibility of the former and improve the repeat resolution of the latter. These so-called hybrid assemblies are typically accomplished with SPAdes [78] or MaSuRCA [79], and the results are often polished with Pilon [80]. Wrappers combining these tools into an assembly pipeline are available, the foremost among them being Unicycler [81, 82]. npScarf can do the hybrid assembly as the data are being produced off the MinION/Flongle [83], allowing parsimonious reuse of flow cells. Another strategy that was formerly deployed is to polish the long reads with the short-read information before assembly, using PBcR [84], Nanocorr [84] or NaS [85], though the need for such correction has been greatly diminished by improvements in the ONT-provided base-caller Guppy and in nanopore chemistry since 2017.
For use at point of care, long-read-only assembly is the most practical strategy [86], with a number of software packages now being able to effectively deal with modern nanopore raw read error rates of ~5%. Popular long-read-only assemblers include canu [87], shasta [88], Flye [89], wtdbg [90], raven [91], miniasm [92] and ABruijn [94] (reviewed in [93]). With 100× coverage (easily achievable with a Flongle for most microbes), most tools provide single contig assembly for the genome and good elucidation of plasmids within minutes. These assemblies can be polished with racon [95], medaka (ONT) or nanopolish [96]. The practicalities of some of these tool combinations are explored in Senol et al.’s 2019 study [97]. In practice, the vast majority of errors that remain after long-read assembly and polishing of recent nanopore-only data are in homopolymer stretches of greater than 6 bases with the commonly used R9.4.1 pore, though even this is mitigated with newer R10.3 pores that have a larger contact interface, allowing accurate homopolymer sequencing up to 10 bases. These errors can also be mitigated in R9.4.1 by training the deep neural network base-callers in a species-specific manner [98], though this is counter to the wide-range microbial detection that is sought in point-of-care applications. If a homopolymer is miscalled relative to a reference microbial gene, it is reasonable to assume that the reference homopolymer length is correct in most cases; with such reference-biased correction, we would err on the side of overpredicting rather than underpredicting AMR. Popular software for assessing assembly quality, such as BUSCO [99], CheckM [100] and Quast [101], enumerates potential frameshifts and the presence of essential single-copy genes.
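As an illustration of where residual errors are most likely, the sketch below flags homopolymer runs longer than 6 bases in a sequence. It is a simple regular-expression scan under our own assumptions, not part of any of the cited tools, and the function name and example sequence are hypothetical.

```python
import re


def long_homopolymers(sequence: str, min_length: int = 7):
    """Yield (start, base, length) for homopolymer runs of at least min_length.

    The default of 7 corresponds to runs of greater than 6 bases, the range in
    which the text reports most remaining errors for the R9.4.1 pore.
    """
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    # One base followed by at least (min_length - 1) repeats of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    for match in pattern.finditer(sequence.upper()):
        yield match.start(), match.group(1), len(match.group(0))


# Hypothetical contig fragment: a 7-base A run, a 6-base T run, an 11-base G run.
contig = "ACGT" + "A" * 7 + "CG" + "T" * 6 + "G" * 11
runs = list(long_homopolymers(contig))
for start, base, length in runs:
    print(f"position {start}: {base} x {length}")
```

Positions are 0-based. Runs reported by such a scan that overlap an AMR gene could be prioritized for comparison against the reference homopolymer length.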
Once the patient sample is sequenced with this method, the sequence could be compared to reference sequences of known bacterial strains and AMR genes for real-time diagnosis [48]. Reference sequences for AMR gene detection are growing in variety, given the expanding library of AMR gene databases such as the Comprehensive Antibiotic Resistance Database (CARD), RAST and ResFinder. The detection of AMR genes in a similar fashion was demonstrated in Lemon et al.’s [48] study, which used the MinION to identify plasmid AMR genes, with polishing from MiSeq data to provide accurate reads. It is noteworthy that the study focused on sequencing plasmids. However, with technological improvements in ONT and informatics, the entire bacterial genome, including plasmids, can be sequenced for AMR-conferring genes or mutations in real time. Common bioinformatic tools that could be used for this workflow are listed, among others, in Table 6.
Tool . | Summary . | Matching technique . | Advantages and disadvantages . |
---|---|---|---|
ABRicate | Tool used for mass screening of contigs for AMR or virulence genes with eight pre-downloaded databases [130] | Uses BLAST [131] to screen whole genome against multiple databases | Can be bundled with multiple databases: NCBI, CARD, ARG-ANNOT, Resfinder, MEGARES, EcOH, PlasmidFinder, Ecoli_VF and VFDB. Only supports contigs and not FASTQ reads—only detects acquired resistance genes and not point mutations |
RGI | Software to predict resistomes, based on protein or nucleotide data, as well as metagenomics data, based on homology and SNP models [110] | (i) Predict protein sequences using Prodigal [132]; (ii) homolog detection using DIAMOND [133] for protein models; (iii) BLASTP for variant protein models; (iv) BLASTN for rRNA mutation models; (v) BLASTP for protein overexpression models | Specifically designed for assembly contigs or whole genomes; effective based on CARD’s canonical reference sequences. Has a false-negative rate due to a lack of sequence diversity found in CARD. Also contains false-positives for a specific epidemic strain of P. aeruginosa due to gaps in CARD curation [134] |
SRST2 | Short-read sequence typing for bacterial pathogens [135], which includes various database versions from Table 4 and others | Performs detection of genes and alleles directly from short sequencing reads using any sequence database(s) and calculates combinatorial sequence types defined in MLST-style databases | Limitations for SRST2 include: paired-end reads have to be in the FR orientation, mate-pair samples may not be supported, poor sample alignment with reference gene databases can produce errors in result outputs and all samples must have unique sample names |
AMRFinder | Tool that identifies AMR genes using either protein annotations or nucleotide sequence [136] | Uses a database of protein-based HMMs (gene prediction files) to find novel AMR genes | As of right now, does not include the tools necessary to analyze adaptive resistance mutations like point mutations in rRNA genes or promoter-related mutations |
KmerResistance | Correlates mapped genes with the predicted species of WGS sample [137, 138] | Examines the co-occurrence of k-mers between the WGS data and a database of resistance genes | Performs better than other known methods when data are contaminated or contain small amounts of sequence reads [137]. Cannot identify or interpret SNP variants that confer resistance—must identify resistance from specific genes or predefined alleles |
SSTAR | AMR gene predictor [139] | Combines a locally executed BLASTN search against a customizable database with an intuitive graphical user interface for identifying AMR (AR) genes from genomic data | Can seamlessly apply any AMR database of interest for reference comparisons and can manually specify/add genes that may impact resistance [139] |
MEGAres | Contains sequence data for approximately 8000 hand-curated AMR genes accompanied by an annotation structure that is optimized for use with high throughput sequencing [140] | Uses a translated BLAST search and USEARCH [140] to search annotations made through a unique hierarchical structure against a hand-curated MEGAres database for AMR genes | Does not provide detailed and comprehensive gene descriptions and multiple category annotations in order to provide simpler and hierarchical annotation schemes instead [140] |
Ariba | AMR gene ID by assembly directly from short reads [141] | Uses BLAST to match reference genes against the assembled genome from short reads | Highly customizable and can be seamlessly compared with phenotypic resistance data. However, not recommended for use with long-read technologies such as Oxford Nanopore. Samples can only have one gene per reference cluster, so it is not suggested for metagenomic data [141] |
Note. FR: forward-reverse; HMM: hidden Markov model.
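The k-mer co-occurrence matching listed for KmerResistance in Table 6 can be illustrated with a much-simplified sketch. This is not the KmerResistance algorithm: it only reports the fraction of each reference gene's k-mers that also occur in the sample, it ignores reverse complements and sequencing errors, and the function names, gene names and sequences are hypothetical toy data.

```python
def kmers(sequence: str, k: int) -> set[str]:
    """Return the set of all k-length substrings of a sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def gene_kmer_coverage(sample_sequences, gene_database, k: int = 16):
    """Fraction of each reference gene's k-mers that also occur in the sample."""
    sample_kmers = set()
    for sequence in sample_sequences:
        sample_kmers |= kmers(sequence, k)
    coverage = {}
    for gene_name, gene_sequence in gene_database.items():
        gene_kmers = kmers(gene_sequence, k)
        # Genes shorter than k have no k-mers and are scored as absent.
        shared = len(gene_kmers & sample_kmers)
        coverage[gene_name] = shared / len(gene_kmers) if gene_kmers else 0.0
    return coverage


# Toy example with a small k so the sequences stay readable.
toy_database = {"geneA": "ACGTTGCA", "geneB": "TTTTCCCC"}
toy_sample = ["GGACGTTGCAGG"]
toy_coverage = gene_kmer_coverage(toy_sample, toy_database, k=4)
for gene_name, fraction in toy_coverage.items():
    print(gene_name, f"{fraction:.2f}")
```

In this toy case every k-mer of `geneA` is found in the sample and none of `geneB`; a real screen would need a threshold on the shared fraction and a curated database such as those listed in the table.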
Note. FR: forward-reverse; HMM: hidden Markov model.
Many of the tools in Table 6, such as ABRicate, SRST2, SSTAR, Resistance Gene Identifier (RGI), MEGARes and ARIBA, can screen a customizable database assembled from multiple resistance gene databases, as they are based on the Basic Local Alignment Search Tool (BLAST). Thus, screening against all available resistance gene databases can be automated in one workflow. The other tools take different approaches. For instance, AMRFinder uses gene prediction to find novel AMR genes, which is advantageous for new discoveries and for addressing poor results from the aforementioned programs. KmerResistance allows poor-quality reads and samples with limited reads to retain clinical value, as it does not depend exclusively on sequence identity when comparing against a database.
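As a rough illustration of how screening against several resistance gene databases might be folded into one report, the following sketch merges per-database hits by gene. It is not the output format of any tool in Table 6: the `Hit` fields, the thresholds and the case-insensitive gene-name matching are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One AMR gene hit; the field names are hypothetical, not a real tool's schema."""
    database: str    # e.g. "CARD" or "ResFinder"
    gene: str        # gene name as reported by that database
    identity: float  # percent identity of the alignment
    coverage: float  # percent of the reference gene covered


def consolidate(hits, min_identity=90.0, min_coverage=80.0):
    """Group hits by gene, keep the best hit and list the supporting databases.

    The thresholds are illustrative defaults, not validated clinical cut-offs.
    Gene names are matched case-insensitively, which is a naive assumption:
    real databases may use different nomenclature for the same gene.
    """
    by_gene = defaultdict(list)
    for hit in hits:
        if hit.identity >= min_identity and hit.coverage >= min_coverage:
            by_gene[hit.gene.lower()].append(hit)

    report = {}
    for gene, gene_hits in by_gene.items():
        best = max(gene_hits, key=lambda h: (h.identity, h.coverage))
        report[gene] = {
            "best_hit": best,
            "databases": sorted({h.database for h in gene_hits}),
        }
    return report


if __name__ == "__main__":
    example = [
        Hit("CARD", "mecA", 99.5, 100.0),
        Hit("ResFinder", "mecA", 98.0, 100.0),
        Hit("NCBI", "blaZ", 85.0, 100.0),  # below the identity threshold
    ]
    for gene, entry in consolidate(example).items():
        print(gene, entry["databases"], entry["best_hit"].identity)
```

A gene supported by several databases is arguably a more trustworthy call than one supported by a single database, which is why the supporting databases are kept alongside the best hit.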
The aforementioned EPI2ME pipeline is valuable for its optimized bacterial ID capabilities through WIMP; it can also identify AMR genes in the identified bacteria through the ARMA AMR tool. ARMA is a workflow integrated with CARD that analyzes FASTQ data for AMR-conferring genes found within the bacterial genomes of species identified with WIMP [56]. With growing databases in both RefSeq and CARD, the EPI2ME pipeline for ONT technologies such as the MinION could be ideal for diagnosing a variety of bacterial infections and for optimizing treatment options depending on the AMR genes present. We propose that using the EPI2ME pipeline together with Flongle and VolTRAX technologies could largely automate patient sample preparation, considerably reduce turnaround times and simplify otherwise convoluted workflows, supporting its potential clinical applicability and utility at the point of care.
Adjusting patient-specific antibiotic therapy
Conventional tools for AMR screening are time consuming and resource heavy [102]. These methods of AMR detection in bacterial isolates have lengthy turnaround times of weeks and are limited by the panel of antibiotics used to determine the MIC, and thereby sensitivity or resistance to antibiotics [102, 103]. Using the Flongle for AMR determination through the EPI2ME workflow would avoid the lengthy turnaround time and identify resistance to specific antibiotics, as well as the associated resistance mechanisms, in AMR bacterial samples. Consequently, antibiotic susceptibility could be deduced from the sequencing results for AMR genes and used to treat patients accordingly. PATRIC is one of a growing number of databases that can help inform users of antibiotic therapies based on the pathogen species and the AMR genes present [18]. Another growing database that can be used to specify treatment and avoid the use of broad-spectrum antibiotics is the National Database of Antibiotic Resistant Organisms (NDARO), which aims to increase standardization and make AMR-related data more widely available by collecting genetic and antibiotic susceptibility data. A further database for evaluating potential treatments is the Sing group’s AMCN, a novel database system that lets users see information on the efficacies and concentrations of certain drug combinations against a particular bacterium or AMR gene [51].
Through bioinformatics, pipelines can be created to pass the output of EPI2ME to pathogen and antimicrobial databases that can help inform clinical point-of-care use. Such an automated pipeline can inform clinicians of possible antibiotic treatments and can be used in conjunction with other approaches to improve treatment decisions. As an example, consider a common case of a patient with cellulitis. If WIMP identifies Staphylococcus aureus as a pathogen present, PATRIC suggests 17 possible antibiotics that are used to treat infections caused by S. aureus. If ARMA detects that mecA, a common AMR gene of S. aureus, is present, the automated pipeline can flag methicillin as a poor treatment option. Additionally, the user can use AMCN to deduce the most efficacious combination of drugs to optimize recovery time, drug usage and patient quality of life. In this instance, we take two drug candidates suggested by PATRIC, vancomycin and linezolid. AMCN shows the percent combinations of these drugs with other drugs that can provide the most synergistic effect against S. aureus.
This information is very valuable in providing the most effective treatment option and for avoiding the use of broad-spectrum antibiotics.
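The cellulitis example can be sketched as a simple rule-based filter. This is a minimal illustration under stated assumptions, not the interface of PATRIC, ARMA or AMCN: the rule table contains only the mecA and methicillin pairing named in the text, and the candidate list is a three-drug subset chosen for illustration rather than the 17 antibiotics PATRIC suggests.

```python
# Hypothetical rule table: detected AMR gene -> antibiotics flagged as poor options.
# Only the mecA/methicillin pairing comes from the text; a real pipeline would
# derive such rules from curated databases.
RESISTANCE_RULES = {
    "mecA": {"methicillin"},
}


def filter_candidates(candidates, detected_genes, rules=RESISTANCE_RULES):
    """Split candidate antibiotics into retained options and flagged options.

    Returns (retained, flagged), where flagged maps each poor option to the
    detected genes that caused it to be flagged.
    """
    flagged_by_drug = {}
    for gene in detected_genes:
        for drug in rules.get(gene, set()):
            flagged_by_drug.setdefault(drug, []).append(gene)

    retained = [d for d in candidates if d.lower() not in flagged_by_drug]
    flagged = {
        d: flagged_by_drug[d.lower()]
        for d in candidates
        if d.lower() in flagged_by_drug
    }
    return retained, flagged


if __name__ == "__main__":
    # Illustrative subset of candidates for S. aureus.
    candidates = ["methicillin", "vancomycin", "linezolid"]
    retained, flagged = filter_candidates(candidates, detected_genes=["mecA"])
    print("retained:", retained)
    print("flagged:", flagged)
```

The output of such a filter is decision support only; the retained candidates would still need to be weighed by a clinician, for example against combination data of the kind AMCN provides.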
Comparison to current approaches
Ultimately, the approach is designed to take less than 3 h from sample collection to AMR detection, bacterial ID and antibiotic treatment suggestion. Depending on the kit applied, microbiome DNA isolation will take approximately 1 h, with an additional hour for library preparation. The sequencing, bacterial ID, AMR detection and antibiotic information are all provided in real time. We estimate that sequencing the entire microbiome will likely take less than 1 h [104]. In terms of cost, the Flongle is listed at $1800 USD, and the VolTRAX V2 is priced at $8150 USD, so total equipment costs are under $10 000 USD. The cost per sample varies with the kit used and is at most $150 USD.
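The time and cost figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the text; the step names are labels chosen for illustration, and the sequencing step is entered at its 1 h upper bound even though the text estimates it will take less.

```python
# Per-step durations in hours, taken from the text. Sequencing and analysis is
# entered at its upper bound (the text estimates "less than 1 h").
STEP_HOURS = {
    "dna_isolation": 1.0,
    "library_preparation": 1.0,
    "sequencing_and_analysis": 1.0,
}

# List prices in USD as quoted in the text.
EQUIPMENT_USD = {
    "Flongle": 1800,
    "VolTRAX V2": 8150,
}


def turnaround_hours(steps=STEP_HOURS):
    """Upper bound on sample-to-result time, assuming the steps run sequentially."""
    return sum(steps.values())


def equipment_cost_usd(items=EQUIPMENT_USD):
    """Total one-off equipment cost."""
    return sum(items.values())


if __name__ == "__main__":
    print(f"Turnaround upper bound: {turnaround_hours():.1f} h")
    print(f"Equipment cost: ${equipment_cost_usd()} USD")
```

The equipment total comes to $9950 USD, consistent with the stated figure of under $10 000 USD, and the 3 h total is an upper bound because the sequencing step is expected to finish in under an hour.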
In terms of resolution, this approach offers the highest resolution and theoretically the greatest accuracy, so long as the microbiome reads are accurate and reliable. Regarding time and cost, this approach has a faster turnaround time than the fastest commercial AST techniques currently available (identified in Table 1), with the exception of Biofire. However, the equipment cost of Biofire is considerably higher than that of a nanopore-based approach, and Biofire is more suited to centralized testing, with inherently longer turnaround times due to the logistics of sample movement. A key advantage of the proposed approach is the avoidance of centralized hospital machinery. Most hospitals and clinical laboratories have only one system for bacterial ID and AST; for instance, hospitals and labs often share systems such as the MicroScan Walkaway and the MALDI-TOF MS. A downside of this design is that each patient often faces a long wait for a result. With many cases depending on a single machine, this approach increases the amount of time the infection has to progress. With the handheld Flongle, this can be avoided by using one Flongle per case, with the subsequent analysis performed on modern laptops. This point-of-care approach to diagnosis is critical for reducing turnaround times.
This approach can be recreated using other sequencing technologies and different downstream bioinformatic analysis programs. For instance, one could use Illumina sequencing data and the previously mentioned LiveKraken, a real-time read classifier that uses streams of raw data from Illumina sequencers to classify reads taxonomically [57]. However, ONT data can provide greater taxonomic resolution than Illumina read classification, making it more advantageous in this approach [104]. A similar design could use PacBio data and sppIDer [111], a species classification tool designed for working with PacBio data; however, this combination is not intended to function in real time.
Limitations of this approach include the degree of information available in databases such as PATRIC and NDARO. However, these databases are constantly growing with new AMR genes, bacterial species and antibiotic susceptibility information. The Oxford Nanopore sequencing technologies are also relatively recent and are expected to improve in terms of speed, accuracy and price as their popularity increases. The performance of lysis methods may be subject to several factors, including the sensitivity of the target microorganism to lysis [106], and further research will be needed to optimize lysis reagents to improve the workflow. Furthermore, this approach is somewhat dependent on the level of background reads from the human host in comparison to microbial reads [106], which can affect the sensitivity and duration of the method. Finally, this type of approach to AMR detection is prone to false-negative results [106]; however, it would be considered a frontline ID method that could deal with a significant number of cases. When frontline sequence-based testing is negative, existing, established practices should be used for ID. The deployment of sequencing in the clinical context requires it to be considered as part of larger systems, including validation and regulatory standards [106].
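To make the effect of host background concrete, the following sketch estimates how many reads must be sequenced in total to obtain a target number of microbial reads. It is a back-of-the-envelope model under loud assumptions: the host percentage is treated as a fixed, known proportion, sampling variance is ignored and the example numbers are hypothetical rather than taken from any cited study.

```python
def total_reads_needed(target_microbial_reads, host_percent):
    """Expected total reads required to obtain the target number of microbial reads.

    host_percent is the whole-number percentage of reads that come from the
    human host (0 to 99). Integer arithmetic is used so that the result is
    rounded up exactly, without floating-point surprises.
    """
    if not 0 <= host_percent <= 99:
        raise ValueError("host_percent must be between 0 and 99")
    if target_microbial_reads < 0:
        raise ValueError("target_microbial_reads must be non-negative")
    microbial_percent = 100 - host_percent
    # Ceiling division: total * microbial_percent / 100 >= target
    return (target_microbial_reads * 100 + microbial_percent - 1) // microbial_percent


if __name__ == "__main__":
    # Hypothetical target of 1000 microbial reads at increasing host background.
    for host in (0, 50, 90, 99):
        print(f"{host}% host reads -> {total_reads_needed(1000, host)} total reads")
```

Because sequencing time scales roughly with the number of reads required, a sample with 99% host reads would need on the order of 100 times the sequencing effort of a host-free sample in this simple model, which is why host depletion and lysis efficiency matter for turnaround time.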
Conclusions
Avoiding the use of broad-spectrum antibiotics is critical for improving therapies and for preventing the emergence of AMR bacteria. Generalizing and extending the reviewed examples, we propose a pipeline framework using ONT products (Flongle, VolTRAX V2 and EPI2ME), third-party software and databases (assembly and AMR ID) and DNA extraction processes, which should aid in the urgent diagnosis and treatment of bacterial infections. This approach could improve on a variety of current diagnostic techniques in terms of turnaround time, cost and accuracy. The main improvements over NGS in microbial applications are (1) the automation and potential clinical point-of-care use that this pipeline presents and (2) the ability to analyze sequence data as they are produced (i.e. within seconds rather than hours/days).
While all the bioinformatics and hardware pieces of the clinical microbiology sequencing puzzle exist and have been enumerated here, we have also highlighted additional bioinformatics components that would aid in maximizing the utility of the approach: (1) systematically combining the results of multiple read-to-species classification tools to reduce false-positives and (2) automated and frequent consolidated updates to species and AMR gene ID databases.
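One simple way to combine the results of multiple read-to-species classification tools is a majority vote per read, sketched below. This is only one possible combination scheme, offered as an assumption rather than a method proposed in the reviewed studies; the tool names are placeholders and the vote threshold is illustrative.

```python
from collections import Counter


def consensus_species(calls, min_votes=2):
    """Return the species supported by at least min_votes tools, else None.

    calls maps a tool name to the species label it assigned to a read, or to
    None if the tool left the read unclassified. Ties for the top label are
    treated as no consensus, which favors fewer false positives over sensitivity.
    """
    votes = Counter(label for label in calls.values() if label is not None)
    if not votes:
        return None
    ranked = votes.most_common()
    top_label, top_count = ranked[0]
    if top_count < min_votes:
        return None
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None  # tie between two labels: no consensus
    return top_label


if __name__ == "__main__":
    # Placeholder tool names; the labels are illustrative.
    read_calls = {
        "tool_a": "Staphylococcus aureus",
        "tool_b": "Staphylococcus aureus",
        "tool_c": "Staphylococcus epidermidis",
    }
    print(consensus_species(read_calls))
```

Requiring agreement trades some sensitivity for specificity: a read called by only one tool is discarded, so a species must be supported consistently before it is reported.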
Key Points
- Commonly used methods for clinical infection diagnostics rely on high capital cost equipment and require centralized processing, sample batching and/or skilled labor.
- The emerging availability of automated sample preparation and low capital cost DNA sequencing equipment (nanopore-based) will enable diagnostic testing closer to point of care by combining OEM tools with third-party software and databases.
- Sequencing-based clinical diagnostics have been demonstrated repeatedly at small scale by researchers.
- Sequencing-based diagnosis leverages large (and ever-growing) public catalogs of both existing clinical isolates and AMR genes.
- Quality control mechanisms established for whole microbial genome sequencing can provide indication of diagnostic quality to point-of-care sequencing users.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Conflict of interest
Dropen Sheka, Nikolay Alabi, and Paul M.K. Gordon declare that they have no conflict of interest.
Dropen Sheka is a data-driven and computational oncology researcher in the Department of Biochemistry and Molecular Biology in the Faculty of Medicine at the University of Calgary.
Nikolay Alabi is a data-driven and computational oncology researcher in the Department of Biochemistry and Molecular Biology in the Faculty of Medicine at the University of Calgary.
Dr Paul M. K. Gordon is the Bioinformatics Manager at the Cumming School of Medicine’s Centre for Health Genomics and Informatics, a core facility of the University of Calgary. He has been doing nanopore sequence analysis since 2014 and microbial bioinformatics generally since 1996.
References
Author notes
Dropen Sheka and Nikolay Alabi contributed equally to this work.