Abstract

Background

Oxford Nanopore Technology (ONT) third-generation sequencing (TGS) is a versatile genetic diagnostic platform. However, preparing long-template libraries for long-read TGS remains challenging, particularly for ONT-based analysis of hemoglobinopathy variants that involve complex structures and occur in GC-rich and/or homologous regions.

Methods

A multiplex long PCR was designed to prepare library templates, including whole-gene amplicons for HBA2/1, HBG2/1, HBD, and HBB, as well as allelic amplicons for targeted deletions and special structural variations. Libraries were constructed from the long-PCR products and sequenced on an Oxford Nanopore MinION instrument. Genotypes were identified based on Integrative Genomics Viewer (IGV) plots.

Results

This novel long-read TGS method distinguished all single nucleotide variants and structural variants within HBA2/1, HBG2/1, HBD, and HBB based on the whole-gene sequence reads. Targeted deletions and special structural variations were also identified according to the specific allelic reads. Results for 158 α-/β-thalassemia samples showed 100% concordance with previously known genotypes.

Conclusions

This ONT TGS method is high-throughput and can be used for molecular screening and genetic diagnosis of hemoglobinopathies. Multiplex long PCR is an efficient strategy for library preparation and provides a practical reference for TGS assay development.

Introduction

Long-read third-generation sequencing (TGS) is a new revolution in sequencing and has paved the way to analyze complex structural changes in the genome (1, 2). Presently, there are 2 TGS platforms, Pacific Biosciences (PacBio) and Oxford Nanopore Technology (ONT). The PacBio TGS platform is based on a single-molecule, real-time sequencing strategy that detects the light signals when nucleotides are incorporated by DNA polymerase. The ONT TGS platform relies on electric current fluctuations caused by individual DNA or RNA molecules passing through biological pores. ONT TGS has the advantages of faster library preparation, real-time sequence data analysis, and longer reads for better alignment (3, 4). The ONT TGS is also a versatile platform with a series of instruments, including the pocket-size MinION and the ultra-high-throughput PromethION, and its sequencing flow cell can be used repeatedly, thereby providing cost savings (1, 5, 6).

Hemoglobinopathy, which includes abnormal hemoglobin and thalassemia, is the most common monogenic disorder and is caused by defects of the globin gene clusters (7–9). At present, more than 1860 human hemoglobin variants have been found in the globin genes (https://globin.bx.psu.edu/). The majority (approximately 1400) are abnormal hemoglobin variants, which are distributed worldwide and of which only one-third are clinically relevant. There are around 530 thalassemia variants, which are prevalent in tropical/subtropical regions and affect about 350 million people (10, 11). Thalassemia is a hereditary anemia with considerable morbidity and mortality, and the prevention program for this disorder, including prenatal genetic diagnosis, has always been an important public health issue. Therefore, comprehensive analysis of hemoglobinopathy variants is essential in clinical practice, not only for the differential diagnosis of pathological/nonpathological abnormal hemoglobin and thalassemia but also for the genetic diagnosis of thalassemia and the identification of at-risk fetuses (11–13).

The spectrum of genetic variation in hemoglobinopathies is highly heterogeneous (12, 14, 15). Abnormal hemoglobin is generally caused by single nucleotide variants (SNVs) of the HBA, HBB, HBD, and HBG genes, resulting in structural changes to hemoglobin. Thalassemia mutations are particularly complex and diverse, including deletional/nondeletional (--, -α, αTα) α-thalassemia mutations, SNVs of HBB (β), and β-thalassemia deletions. Moreover, there are also other special HBA structural variations (SVs), including the multicopy of HBA2 (αααanti4.2), the multicopy of HBA1 (αααanti3.7), and the fusion allele (HKαα).

Previously, gap-PCR and PCR reverse dot blot (PCR-RDB) methods were routinely used for detection of common α-/β-thalassemia deletions and point mutations. These methods are time consuming and provide information on only a limited number of thalassemia variants (16–18). Subsequently, next-generation sequencing (NGS) came to be considered an effective approach for hemoglobinopathy variant screening, with the advantages of high throughput and comprehensive gene coverage (19–21). However, NGS is still limited by its short reads, which make it difficult to ascertain variants in the GC-rich, homologous globin genes. Recently, PacBio-based TGS methods have been developed for simultaneous screening of thalassemia variants, and they have demonstrated improved performance in allele coverage, reliability, and accuracy (22–24).

In this study, an ONT MinION was used and a novel TGS method was developed for comprehensive analysis of hemoglobinopathy variants. All SNVs/SVs within the sequences of HBA2/1, HBG2/1, HBD, and HBB genes could be detected, and targeted variants of 6 common α-thalassemia deletions (--THAI, --FIL, --SEA, -α27.6, -α4.2, -α3.7), 2 β-thalassemia deletions (SEA-HPFH, Gγ+(Aγδβ)0), 2 HBA multicopies (αααanti4.2, αααanti3.7), and a fusion allele (HKαα) were also identified.

Materials and Methods

Subjects and DNA Samples

A total of 158 genomic DNA (gDNA) samples were collected from the sample bank of the Medical Genetics Department in Southern Medical University (Guangzhou, China), including 19 normal samples and 139 hemoglobinopathy carriers of 24 globin gene variants. All subjects were pregenotyped with gap-PCR and PCR-RDB kits for common α-/β-thalassemia variants, and with Sanger sequencing or multiplex ligation-dependent probe amplification for other SNVs or SVs of globin genes (HBA, NC_000016.10; HBB, NC_000011.10; GRCh38). Informed consent was obtained from all subjects before the samples were collected.

Methodological Strategy

The ONT-based TGS assay procedure included template preparation by multiplex long PCR, library construction by addition of barcodes and sequencing adapters (motor protein included), single-molecule sequencing on an ONT TGS instrument, and data analysis with supporting software (Fig. 1, A and additional details later). In principle, the critical step for the TGS assay is the preparation of sequencing templates by multiplex long PCR. Accordingly, a set of primers was designed to amplify targeted sequencing templates, including the allelic amplicons for α-/β-thalassemia deletions and special structural variations, as well as the whole-gene sequence amplicons of HBA2/1, HBG2/1, HBB, and HBD for SNVs and SVs within these genes (Fig. 1, B and C and Table 1).

Fig. 1. Workflow and amplicons of the ONT TGS method for detection of hemoglobinopathy variants. (A), Workflow illustration of the MinION-based TGS assay for hemoglobinopathy variant detection; (B), primer positions and amplicon ranges of the multiplex long PCR for target α-/β-globin genes. Functional genes are shown in red letters. The arrows indicate the position and extension direction of primers; (C), target amplicon lengths of the multiplex long PCR and electrophoresis results in agarose gel. Chrom: chromosome.

Table 1. Primers and amplicons of multiplex long PCR.

| Name | Sequence (5'-3') | Position (hg38) | Amplicons |
| --- | --- | --- | --- |
| HBA-F | AGCTAGAGCATTGGTGGTCATGCCC | Chr16:169343-169367 | HBA2/1, 3.7, 4.2, αααanti3.7, αααanti4.2, HKαα, anti-HKαα |
| HBA-R | GACTTCGCGGTGGCTCCACTTTCC | Chr16:177663-177686 | |
| HBD-F (a) | ATGCTCATGGGTGTGTATTTGTCTGCC | Chr11:5235371-5235400 | HBD, HBB |
| HBB-R | ACAGAGACAACTAAGGCTGAGTGGCC | Chr11:5225010-5225038 | |
| HBG-F (b) | GCTTTGTTGCGCAGGTCAACATGTATC | Chr11:5256171-5256199 | HBG2/1 |
| HBG-R | GGGGCTGCTCTGCCTATGCAGTAGTC | Chr11:5246661-5246686 | |
| SEA-F (c) | ACGGAGCGATCTGGGCTCTGTGTTC | Chr16:165251-165275 | --SEA |
| SEA-R (c) | CTAGCCCAGCCAACTGTCTCCAAGG | Chr16:187361-187385 | |
| TF-F (d) | GGACTCCCAGCCCACACAAACCATC | Chr16:143710-143734 | --THAI, --FIL |
| TF-R | CTTGGATCTGCACCTCTGGGTAGGTTC | Chr16:183339-183366 | |
| SH-R | GGAGCTCTGGCCTCTTGAGATGTTCA | Chr11:5201321-5201349 | SEA-HPFH |
| DB-R | CCCCAGAAATTGCCTCATGTCTCTCCC | Chr11:5169829-5169855 | Gγ+(Aγδβ)0 |

(a) HBD-F is the shared forward primer for HBB-R and SH-R.

(b) HBG-F is the shared forward primer for HBG-R and DB-R.

(c) The amplicon of SEA-F and SEA-R contains an alpha 3' HVR fragment (Fig. 1, B).

(d) The primer pair TF-F and HBA-R is designed for the -α27.6 deletion.

F, forward; R, reverse; Chr, chromosome.
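As a rough cross-check of the amplicon design, the reference span between two primers can be computed from the hg38 coordinates in Table 1. The sketch below covers only the three whole-gene primer pairs, because deletion alleles give products shorter than the reference span. The function name `reference_span` and the dictionary layout are illustrative assumptions, not part of the published method.

```python
# Primer coordinates (hg38) copied from Table 1: (chromosome, start, end).
PRIMERS = {
    "HBA-F": ("Chr16", 169343, 169367),
    "HBA-R": ("Chr16", 177663, 177686),
    "HBD-F": ("Chr11", 5235371, 5235400),
    "HBB-R": ("Chr11", 5225010, 5225038),
    "HBG-F": ("Chr11", 5256171, 5256199),
    "HBG-R": ("Chr11", 5246661, 5246686),
}


def reference_span(primer_a: str, primer_b: str) -> int:
    """Return the inclusive reference length (bp) bounded by two primers."""
    chrom_a, start_a, end_a = PRIMERS[primer_a]
    chrom_b, start_b, end_b = PRIMERS[primer_b]
    if chrom_a != chrom_b:
        raise ValueError("primers lie on different chromosomes")
    # Orientation-agnostic: take the outermost coordinates of the pair.
    return max(end_a, end_b) - min(start_a, start_b) + 1


if __name__ == "__main__":
    for pair in [("HBA-F", "HBA-R"), ("HBD-F", "HBB-R"), ("HBG-F", "HBG-R")]:
        print(pair, reference_span(*pair), "bp")
```

The spans are taken from the listed coordinates only; actual product sizes should be confirmed on a gel, as in Fig. 1, C.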


Multiplex Long PCR

For the optimized multiplex long PCR, a 50 μL reaction was prepared containing 50 to 80 ng gDNA, 0.4 mol/L betaine (Sigma-Aldrich), 4.0% DMSO (Sigma-Aldrich), 0.35 mmol/L dNTPs, 0.2 μmol/L PCR primer mix, and 1.5 U LA Taq polymerase in 1× LA PCR Buffer (Takara Bio Co. Ltd.). The optimized PCR conditions were as follows: initial denaturation at 95°C for 10 min; 30 cycles of denaturation at 95°C for 1 min and extension at 68°C for 13 min; a final repair step at 72°C for 30 min; and a hold at 12°C. Subsequently, the PCR products were purified with 1.0 volume of Ampure XP beads (Beckman Coulter) and diluted to 20 ng/μL.
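The cycling program implies a minimum thermocycler time that is easy to check. The following sketch adds up only the programmed hold times stated above; ramp times between temperatures and the final 12°C hold are not included, so the real run is somewhat longer. The function name is a hypothetical helper.

```python
def pcr_programmed_minutes(initial_denaturation: float,
                           cycles: int,
                           denaturation: float,
                           extension: float,
                           final_step: float) -> float:
    """Sum the programmed hold times of a two-step PCR (minutes)."""
    return initial_denaturation + cycles * (denaturation + extension) + final_step


# Values from the protocol: 95°C 10 min; 30 x (95°C 1 min, 68°C 13 min); 72°C 30 min.
total_min = pcr_programmed_minutes(10, 30, 1, 13, 30)
hours, minutes = divmod(total_min, 60)

if __name__ == "__main__":
    print(f"{total_min} min = {hours} h {minutes} min")
```

By this count the program takes 460 min (7 h 40 min), which accounts for most of the 9 h that the authors allocate to gDNA extraction and multiplex long PCR.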

Library Construction

Library construction was performed using the SQK-LSK109 kit (ONT). First, the NEBNext® Ultra™ II End-Repair/dA-Tailing Module (NEB) was used for end repair of 400 ng purified PCR product, according to the kit manual. After purification with 1.0 volume of Ampure XP beads and elution with 13 μL nuclease-free water, barcodes were ligated to the A-tailed DNA of each sample using the NEBNext® Ultra™ II Ligation Module (NEB). Thereafter, the barcode-ligated products were purified with 1.0 volume of Ampure XP beads, eluted with 15 μL nuclease-free water, and diluted to 10 ng/μL.

Next, the barcode-ligated products of the samples to be sequenced on the same flow cell were pooled in equal quantities, and adapter ligation was performed using the NEBNext® Ultra™ II Quick Ligation Module (NEB). After purification with 0.8 volume of Ampure XP beads (Beckman) and washing with 2 volumes of L Fragment Buffer (ONT) to remove nonligated components, the adapter-ligated products were eluted with 14 μL elution buffer. Finally, the enriched long templates were diluted to 25 ng/μL and used as the sequencing library.
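The protocol repeatedly dilutes products to a working concentration (20, 10, and 25 ng/μL) and then takes a fixed mass (400 ng of PCR product, 100 ng of library). A minimal sketch of that arithmetic is given below. The helper names and the example stock concentration are illustrative assumptions, since measured concentrations are not reported.

```python
def volume_for_mass(mass_ng: float, conc_ng_per_ul: float) -> float:
    """Volume (μL) that contains the requested mass at a given concentration."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return mass_ng / conc_ng_per_ul


def diluent_to_add(stock_conc: float, stock_vol_ul: float, target_conc: float) -> float:
    """Diluent volume (μL) needed to bring a stock to target_conc (C1*V1 = C2*V2)."""
    if target_conc <= 0 or target_conc > stock_conc:
        raise ValueError("target must be positive and not exceed the stock")
    final_vol = stock_conc * stock_vol_ul / target_conc
    return final_vol - stock_vol_ul


if __name__ == "__main__":
    # 400 ng of PCR product at 20 ng/μL for end repair.
    print(volume_for_mass(400, 20), "μL")
    # 100 ng of library at 25 ng/μL for loading.
    print(volume_for_mass(100, 25), "μL")
    # Hypothetical eluate: 15 μL at 60 ng/μL diluted to 10 ng/μL.
    print(diluent_to_add(60, 15, 10), "μL of diluent")
```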

ONT-Based Sequencing and Base Calling

Using the Flow Cell Priming Kit (ONT), 100 ng of sequencing library was loaded into the MinION flow cell (ONT), according to the manufacturer's instructions. The MinION Mk1B device was then connected to the MinKNOW software v.2.0, the number of active pores was confirmed, and sequencing was initiated and completed within 5 h. Base calls were obtained from the raw current measurements of the sequencing reads.

Bioinformatic Analysis of TGS Data

Raw sequencing data from the MinION Mk1B instrument were filtered using thresholds of sequencing depth >200× and Phred quality score (Q score) ≥7 and then analyzed bioinformatically. First, a BAM file was generated by aligning the raw reads to the human reference genome (hg38) using Minimap2 version 2.17-r941 (25) and was then sorted and indexed using SAMtools version 1.7 (26, 27). Next, the sequences of the target templates were obtained, and the SNVs, SVs, or deletions/insertions of the globin genes were identified using DeepVariant version 1.1.0 (28). Finally, the alignments of variant and reference alleles were displayed in Integrative Genomics Viewer (IGV) version 2.12.3 (Figs. 2 and 3).
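The alignment, sorting, and indexing steps can be sketched as a short command builder. The text does not report the command-line options used, so the `map-ont` preset, the thread count, and all file names below are assumptions. Variant calling is left out because the DeepVariant model and options are not stated.

```python
import subprocess
from typing import List


def build_alignment_commands(reference_fasta: str,
                             reads_fastq: str,
                             out_prefix: str,
                             threads: int = 4) -> List[List[str]]:
    """Return minimap2/samtools commands for align -> sort -> index."""
    sam = f"{out_prefix}.sam"
    bam = f"{out_prefix}.sorted.bam"
    return [
        # -a: SAM output; -x map-ont: Nanopore read preset (assumed here).
        ["minimap2", "-a", "-x", "map-ont", "-t", str(threads),
         "-o", sam, reference_fasta, reads_fastq],
        # Coordinate-sort the alignments into a BAM file.
        ["samtools", "sort", "-@", str(threads), "-o", bam, sam],
        # Index the sorted BAM so that IGV can load it.
        ["samtools", "index", bam],
    ]


def run_pipeline(commands: List[List[str]]) -> None:
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file names; printing instead of running keeps this a dry run.
    for cmd in build_alignment_commands("hg38.fa", "barcode01.fastq", "barcode01"):
        print(" ".join(cmd))
```

Building the commands as lists and running them without a shell avoids quoting problems with file paths. The resulting sorted, indexed BAM is the file that would be opened in IGV.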

Fig. 2. IGV plots of base calls aligned with reference sequences of the human genome (hg38). The normal and variant alleles are shown at the top and bottom of each diagram, respectively. The blue arrows indicate the position and extension direction of primers. The complete gray area indicates the target sequence without deletion. The blank area in the middle of the target sequence indicates a large deletion (A, B, C, D, E, F). The recombinant structural variants αααanti3.7 and αααanti4.2 show a 3.7 kb or 4.2 kb duplication in the target HBA1/2 sequence (G, H). The structural variant HKαα shows a 4.2 kb insertion and a 3.7 kb deletion simultaneously in the variant allele (I).

Fig. 3. IGV plots of SNVs. The reference sequences of the human genome (hg38) are displayed at the bottom of each diagram, with the positive-sense strand of HBA1/2 and the anti-sense strand of HBB. The normal nucleotides of each read are shown as gray bars. ▾, variation position. (A), The SNVs of HBA2; (B), the SNVs of HBB.

Results

Multiplex Long PCR for Library Preparation

As shown in Fig. 1 and Table 1, a multiplex long PCR was established to amplify target genes and alleles, including the primer pair HBA-F/HBA-R for HBA2/HBA1 and structural variants involving HBA2/HBA1 (3.7, 4.2, αααanti3.7, αααanti4.2, HKαα, anti-HKαα), HBD-F/HBB-R for HBB and HBD, HBG-F/HBG-R for HBG2/HBG1, SEA-F/SEA-R for the --SEA deletion, TF-F/TF-R for the --THAI and --FIL deletions, TF-F/HBA-R for the -α27.6 deletion, HBD-F/SH-R for the SEA-HPFH deletion, and HBG-F/DB-R for the Chinese Gγ+(Aγδβ)0 deletion. In summary, there were 3 whole-gene amplicons (HBA2/1, HBG2/1, and HBB/HBD) used as the TGS templates for the detection of SNVs and SVs within these target sequences. There were 8 other allelic amplicons for deletions and 2 allelic amplicons for HBA multicopies, ranging from approximately 5.0 to 12.5 kb.

MinION-Based TGS Data Outcomes

Using the MinION Mk1B and a FLO-MIN106D R9.4 flow cell, about 10 Gb of sequencing data was generated within 5 h of sequencing time, which was sufficient for analysis of 24 tested samples (approximately 400 Mb per sample). The TGS data were also high in yield and quality, with an average pass ratio, average map ratio, and target map ratio of 96.94%, 96.73%, and 86.87%, respectively. Typically, the mean read depths for the 3 main target regions of HBA, HBB, and HBG were 3932×, 3452×, and 4020×, respectively (see Supplemental Tables 1 and 2 in the online Data Supplement).

Sequencing Data Alignment and Allele Identification

The ONT sequencing data were aligned to the target reference sequences of the α-/β-globin gene clusters, and the variant alleles were then identified using IGV plots (Figs. 2 and 3). In this study, a total of 24 globin gene variants were detected in clinical gDNA samples using this novel ONT TGS method, including 4 HBA deletions (--THAI, --SEA, 3.7, 4.2), 2 HBB deletions (SEA-HPFH, Gγ+(Aγδβ)0), 2 HBA multicopies (αααanti3.7, αααanti4.2), 1 HBA fusion allele (HKαα), 3 nondeletional α-thalassemia variants (WS, HBA2:c.369C > G; QS, HBA2:c.377T > C; CS, HBA2:c.427T > C), and 12 β-thalassemia variants (HBB:c.-82C > A, HBB:c.-79A > G, HBB:c.-78A > G, HBB:c.2T > G, HBB:c.45dupG, HBB:c.52A > T, HBB:c.79G > A, HBB:c.85dupC, HBB:c.92 + 1G > T, HBB:c.126_129delCTTT, HBB:c.217dupA, HBB:c.316-197C > T). As shown in Fig. 2, the IGV plots displayed the range of the deleted, duplicated, or fused sequence, and base calling then allowed accurate identification of these variants. As shown in Fig. 3, the normal and variant alleles were also accurately distinguished, based on the IGV plots of base-calling data aligned with the target reference sequences.

Validation of the ONT TGS Method

To evaluate the sensitivity and accuracy, a total of 158 blinded thalassemia samples were analyzed with the ONT TGS method (Table 2). The evaluation result showed a 100% concordance with the predetermined genotypes. Meanwhile, 50 thalassemia samples were analyzed in duplicate to ascertain reproducibility of this method. The genotype results were 100% concordant in the 2 test runs.

Table 2. 158 pregenotyped α-/β-thalassemia samples identified by the ONT-based TGS method.

| Genotype | ONT-based TGS: detected variant | ONT-based TGS: genotype | n |
| --- | --- | --- | --- |
| Normal | | | |
| αα/αα, βNN | | αα/αα, βNN | 19 |
| HBA deletion | | | |
| --SEA/αα, βNN | --SEA | --SEA/αα, βNN | 20 |
| -α3.7/αα, βNN | -α3.7 | -α3.7/αα, βNN | 21 |
| -α4.2/αα, βNN | -α4.2 | -α4.2/αα, βNN | 14 |
| --THAI/αα, βNN | --THAI | --THAI/αα, βNN | 9 |
| -α3.7/-α4.2, βNN | -α3.7, -α4.2 | -α3.7/-α4.2, βNN | 4 |
| -α3.7/-α3.7, βNN | -α3.7, -α3.7 | -α3.7/-α3.7, βNN | 2 |
| -α4.2/-α4.2, βNN | -α4.2, -α4.2 | -α4.2/-α4.2, βNN | 1 |
| --SEA/-α3.7, βNN | --SEA, -α3.7 | --SEA/-α3.7, βNN | 4 |
| --SEA/-α4.2, βNN | --SEA, -α4.2 | --SEA/-α4.2, βNN | 1 |
| HBA SNV | | | |
| αWSα/αα, βNN | HBA2:c.369C > G | αWSα/αα, βNN | 3 |
| αQSα/αα, βNN | HBA2:c.377T > C | αQSα/αα, βNN | 5 |
| αCSα/αα, βNN | HBA2:c.427T > C | αCSα/αα, βNN | 3 |
| HBA SV | | | |
| αααanti3.7/αα, βNN | αααanti3.7 | αααanti3.7/αα, βNN | 3 |
| αααanti4.2/αα, βNN | αααanti4.2 | αααanti4.2/αα, βNN | 3 |
| HKαα/αα, βNN | HKαα | HKαα/αα, βNN | 3 |
| HBB deletion | | | |
| αα/αα, βSEA-HPFHN | SEA-HPFH | αα/αα, βSEA-HPFHN | 4 |
| αα/αα, βGγ+(Aγδβ)0N | Gγ+(Aγδβ)0 | αα/αα, βGγ+(Aγδβ)0N | 3 |
| HBB SNV | | | |
| αα/αα, β-32N | HBB:c.-82C > A | αα/αα, β-32N | 1 |
| αα/αα, β-29N | HBB:c.-79A > G | αα/αα, β-29N | 1 |
| αα/αα, β-28N | HBB:c.-78A > G | αα/αα, β-28N | 1 |
| αα/αα, βIntN | HBB:c.2T > G | αα/αα, βIntN | 2 |
| αα/αα, βCD14–15N | HBB:c.45dupG | αα/αα, βCD14–15N | 1 |
| αα/αα, βCD17N | HBB:c.52A > T | αα/αα, βCD17N | 3 |
| αα/αα, βCD26N | HBB:c.79G > A | αα/αα, βCD26N | 4 |
| αα/αα, βCD27–28N | HBB:c.85dupC | αα/αα, βCD27–28N | 3 |
| αα/αα, βIVS-I-1N | HBB:c.92 + 1G > T | αα/αα, βIVS-I-1N | 1 |
| αα/αα, βCD41–42N | HBB:c.126_129delCTTT | αα/αα, βCD41–42N | 5 |
| αα/αα, βCD71–72N | HBB:c.217dupA | αα/αα, βCD71–72N | 2 |
| αα/αα, βIVS-II-654N | HBB:c.316-197C > T | αα/αα, βIVS-II-654N | 5 |
| Variant compound | | | |
| -α3.7/αα, β-29N | -α3.7, HBB:c.-79A > G | -α3.7/αα, β-29N | 1 |
| -α3.7/αα, βCD17N | -α3.7, HBB:c.52A > T | -α3.7/αα, βCD17N | 2 |
| -α3.7/αα, βCD41–42N | -α3.7, HBB:c.126_129delCTTT | -α3.7/αα, βCD41–42N | 1 |
| -α3.7/αα, βIVS-II-654N | -α3.7, HBB:c.316-197C > T | -α3.7/αα, βIVS-II-654N | 1 |
| --SEA/αα, βCD41–42N | --SEA, HBB:c.126_129delCTTT | --SEA/αα, βCD41–42N | 1 |
| --SEA/HKαα, βNN | --SEA, HKαα | --SEA/HKαα, βNN | 1 |
| Total | | | 158 |

Discussion

A novel TGS method, based on the ONT MinION platform, was developed in this study for genetic testing of hemoglobinopathy variants. The results from 158 samples showed that all target variants were correctly identified by this method. The turnaround time of this method is approximately 20 h, including 9 h for gDNA extraction and multiplex long PCR, 3 h for library construction, and 8 h for sequencing and data analysis. Each run of a flow cell can sequence 24 tested samples, and more samples can be analyzed by repeating sequencing runs on a flow cell (5 h each), with the same presequencing preparation process (12 h). Additionally, the amount of ONT sequencing data accumulates in proportion to sequencing time, so the duration of each sequencing run can be adjusted according to the number of samples loaded and the sequencing depth required. For example, about 10 Gb of data was generated within 5 h in this study; additional sequencing time would yield more data if deeper coverage is required or more samples are loaded. Overall, these results demonstrated that ONT TGS is a practical, versatile, and rapid platform with the potential to be widely used in molecular detection and genetic diagnosis.
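The timing and yield figures above can be combined into a simple planning calculation. The sketch below assumes, as stated in the text, that yield grows linearly with sequencing time; in practice output usually tapers as pores become inactive, so the estimate is optimistic. All function names are hypothetical.

```python
# Figures reported in this study.
RUN_YIELD_GB = 10.0      # data generated in one run
RUN_HOURS = 5.0          # duration of that run
SAMPLES_PER_RUN = 24     # samples multiplexed per run
STEP_HOURS = {
    "gDNA extraction and multiplex long PCR": 9,
    "library construction": 3,
    "sequencing and data analysis": 8,
}


def turnaround_hours() -> int:
    """Total turnaround time as the sum of the reported steps."""
    return sum(STEP_HOURS.values())


def per_sample_yield_gb() -> float:
    """Average data per sample in the reported run."""
    return RUN_YIELD_GB / SAMPLES_PER_RUN


def sequencing_hours(n_samples: int, gb_per_sample: float) -> float:
    """Sequencing time needed under the linear-yield assumption."""
    rate_gb_per_hour = RUN_YIELD_GB / RUN_HOURS
    return n_samples * gb_per_sample / rate_gb_per_hour


if __name__ == "__main__":
    print("turnaround:", turnaround_hours(), "h")
    print("per-sample yield:", round(per_sample_yield_gb(), 3), "Gb")
    # Doubling the samples at the same per-sample yield doubles the run time.
    print("48 samples:", round(sequencing_hours(48, per_sample_yield_gb()), 1), "h")
```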

Library preparation is the most critical prerequisite for NGS and TGS, providing the templates for subsequent sequencing (29). Because TGS relies on long-read sequencing, library preparation requires the production of long templates. Long PCR has been described as a practical and efficient strategy to prepare long templates for PacBio- and ONT-based TGS assays (22, 30). In general, a TGS method covers several genes and alleles. Therefore, designing a multiplex long PCR that efficiently amplifies long templates is the most essential issue in TGS method development. In this study in particular, the high heterogeneity of hemoglobinopathy variants and the GC-rich, complex structure of the globin gene sequences made it more challenging to design amplicons for all target variants in a single-tube multiplex long-PCR system. Because of the longer amplicons and GC-rich sequences, the reaction system needed to be comprehensively optimized to ensure efficient amplification of all amplicons and to overcome amplification bias among them. Additionally, because shorter DNA molecules pass through the biological pores more easily and require less time, the amplicons must be similar in length to avoid sequencing read bias in the data (i.e., shorter templates have more reads whereas longer templates have fewer reads).

In principle, SNVs and SVs within target genes can be detected with full gene sequencing. Accordingly, 3 long amplicons were specifically designed as whole-gene templates for HBA2/1, HBD/HBB, and HBG2/1 (Fig. 1, B), ranging from 8.3 kb to 10.3 kb (Fig. 1, C). In NGS or gap-PCR methods, the detection of deletions is usually based on short amplicons of specific truncated alleles. However, long amplicons are required for the detection of deletions by the TGS method, in order to diminish read data bias and balance amplification efficiencies by matching the lengths of the SNV and SV amplicons (Fig. 1).

Because ONT TGS involves long-read single-molecule sequencing, base calling can be achieved without assembly. Moreover, the library sequencing templates are specifically designed PCR amplicons for the target genes or alleles. Therefore, bioinformatics analysis for ONT TGS is relatively simple and involves alignment of complete single-molecule reads to the reference sequences of target genes or alleles, after filtering out the unqualified and fragmentary reads. In practice, there was a certain error rate in ONT sequencing reads at the SNV level. However, the base error at an SNV position was low and did not significantly alter the normal/variant allele ratio, so that the genotype could be accurately identified, as shown in Fig. 3. Generally, the base errors do not affect the analysis of SVs, as shown in Fig. 2 (the purple points within each line of reads are base errors).
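The point about allele ratios can be illustrated with a toy genotype call from the bases observed at one position. This is only a sketch of the reasoning and is not the authors' caller (they used DeepVariant and IGV inspection). The heterozygous range and the homozygous cutoff are hypothetical thresholds chosen for illustration.

```python
from collections import Counter
from typing import Iterable, Tuple


def call_snv_genotype(bases: Iterable[str],
                      ref: str,
                      alt: str,
                      het_range: Tuple[float, float] = (0.3, 0.7),
                      hom_min: float = 0.85) -> str:
    """Classify a site from the variant allele fraction among ref/alt reads.

    Bases that are neither ref nor alt are treated as sequencing errors and
    ignored, which is why a low random error rate leaves the ratio intact.
    """
    counts = Counter(bases)
    informative = counts[ref] + counts[alt]
    if informative == 0:
        return "no-call"
    vaf = counts[alt] / informative
    if vaf >= hom_min:
        return "hom-alt"
    if het_range[0] <= vaf <= het_range[1]:
        return "het"
    if vaf <= 1 - hom_min:
        return "hom-ref"
    return "ambiguous"


if __name__ == "__main__":
    # Simulated pileup: a heterozygous A>T site with 5% erroneous bases.
    pileup = "A" * 480 + "T" * 470 + "G" * 30 + "C" * 20
    print(call_snv_genotype(pileup, ref="A", alt="T"))
```

At depths in the thousands, as reported here, a few percent of erroneous bases shifts the allele fraction only slightly, so a heterozygous site stays well inside the heterozygous range.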

In summary, this novel ONT TGS method is versatile and requires only simple bioinformatics analysis, providing a comprehensive and efficient approach for the analysis of hemoglobinopathy variants. Using this method, all SNVs/SVs within the HBA2/1, HBG2/1, HBB, and HBD sequences can be detected based on the whole-gene templates. Meanwhile, target deletions and special SVs can also be identified according to the presence of specific allelic templates. Moreover, the strategy of multiplex long PCR for library preparation provides a practical reference for TGS assay development, especially for sequences with high GC content and genetic disorders with complex variants.

Supplemental Material

Supplemental material is available at Clinical Chemistry online.

Nonstandard Abbreviations

TGS, third-generation sequencing; PacBio, Pacific Biosciences; ONT, Oxford Nanopore Technology; SNV, single nucleotide variant; SV, structural variation; NGS, next-generation sequencing; gDNA, genomic DNA; IGV, Integrative Genomics Viewer.

Human Genes

HBA, hemoglobin subunit alpha; HBA1, hemoglobin subunit alpha 1; HBA2, hemoglobin subunit alpha 2; HBB, hemoglobin subunit beta; HBD, hemoglobin subunit delta; HBG, hemoglobin subunit gamma.

Author Contributions

The corresponding author takes full responsibility that all authors on this publication have met the following required criteria of eligibility for authorship: (a) significant contributions to the conception and design, acquisition of data, or analysis and interpretation of data; (b) drafting or revising the article for intellectual content; (c) final approval of the published article; and (d) agreement to be accountable for all aspects of the article thus ensuring that questions related to the accuracy or integrity of any part of the article are appropriately investigated and resolved. Nobody who qualifies for authorship has been omitted from the list.

Weilun Huang (Methodology-Lead, Writing—original draft-Equal), Shoufang Qu (Conceptualization-Equal, Project administration-Equal, Validation-Equal), Qiongzhen Qin (Investigation-Equal, Validation-Equal), Xu Yang (Data curation-Equal, Validation-Equal), Wanqing Han (Formal analysis-Equal, Software-Equal), Yongli Lai (Investigation-Equal, Resources-Equal), Jiaqi Chen (Funding acquisition-Equal, Resources-Equal), Shihao Zhou (Methodology-Equal, Validation-Equal), Xuexi Yang (Methodology-Equal, Project administration-Equal, Supervision-Equal), and Wanjun Zhou (Funding acquisition-Supporting, Methodology-Lead, Supervision-Lead, Writing—review & editing-Lead)

Authors’ Disclosures or Potential Conflicts of Interest

Upon manuscript submission, all authors completed the author disclosure form. Disclosures and/or potential conflicts of interest:

Employment or Leadership

None declared.

Consultant or Advisory Role

None declared.

Stock Ownership

None declared.

Honoraria

None declared.

Research Funding

This study was supported by the National Natural Science Foundation of China (Grant No. 81972008) and the Basic and Applied Basic Research Foundation of Guangdong Province (Grant Nos. 2023A1515011098 and 2021A1515220091).

Expert Testimony

None declared.

Patents

None declared.

Role of Sponsor

The funding organizations played a direct role in the design of study, review and interpretation of data, and preparation of manuscript. The funding organizations played no role in the choice of enrolled patients or final approval of manuscript.

Author notes

Weilun Huang and Shoufang Qu contributed equally to this work.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/pages/standard-publication-reuse-rights)
