Large-Scale Low-Cost NGS Library Preparation Using a Robust Tn5 Purification and Tagmentation Protocol

Efficient preparation of high-quality sequencing libraries that well represent the biological sample is a key step for using next-generation sequencing in research. Tn5 enables fast, robust, and highly efficient processing of limited input material while scaling to the parallel processing of hundreds of samples. Here, we present a robust Tn5 transposase purification strategy based on an N-terminal His6-Sumo3 tag. We demonstrate that libraries prepared with our in-house Tn5 are of the same quality as those processed with a commercially available kit (Nextera XT), while they dramatically reduce the cost of large-scale experiments. We introduce improved purification strategies for two versions of the Tn5 enzyme. The first version carries the previously reported point mutations E54K and L372P, and stably produces libraries of constant fragment size distribution, even if the Tn5-to-input molecule ratio varies. The second Tn5 construct carries an additional point mutation (R27S) in the DNA-binding domain. This construct allows for adjustment of the fragment size distribution based on enzyme concentration during tagmentation, a feature that opens new opportunities for use of Tn5 in customized experimental designs. We demonstrate the versatility of our Tn5 enzymes in different experimental settings, including a novel single-cell polyadenylation site mapping protocol as well as ultralow input DNA sequencing.


(3 ng/µl - 30 ng/µl). Tagmentation was performed with cDNA at a concentration of approximately 150 pg/µl using the tagmentation protocol developed in this study. B) Heat map analysis of gene counts demonstrating high correlation between samples processed with either in-house produced Tn5 R27S,E54K,L372P or the Nextera XT DNA library preparation kit. Importantly, we detect high correlations between samples processed with either of these enzymes (r > 0.975, see color code on the right side). C) Reproducibility of gene expression analysis. The fraction of genes detected in replicate libraries at a given RPKM (left) and the coefficient of variation across replicate libraries at a given RPKM (right) are shown. Genes were binned by RPKM and average values for each bin are shown. D) Read coverage over a metagene (transcription unit) when using our homemade Tn5 for tagmentation. We generated a metagene of 100 bins in which "0" reflects the transcription start site and "100" the termination site. E) Heat scatter plot showing the correlation of read counts between NGS libraries processed with a fresh Tn5 batch or a Tn5 batch stored at -20°C for 14 months. Tagmentation was performed on the same cDNA at a concentration of 150 pg/µl.

Mass Spectrometry
All chemical stock solutions used in this protocol were prepared in 100 mM ammonium bicarbonate buffer pH 8.5 unless stated differently.

Sample preparation
The elution fractions from the SEC runs were analyzed via SDS-PAGE. For mass spectrometry analysis, the bands of interest were excised from the SDS-PAGE gel with a clean scalpel. Gel pieces were further cut into 1 mm cubes prior to in-gel digestion. First, the gel pieces were washed with water, then shrunk with acetonitrile for 30 minutes at 56°C prior to reduction using 10 mM DTT. The gel pieces were dehydrated with acetonitrile, followed by alkylation with 55 mM iodoacetamide for 20 minutes in the dark at room temperature. After another dehydration with acetonitrile, the gel pieces were incubated on ice with 1 ng/µl trypsin in 50 mM ammonium bicarbonate, followed by overnight incubation at 37°C. Peptides were extracted from the gel pieces by sonication for 15 minutes. The supernatant was removed and placed in a clean tube. A second extraction was performed as before, with a solution of 50:50 water:acetonitrile, 1% formic acid (2x the volume of the gel pieces), and the supernatant was pooled with the first extract. The pooled supernatants were dried by speed vacuum centrifugation. The samples were dissolved in 10 µl of reconstitution buffer (96:4 water:acetonitrile, 0.1% formic acid) and analyzed by LC-MS/MS.

LC-MS/MS
Peptides were separated using the nanoAcquity UPLC system (Waters) fitted with a trapping column (nanoAcquity Symmetry C18, 5 µm, 180 µm x 20 mm) and an analytical column (nanoAcquity BEH C18, 1.7 µm, 75 µm x 200 mm). The outlet of the analytical column was coupled directly to an LTQ Orbitrap Velos (Thermo Fisher Scientific) using the Proxeon nanospray source. Solvent A was water supplemented with 0.1% formic acid and solvent B was acetonitrile supplemented with 0.1% formic acid. The samples were loaded onto the trapping column with a constant flow of solvent A at 5 µl/min for 6 minutes. Peptides were eluted via the analytical column at a constant flow of 0.3 µl/min. During the elution step, the percentage of solvent B increased linearly from 3% to 10% in 5 minutes, then to 40% over a further 10 minutes. The peptides were introduced into the mass spectrometer (Orbitrap Velos Pro, Thermo) via a Pico-Tip Emitter (360 µm OD x 20 µm ID; 10 µm tip; New Objective) and a spray voltage of 2.2 kV was applied. The capillary temperature was set at 300°C. Full scan MS spectra with a mass range of 300-1700 m/z were acquired in profile mode in the FT with a resolution of 30,000. The filling time was set to a maximum of 500 ms with a limit of 10^6 ions. The most intense ions (up to 15) from the full scan MS were selected for sequencing in the LTQ. A normalized collision energy of 40% was used and the fragmentation was performed after accumulation of 3 x 10^4 ions or after a filling time of 100 ms for each precursor ion (whichever occurred first). MS/MS data was acquired in centroid mode. Only multiply charged (2+, 3+, 4+) precursor ions were selected for MS/MS. The dynamic exclusion list was restricted to 500 entries with a maximum retention period of 30 seconds and a relative mass window of 10 ppm. In order to improve the mass accuracy, a lock mass correction using a background ion (m/z 445.12003) was applied.
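The two-segment elution gradient described above can be written out as a small piecewise-linear function. This is only an illustrative sketch: the function name `percent_b` is hypothetical, time zero is assumed to be the start of the elution step, and any wash or re-equilibration phase after 15 minutes is not described in the text and is therefore not modeled.

```python
def percent_b(t_min: float) -> float:
    """Return the percentage of solvent B at t_min minutes into the elution step.

    Segment 1: 3 % -> 10 % B, linear, over the first 5 minutes.
    Segment 2: 10 % -> 40 % B, linear, over the following 10 minutes.
    """
    if t_min < 0:
        raise ValueError("time must be non-negative")
    if t_min <= 5.0:
        return 3.0 + (10.0 - 3.0) * t_min / 5.0
    if t_min <= 15.0:
        return 10.0 + (40.0 - 10.0) * (t_min - 5.0) / 10.0
    # The text does not describe the gradient beyond 15 minutes.
    raise ValueError("gradient is only described up to 15 minutes")


if __name__ == "__main__":
    for t in (0, 2.5, 5, 10, 15):
        print(f"{t:>4} min: {percent_b(t):.1f} % B")
```

Writing the gradient this way makes it easy to check, for example, that the slope changes from 1.4 % B/min in the first segment to 3 % B/min in the second.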

Data analysis
Acquired data was processed by IsobarQuant (Franken et al., 2015) and Mascot (v2.2.07) and searched against a UniProt E. coli proteome database (UP000000625) containing common contaminants, reversed sequences and the sequences of the modified proteins. The data was searched with the following modifications: Carbamidomethyl (C) (fixed modification), Acetyl (N-term) and Oxidation (M) (variable modifications). The mass error tolerance was set to 10 ppm for the full scan MS spectra and to 0.02 Da for the MS/MS spectra. A maximum of two missed cleavages was allowed. For protein identification, a minimum of two unique peptides with a peptide length of at least seven amino acids and a false discovery rate below 0.01 were required on the peptide and protein level.

Franken et al. (2015). Thermal proteome profiling for unbiased identification of direct and indirect drug targets using multiplexed quantitative mass spectrometry. Nat Protoc, 10, 1567-1593.
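The search tolerance and the protein identification criteria stated in the data analysis can be expressed as two simple checks. The sketch below is not part of IsobarQuant or Mascot; the class `ProteinHit` and the functions `within_ppm` and `passes_identification` are hypothetical names introduced for illustration. It also assumes a single FDR value per protein, with peptide-level FDR filtering already applied upstream, which simplifies the two-level criterion in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProteinHit:
    """Hypothetical container for one protein-level search result."""
    accession: str
    unique_peptides: List[str]  # sequences of peptides unique to this protein
    fdr: float                  # assumed protein-level false discovery rate


def within_ppm(observed_mz: float, theoretical_mz: float,
               tol_ppm: float = 10.0) -> bool:
    """True if the precursor mass error is within the ppm tolerance."""
    error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return error_ppm <= tol_ppm


def passes_identification(hit: ProteinHit, min_peptides: int = 2,
                          min_length: int = 7, max_fdr: float = 0.01) -> bool:
    """Apply the stated criteria: >= 2 unique peptides of >= 7 residues, FDR < 0.01."""
    long_enough = {p for p in hit.unique_peptides if len(p) >= min_length}
    return len(long_enough) >= min_peptides and hit.fdr < max_fdr


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    hit = ProteinHit("EXAMPLE", ["PEPTIDEK", "ANOTHERPEPK"], fdr=0.005)
    print(passes_identification(hit))
    print(within_ppm(445.12403, 445.12003))
```

At m/z 445.12003, a 10 ppm window corresponds to roughly ±0.0045 m/z, which gives a sense of scale for the full scan tolerance compared with the 0.02 Da used for MS/MS spectra.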

Expression and purification of Tn5 (R27S),E54K,L372P
Step-by-step protocol for the expression and purification of the Tn5 (R27S),E54K,L372P transposases as well as for the loading of the enzyme and tagmentation-based NGS library preparation.

Purification of Tn5 (R27S),E54K,L372P
A. Resuspend the cell pellet from 1 liter culture in 50 ml running buffer supplemented with cOmplete protease inhibitors.
B. Lyse the cells via sonication. We use a Branson sonicator with a 10 mm tip and 4-5 cycles of 30 sec at a 50% duty cycle and output 5-6, with intermittent cooling.
D. Add 6 ml of 10% PEI pH 7.2 dropwise to the cleared lysate while constantly stirring the solution.

* PEI removes nucleic acids from the lysate and is required to avoid contamination with E. coli DNA (bound to the Tn5 protein).
F. Equilibrate a 5 ml prepacked cOmplete His-Tag purification column with running buffer at a flow rate of 1 ml/min.
G. Load the cleared lysate onto the equilibrated cOmplete His-Tag purification column at a flow rate of 0.7 ml/min.

* It is important to use the cOmplete His-Tag purification column as it is resistant to DTT and EDTA. Nickel gets stripped off the beads when using a Ni-NTA column.
H. After loading the sample, wash the cOmplete His-Tag purification column with running buffer at a flow rate of 1 ml/min until the UV signal at 280 nm returns to baseline.
T. Aliquot the samples and store them in 25 mM Tris pH 7.5, 800 mM NaCl, 0.1 mM EDTA, 1 mM DTT and 50% glycerol at -20°C. If the samples need to be shipped, they can also be flash-frozen in liquid nitrogen in the SEC buffer and stored at -80°C.
All chromatography steps were performed at 4°C on an Äkta Purifier 10 or on an Äkta Pure 25 chromatography system (both from GE Healthcare). To preserve the activity of the protein, it is important to keep the sample cold throughout the entire purification process.

Tn5 (R27S),E54K,L372P loading and tagmentation-based NGS library preparation
This protocol describes the workflow of the Tn5 loading and tagmentation-based library preparation for dual indexing i5/i7 NGS. Details specific to the 3'RNA-Seq protocol are provided as additional information and marked in green. All oligonucleotides were ordered at HPLC grade from Sigma Aldrich, Germany.

Reagents and buffers needed

Annealing of the linker oligonucleotides Tn5ME-A/Tn5MErev and Tn5ME-B/Tn5MErev
A. Resuspend lyophilized oligonucleotides in annealing buffer (50 mM NaCl, 40 mM Tris-HCl pH 8.0) to a stock concentration of 100 µM. Mix one volume of Tn5ME-A or Tn5ME-B with one volume of Tn5MErev to obtain the working stock (50 µM of each oligonucleotide). Distribute the mix in 10-20 µl aliquots for storage at -20°C.
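The 50 µM working concentration follows directly from mixing two 100 µM stocks in equal volumes. The short sketch below shows this dilution arithmetic; the function name `mixed_concentration` and the example volumes are hypothetical and chosen only for illustration.

```python
def mixed_concentration(stock_um: float, vol_stock_ul: float,
                        vol_other_ul: float) -> float:
    """Concentration (µM) of an oligonucleotide after mixing its stock
    with another solution that does not contain it."""
    if vol_stock_ul <= 0 or vol_other_ul < 0:
        raise ValueError("volumes must be positive")
    return stock_um * vol_stock_ul / (vol_stock_ul + vol_other_ul)


if __name__ == "__main__":
    # One volume of 100 µM Tn5ME-A plus one volume of 100 µM Tn5MErev
    # (example volumes of 50 µl each are an assumption).
    print(mixed_concentration(100.0, 50.0, 50.0), "µM of each oligonucleotide")
```

Because both strands end up at the same concentration, the annealed duplex is also at most 50 µM, assuming complete annealing.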