Benchmarking ultra-high molecular weight DNA preservation methods for long-read and long-range sequencing

Abstract
Background. Studies in vertebrate genomics require sampling from a broad range of tissue types, taxa, and localities. Recent advancements in long-read and long-range genome sequencing have made it possible to produce high-quality chromosome-level genome assemblies for almost any organism. However, adequate tissue preservation for the requisite ultra-high molecular weight DNA (uHMW DNA) remains a major challenge. Here we present a comparative study of preservation methods for field and laboratory tissue sampling, across vertebrate classes and different tissue types.
Results. We find that storage temperature was the strongest predictor of uHMW fragment lengths. While immediate flash-freezing remains the sample preservation gold standard, samples preserved in 95% EtOH or 20–25% DMSO-EDTA showed little fragment length degradation when stored at 4°C for 6 hours. Samples in 95% EtOH or 20–25% DMSO-EDTA kept at 4°C for 1 week after dissection still yielded adequate amounts of uHMW DNA for most applications. Tissue type was a significant predictor of total DNA yield but not fragment length. Preservation solution had a smaller but significant influence on both fragment length and DNA yield.
Conclusion. We provide sample preservation guidelines that ensure sufficient DNA integrity and amount required for use with long-read and long-range sequencing technologies across vertebrates. Our best practices generated the uHMW DNA needed for the high-quality reference genomes for phase 1 of the Vertebrate Genomes Project, whose ultimate mission is to generate chromosome-level reference genome assemblies of all ∼70,000 extant vertebrate species.


Introduction
The past two decades have seen genome sequencing become increasingly easy and affordable, driven by advancements in sequencing and computing technologies. Growing accessibility spurred the formation of large-scale consortia, such as the Genome 10K project (G10K), with the goal of generating genome assemblies for many species to enable new scientific discoveries and aid in conservation efforts [1]. However, initial efforts used short read sequencing (< 200 bp), such as Illumina technology, which were later found to often result in genome assemblies that were highly fragmented, incomplete, and plagued with structural inaccuracies [1][2][3]. Subsequently, G10K initiated the Vertebrate Genomes Project (VGP), with the mission of producing high-quality, near-complete, and error-free genome assemblies of all ~70,000 extant vertebrate species [4].
By comparing sequencing data types and assembly algorithms, the VGP consortium determined that it was not possible to obtain high-quality reference assemblies at the chromosomal level without the use of long reads (e.g. > 10 kb), such as Pacific Biosciences; long-range molecules (e.g. > 50 kb), such as 10X Genomics linked reads; optical mapping (> 150 kb), such as with Bionano Genomics; and Hi-C proximity ligation (> 1 Mb), such as with Arima Genomics, all of which can span repeats thousands of base pairs in size [4]. To take full advantage of these new sequencing and assembly methods, molecules of DNA need to be as long as possible.
While long-read and long-range (LR) data simplify and accelerate the assembly, they come with a major challenge: they require large amounts of very high-quality DNA. For short-read technologies, many nucleic acid isolation methods developed over the years, including the standard phenol-chloroform method [5], had been sufficient. LR technologies require relatively pure DNA in the 10 kb to 300 kb range. Additionally, the Hi-C method requires physical crosslinking of contacting DNA regions within the same chromosomes, thus requiring cell nuclei to be intact before processing and isolation of cross-linked DNA [4]. With Hi-C, 3D interactions within chromosomes serve to assemble contigs or short scaffolds into chromosomal-scale scaffolds. For LR technologies, only a few extraction methods are currently able to produce high molecular weight (HMW) DNA ranging from 45 to 150 kb or ultra-high molecular weight (uHMW) DNA, which is over 150 kb long. These include bead-based (MagAttract HMW DNA Kit, Qiagen), high-salt [6], and agarose plug methods (Bionano Prep Soft/Fibrous Tissue Protocol, Bionano Genomics) [7]. More recently, a less laborious thermoplastic magnetic disks (Nanobinds) method was developed by Circulomics [8]. Regardless of their capabilities, the performance of HMW and uHMW DNA extraction methods depends primarily on the type of sample and how it was collected, handled, and preserved.
The long-held "gold standard" in tissue preservation for high-quality DNA isolation has been flash-freezing tissues in liquid nitrogen directly after collection, followed by ultra-cold -80°C long-term storage [9][10][11][12][13][14]. While liquid nitrogen is readily available in most laboratory setups, its limited availability in many fieldwork conditions can be an insurmountable hurdle. Indeed, a large portion of global biodiversity is located far from labs, and sampling such species will require long expeditions under rustic field conditions. Thus, transporting sufficient amounts of liquid nitrogen from the point of collection to the laboratory is often infeasible and the applicability of flash-freezing outside the lab environment is greatly limited [10,13,15]. Additional considerations specific to the studied species exacerbate the challenge of sample collection and preservation. DNA degradation is promoted by enzymes whose concentrations are likely to be tissue-specific and possibly species-specific. Small organisms provide little tissue, and preferred tissue types may be unavailable. Permitting restrictions also vary widely among species and among countries.
Yet, methods for field sampling in non-model species for the purposes of LR sequencing remain anecdotal or unsubstantiated, as failed attempts are not published and very few preservation experiments have measured fragment sizes relevant to LR technologies [16,17]. Thus, methods that bridge the gaps between uHMW DNA, the lab, and field conditions still require benchmarking.
Here, we perform a series of benchmarking experiments to assess sample preservation methods under laboratory and simulated field conditions and compare the quality of uHMW DNA obtained.
Specifically, we extract uHMW DNA from multiple tissue types of representative vertebrate species, which were collected under various preservation and temperature conditions. For each experimental sample, we evaluate the fragment length, yield, and purity of the uHMW DNA extracted. Based on our findings, we propose a new set of guidelines for tissue preservation, ranging from best to minimally adequate practices for acquiring uHMW DNA from both laboratory and field collected samples, necessary for producing high-quality reference genome assemblies.

Results
In this study, we used the agarose plug method optimized by Bionano Genomics [7] across all species and preservation methods, albeit with small protocol variations for fibrous tissues, soft tissues, and blood. We tested six preservation methods (Fig. 1): 1) flash frozen in liquid nitrogen, which served as the 'gold standard' and our point of reference; 2) 95% ethanol (EtOH), a long-preferred method of field preservation of tissues [10,15,18]; 3) 20-25% dimethyl sulfoxide (DMSO) buffer (see Methods), which has been shown to be very effective at permeating tissues and preserving HMW DNA after long-term storage at ambient temperature [19,20]; 4) RNAlater Stabilization Solution (RNAlater; Invitrogen, Waltham, MA, USA), a commonly used preservative that also facilitates transcriptomics; 5) DNAgard tissue and cells (DNAgard; Biomatrica, San Diego, CA, USA), a commercial preservative designed for stabilizing DNA in tissues at room temperature; and 6) Allprotect Tissue Reagent (Allprotect; Qiagen, Hilden, Germany), another commercial preservative targeting stable room-temperature tissue preservation. We exposed preserved samples to different temperatures (4°C, room temperature, and 37°C) for various durations of time (6 hr to 5 months). We did so with up to 6 tissue types (muscle, blood, ovary, spleen, isolated red blood cells (RBCs), and whole-body) from 6 species representing five vertebrate lineages (a mammal, a bird, two turtles, an amphibian, and a bony fish; Fig. 1), for a total of 140 samples (Table S1). We assessed the fragment length distribution and DNA yield for each DNA sample. Statistical analyses were performed using linear models that included type of preservative, temperature/time treatment, vertebrate group, and tissue type as variables.
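To make the kind of model named above concrete, the following minimal sketch fits an ordinary least-squares linear model with dummy-coded categorical predictors. It is an illustration only: the records, the response values, and the names `records`, `design_row`, `solve`, and `fit_linear_model` are hypothetical and are not data or code from this study, and only two of the four predictors are included for brevity.

```python
# Hypothetical toy records: (preservative, tissue, response).
# The numbers are invented purely to exercise the code; they are NOT
# measurements from this study.
records = [
    ("flash_frozen", "blood", 5.1), ("flash_frozen", "muscle", 3.0),
    ("EtOH", "blood", 4.9), ("EtOH", "muscle", 2.8),
    ("DMSO", "blood", 4.6), ("DMSO", "muscle", 2.7),
    ("DNAgard", "blood", 4.0), ("DNAgard", "muscle", 2.1),
]

def design_row(preservative, tissue, p_levels, t_levels):
    """Intercept plus 0/1 dummy columns; reference levels are omitted."""
    return ([1.0]
            + [1.0 if preservative == lv else 0.0 for lv in p_levels]
            + [1.0 if tissue == lv else 0.0 for lv in t_levels])

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_linear_model(records, p_ref="flash_frozen", t_ref="blood"):
    """Least-squares fit via the normal equations (X'X) beta = X'y."""
    p_levels = sorted({r[0] for r in records} - {p_ref})
    t_levels = sorted({r[1] for r in records} - {t_ref})
    X = [design_row(r[0], r[1], p_levels, t_levels) for r in records]
    y = [r[2] for r in records]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    names = ["intercept"] + p_levels + t_levels
    return dict(zip(names, solve(xtx, xty)))

coefficients = fit_linear_model(records)
for name, value in coefficients.items():
    print(f"{name}: {value:+.3f}")
```

Each coefficient is read relative to the reference levels (here flash-frozen blood, chosen arbitrarily for the sketch). A real analysis would add the temperature/time treatment and vertebrate group terms and would normally use a statistics package that also reports standard errors and post-hoc contrasts.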
Fragment length distribution analysis. For extractions that yielded a detectable amount of DNA, we measured their fragment length distributions using at least one of two available techniques: Pulsed-field Gel Electrophoresis (PFGE) and the Agilent Femto Pulse system (FEMTO). PFGE was more informative for analyzing uHMW DNA molecules above 200 kb, due to greater dynamic range in molecular weight separation (Fig. S1a), whereas FEMTO was more useful for separating molecules within the 50-165 kb range (Fig. S1b). Overall, the agarose plug method yielded high-quality DNA concentrated in the 300-400 kb range (Fig. 2).
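The figure legend for the PFGE traces describes scaling molecule length piecewise-linearly against a common size standard (Lambda PFG Ladder). A minimal sketch of that idea follows, assuming a lane trace given as (migration position, intensity) pairs. The pixel positions, the trace, and the function names `px_to_kb` and `fraction_above` are hypothetical, and the 145 kb cut-off is borrowed from the uHMW threshold quoted in the Results; the authors' actual image-processing steps are not specified here.

```python
from bisect import bisect_right

# Sizes (kb) of the first few Lambda PFG Ladder bands (multiples of 48.5 kb).
LADDER_KB = [48.5, 97.0, 145.5, 194.0, 242.5]
# Hypothetical migration positions (pixels from the well) of those bands in one gel.
# Larger fragments migrate less, so position decreases as size increases.
ladder_px = [400.0, 330.0, 270.0, 220.0, 180.0]

def px_to_kb(px, ladder_px, ladder_kb):
    """Piecewise-linear interpolation of fragment size from migration position.

    Positions outside the ladder range are clamped to the nearest ladder band.
    """
    pairs = sorted(zip(ladder_px, ladder_kb))  # ascending pixel position
    xs = [p for p, _ in pairs]
    ys = [k for _, k in pairs]
    if px <= xs[0]:
        return ys[0]
    if px >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, px)
    frac = (px - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + frac * (ys[i] - ys[i - 1])

def fraction_above(trace, ladder_px, ladder_kb, threshold_kb):
    """Share of lane signal at or above a size threshold.

    trace is a list of (pixel_position, intensity) pairs for one lane.
    """
    total = sum(signal for _, signal in trace)
    above = sum(signal for px, signal in trace
                if px_to_kb(px, ladder_px, ladder_kb) >= threshold_kb)
    return above / total

# Hypothetical lane trace, invented for illustration only.
trace = [(200.0, 5.0), (250.0, 3.0), (300.0, 1.0), (380.0, 1.0)]
print(px_to_kb(300.0, ladder_px, LADDER_KB))
print(fraction_above(trace, ladder_px, LADDER_KB, 145.0))
```

Interpolating between neighbouring ladder bands, rather than fitting one global curve, lets lanes from gels of different lengths be placed on a shared size axis as long as each gel carries the same ladder.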
Flash-freezing and EtOH performed better than the other preservation methods in PFGE and, although the difference was not statistically significant, they had the lowest standard deviation (Fig. 3b). Based on PFGE, EtOH was slightly better than DMSO (Fig. 3b). Based on FEMTO, DMSO was slightly better than EtOH (Fig. S4b). Neither relationship showed significant differences in preservation.
Tissue type. Tissue type did not have a significant effect on fragment length overall (Figs. 3c and S4c, Table S2). However, muscle showed more variability than blood samples in uHMW DNA yield (> 145 kb). The RBCs samples showed the smallest proportion of degradation, while some muscle samples showed the highest degradation (Fig. 3c). In terms of variation between species, the mouse and fish samples showed a higher degree of degradation with respect to temperature treatment than the other species (Figs. 2, S2, and S3). It is unclear if this can be explained by a species-specific temperature sensitivity, or if it is caused by technical variation.
Interactions among variables. In terms of qualitatively assessing combinations of variables, storage in EtOH appeared to perform best at preserving uHMW DNA for all 4°C refrigerated samples (Fig. 2). Notably, nucleated blood samples refrigerated with no added preservatives were stable for up to one week with no substantial signs of degradation (Fig. 2). An increased proportion of smaller DNA fragments was evident in refrigerated samples preserved using DNAgard, with the exception of turtle RBCs and muscle samples for which DNAgard results were equivalent to other preservation methods (Fig. 2). Fish body samples stored for 16 hr at 4°C showed notable degradation, but mouse spleen samples under the same treatment did not vary substantially from samples stored at 4°C for 6 hr (Fig. 2). Replicate sea turtle RBCs samples showed less variation within treatments for fragment size than for DNA yield (Fig. S5a,b).

Mouse muscle, fish muscle, and fish ovary samples showed considerable accumulation of smaller fragment sizes after one week at room temperature, whereas blood or muscle samples from other species did not show as dramatic an impact (Figs. 2, S2, and S3). However, fish muscle and ovary samples stored at room temperature for just one day still retained high proportions of uHMW DNA with marginal degradation (Fig. S2). For mouse muscle, DMSO, EtOH, or DNAgard did not seem to provide any added DNA protection against room temperature conditions (Figs. [...] week without any preservative (sea turtle RBCs and frog blood) were quite stable and yielded an appreciable fraction of uHMW DNA (Fig. 2). Additionally, sea turtle RBCs samples, when preserved with EtOH or even DNAgard and stored at room temperature for 5 months, yielded a large fraction of workable uHMW DNA (Fig. 2). This suggested that turtle RBCs may be viable for longer durations at room temperature.
Additional replicates and further experimentation will be necessary to determine if the isolated RBCs tissue type or some biological difference in turtles is the key to this stability.

[...] (Table S2). Specifically, whole blood tended to generate the highest DNA yields, followed by spleen, RBCs, whole-body, and ovary, while muscle generated relatively lower yield (Fig. 3f). In post-hoc tests, whole blood, RBCs, and ovary significantly outperformed muscle (vs. whole blood: t = 11.75, p < 0.001; vs. RBCs: t = 8.36, p < 0.001; vs. ovary: t = 3.28, p = 0.01), while the differences between muscle and whole body or spleen were not significant. Whole blood and RBCs also showed significantly higher yields than ovary samples (vs. whole blood: t = 3.89, p = 0.002; vs. RBCs: t = 3.36, p = 0.01). Post-hoc comparisons of different temperature treatments or preservation reagents were not significant, possibly due to the higher variance influenced by the other variables of tissue type and species (Fig. 3d-f). Birds tended to have slightly better yields, with a marginally significant effect over non-avian reptiles (t = 3.04, p = 0.02).
Hi-C sequencing. The VGP is currently using Hi-C reads as a standard tool to generate chromosomal scale assemblies [4,21], as well as to phase haplotypes in some cases [22]. These chromosome interactions are captured in situ in the tissue before DNA is isolated and sequencing libraries made. To enable appropriate collection recommendations for use in this technology, we also explore the effect of tissue preservation on the quality of the Hi-C library preparation. Using a single species (zebra finch) we test a subset of tissue preservation methods (flash-frozen, 6 hr at 4°C, one week at room temperature) and tissue types (muscle, blood), with two replicates per treatment combination. These were processed to generate in situ Hi-C chromatin interactions maps against the VGP male reference genome [23,24].
We found that blood samples flash-frozen in EtOH yielded results similar to those of our flash-frozen positive control with no added preservative: 75-80% of all read-pairs were derived from cis interactions within the same chromosomes (Fig. 4a), and among them ~55-60% were derived from long-range (>15 kb) cis interactions. This indicates a high degree of useful long-range intrachromosomal signal necessary for genome assembly. However, storage of blood in DNAgard resulted in the elimination of almost all cis interactions, down to ~10% total, across temperature treatments (Fig. 4a-c), indicating largely random ligations and the loss of useful signal. Blood refrigerated for 6 hr maintained a high yield of long cis interactions, both when stored in EtOH and with no preservative. Blood samples stored for one week at room temperature in EtOH also yielded mostly long cis interactions similar to the flash-frozen treatments.
Overall, muscle and blood samples performed similarly across all treatments measured using Hi-C reads. They both yielded large amounts of long cis interactions (>15 kb) when flash-frozen or refrigerated at 4°C with no preservative or with EtOH (Fig. 4a-b, d-e). Muscle and blood samples also responded similarly to preservative treatments, with EtOH samples performing well across treatments and DNAgard samples underperforming across treatments (Fig. 4).

Discussion
During development of the assembly pipeline for the first set of VGP genomes [4], we tested various HMW and uHMW DNA extraction protocols compatible with several LR technologies, including the Qiagen MagAttract HMW DNA, the phenol-chloroform method [5], and the agarose plug protocol. The agarose plug method optimized by Bionano Genomics [7] was the most consistent method for producing a high yield of uHMW DNA suitable across all the LR technologies in the VGP pipeline. This method used agarose as a protective matrix to minimize DNA shearing during the extraction process and had long been shown to be an effective method for isolating megabase-size DNA from organisms including plants, animals, algae, and microbes [7]. In this study, we use only the agarose plug DNA extraction method.
Our study explored the effects of three variables (preservation method, tissue type, and storage temperature) in preserving the high-quality DNA required for generating chromosome-scale genome assemblies in six species representing five major vertebrate lineages. The results identified promising alternatives to the standard flash-freezing method, which is not easily performed in the field, particularly the preservation of samples in 95% ethanol (EtOH) or 20-25% DMSO-EDTA (DMSO) at 4°C. We did not test all possible combinations of variables, which would require over 252 tests per species, but focused instead on the salient combinations of tissue types, reagents, and protocols that reflect real-world applications. There are also likely intervening stages of exposure to different temperatures, such as immediately post-mortem, that may have a considerable effect in hotter climates and are not simulated here. Despite these limitations, our results are consistent with samples from the over 136 species we have processed for the VGP to date (NCBI Bioproject PRJNA489243 as of July 13, 2021). We believe that the results presented here can inform the many logistical decisions of field researchers collecting samples from wild populations (Fig. 5).
Temperature exposure was the strongest predictor of fragment length distribution for these data.
The potential of increased temperatures to destabilize DNA is well known, and samples exposed to higher temperatures for a longer period will allow for enzymatic activity that degrades DNA [25].
However, under certain conditions some samples stored at 4°C or even at room temperature show surprising viability. For example, samples preserved in EtOH and refrigerated for up to one week were nearly as good as flash-frozen samples. This is evident through high proportions of uHMW DNA molecules, though with some signs of degradation and variability across species and tissue types.
The ambient temperature of the intended collecting locality should be a major consideration in planning field collections for high-quality samples. Here we test a limited number of samples at 37°C to resemble fieldwork conditions in warmer climates, resulting in no retention of workable amounts of uHMW DNA in any of these samples. Thus, in hotter climates sample cooling or exploring alternative preservatives is critical. Options such as insulated boxes, ice packs, wet ice, dry ice, and electronic coolers should be considered for maintaining samples at low temperatures in the field. To minimize the time before storing in ultra-cold freezers, investigators might also choose to ship samples from the field to the lab before the conclusion of fieldwork. Further experimentation in conditions resembling warmer climates can more precisely define tolerable exposure intervals for sampling targeting uHMW DNA.
The "gold standard" for preserving samples for uHMW DNA extraction remains flash-freezing in liquid nitrogen before ultra-cold storage [9][10][11][12][13][14]. Our results highlight alternative preservation methods that are more readily available in the field. Liquid nitrogen can be challenging to acquire, contain, and transport in many fieldwork settings. Fortunately, samples preserved in EtOH or DMSO perform well with simple refrigeration, although a small portion of DMSO samples failed (near-zero DNA extracted) for unclear reasons. In addition, these solutions consistently outperform the commercial preservatives RNAlater and DNAgard. Further, DNAgard is not suitable for maintaining long interaction distances for Hi-C library preparation. While these commercial reagents rely on mechanisms that were likely optimized for preserving lower molecular weight nucleic acids, they appear to be harmful to uHMW DNA and chromosomal 3D interactions. Preservatives that promote cell lysis may undermine the stability of DNA if they cannot adequately counter the increased exposure to sources of chemical degradation [14,25,26].
Of the three commercial reagents tested, Allprotect shows the most promising results for preserving uHMW DNA, but more testing is necessary to better evaluate its performance relative to other preservatives and assess its compatibility with LR technologies.
In addition to popular commercial reagents, we evaluate some of the more commonly applied preservation methods today. EtOH has long been used for preserving samples for DNA analysis, and its proficiency at stabilizing specimens continues to be validated [12,18,27,28]. For example, Mulcahy et al. (2016) studied preservative effects on DNA integrity in white perch and blue crab muscle samples, with a maximum DNA size resolution of only 45 kb. Nevertheless, their finding that EtOH generally performs well as a DNA preservative agent is consistent with our results at this DNA size range. While EtOH is a compelling option, it comes with its own logistical considerations. EtOH can be problematic to transport on commercial flights or trains, or to ship in large quantities. Alternatively, DMSO benefits from fewer transport restrictions, but requires laboratory preparation prior to fieldwork and can be hazardous to handle. Commercial preservation reagents are usually more costly than EtOH or DMSO solutions, but are also subject to less restrictive transport regulations.
The negative impact of DNAgard on Hi-C long-distance cis interactions is striking. This solution likely permeates the cell to inhibit nuclease activity, potentially affecting other protein integrity and impeding cross-linking. The increased fraction of inter-chromosomal interactions and decreased fraction of cis-interactions (> 15 kb) together are evidence of DNA degradation. These interchromosomal interactions are counter-productive noise with regard to chromosome-level scaffolding in that they erroneously provide scaffolding links between contigs derived from two different chromosomes. Our Hi-C data analysis also indicates, at least for birds, that EtOH storage of blood at 4°C or room temperature for one week or less tends to yield high-quality Hi-C chromatin interaction maps. Excluding samples in DNAgard, blood seems to be slightly more resistant to reducing chromosome interactions than muscle when stored at 4°C or room temperature for one week, which would be a valuable feature for field collection.
Contrary to the differences in Hi-C performance, we did not find notable differences in DNA fragment length distributions between most tissue types. The exception is whole-body fish samples that were all significantly degraded, regardless of treatment. Potentially, this could owe to the larger mass of tissue taking longer to freeze through or infuse with preservative, hence allowing more time for degradation. However, we did observe substantial differences in total DNA yield, where blood and spleen samples tend to yield a larger amount of DNA while muscle samples produce the least. The comparatively lower DNA yield makes muscle samples a less practical choice in species where nucleated blood is available. Lower yield could also be costlier and more time consuming in the long run, as more DNA extractions would be required to achieve the necessary input amount. For species without nucleated blood (mammals), soft tissue samples such as the spleen outperform muscle in terms of yield. Note that low yield does not necessarily preclude muscle samples from usefulness, especially given they still perform well in terms of fragment length if appropriately collected and stored. We note that, as we demonstrated in a related study [29], blood is often not suitable for uHMW mitochondrial DNA extraction, while muscle tends to yield abundant mitochondrial DNA. This is an important consideration if the goal of collection is to sequence the mitochondrial genome.
Our study considers today's LR sequencing technologies and current DNA isolation protocols.
Time will likely continue to yield new methods for preventing, assessing, and mitigating DNA degradation. Even since the outset of this study, promising new extraction methods have become available for uHMW DNA, such as Nanobind DNA extraction (Circulomics, Baltimore, MD, USA).
Our comparisons focus on maximizing the quality of field-collected input material and we expect this to be largely independent of downstream extraction methods. Our results, together with the experience acquired with uHMW DNA and Hi-C data for the more than 136 VGP genomes produced, yield guidelines for tissue type, preservatives, temperature, and other treatments necessary for generating high-quality genome assemblies from several vertebrate lineages, for laboratory and field collected samples (Table 1).
In planning biobanking for genomic purposes, another important strategy is to avoid or reduce the need for field-preserved samples. Seeking out animals already in captive collections and salvaging material reduces the methodological difficulty of preserving samples. Delaying blood collection, biopsy, or euthanasia of wild-caught specimens can also buy researchers time to move into more amenable preservation conditions such as a field station. However, this poses ethical challenges in the care of animals being held for days or weeks, and it is not feasible for larger animals.
Few studies have explored the effects of preservation methods on uHMW DNA integrity [17], but none that we are aware of have done so in as broad a set of field-relevant conditions as in the present study. Being able to collect samples well-suited for producing high-quality genome assemblies is a major undertaking. Our recommendations will enable many new high-quality sample collections and contribute to establishing a greater and more diverse array of vertebrate genomes from around the world.  (Table S1).

Methods
For this experiment, tissue samples were collected as available at facilities already handling the target species (Fig. 1). The tissue types collected per species are as follows: mouse, spleen and muscle; zebra finch, whole blood and muscle; sea turtle, isolated red blood cells (RBCs); painted turtle, whole blood and muscle; bullfrog, whole blood and muscle; zebrafish, whole body, ovary, and muscle. For all species except the sea turtle and the fish, samples originate from a single individual. In the sea turtle set, duplicate samples were obtained from three individuals. In the fish set tissue samples in some cases originated from different individuals, as their small body size does not allow for sufficient amounts of tissue from a single specimen.
Each taxon required a slightly different handling procedure. All samples except for those from sea turtles were sourced from captive individuals humanely euthanized in a laboratory setting with approved protocols cited below. All soft or fibrous tissue samples were collected in small 20-30 mg pieces until each 2 mL tube had roughly 50-100 mg total to allow for full penetration of the preservative. Mice were euthanized by CO2 treatment in a GasDocUnit (Medres Medical Research GmbH, Cologne, Germany) following the instructions of the manufacturer (DD24.1-5131/451/8, Landesdirektion Sachsen). Skeletal muscle and spleen samples were then dissected and placed in standard cryotubes. Birds were euthanized via isoflurane overdose, and whole blood was collected into chilled sodium heparin-treated 1.5 ml microfuge tubes (IACUC #19101-H). Then 25-50 µL was immediately aliquoted into cryotubes. Sea turtle RBCs samples were collected from wild individuals undergoing medical treatment by drawing whole blood into 2 mL sodium heparin-treated collection tubes and then spinning down to separate RBCs from plasma.
RBCs were then aliquoted into sodium heparin-treated tubes. Painted turtle samples were collected from one individual euthanized via decapitation as part of another study (AUP 20012070). Painted turtle muscle samples were immediately taken from the pectoral girdle and whole blood was drawn from the heart before placement in standard cryotubes. Frog samples were sourced from one captive adult purchased from Rana Ranch in Twin Falls, Idaho, USA. The frog was euthanized using an intracoelomic injection with Euthasol™ or Fatal-Plus™ (pentobarbital and phenytoin) at a dosage of 100 mg/kg. After confirming that a deep plane of anesthesia was reached, the frog was rapidly and doubly pithed cranially and spinally, then decapitated (19085-USDA). Frog muscle tissue samples were immediately taken from the rear legs and blood was drawn from internal veins before placement in standard cryotubes. We extracted fish samples from multiple lab-raised individuals. To euthanize the fish, we used tricaine and then the brain was destroyed with a scalpel (PPL No.70/7606). We collected white muscle and ovary samples which were dissected out and placed into 2 ml cryotubes immediately after euthanasia. Fish whole-body samples were taken by removing the head, intestines, and swim bladder of individual fish and placing the remaining tissue into a cryotube.

Preservation treatments.
A total of 140 freshly collected samples were subjected to different preservation and temperature treatments to test common preservation methods under simulated field or lab conditions (Fig. 1), with flash-frozen samples being used as baseline controls.
Preservation method treatments refer to the preservative agent applied directly to the sample before ultra-cold (-80°C) storage; temperature treatments refer to the temperature exposed and the amount of time the sample remained at that temperature before ultra-cold storage.
All temperature treatments were applied immediately upon dissection of the material and placement into specimen tubes. Samples were exposed to temperature treatments of varying lengths of time in refrigeration (4°C), room temperature (20-25°C), and elevated temperature in an incubator to simulate field conditions in a tropical climate (~37°C). All temperature conditions tested and the samples to which they were applied are as follows: control condition submerged in liquid nitrogen from dissection to ultra-cold storage (all tissue types and species), 6 hr at 4°C (frog blood and muscle, bird blood and muscle, painted turtle blood and muscle, sea turtle RBCs), 16 hr at 4°C (mouse spleen, fish whole body), 1 day at 4°C (fish ovary), 1 week at 4°C (mouse muscle, frog blood and muscle, bird blood and muscle, painted turtle blood and muscle), 1 day at room temperature (fish muscle and ovary), 1 week at room temperature (mouse muscle, frog blood and muscle, bird blood and muscle, painted turtle blood and muscle, sea turtle RBCs, fish muscle and ovary), 4 weeks at room temperature (fish muscle and ovary), 5 months at room temperature (sea turtle RBCs), and 1 week at 37°C (mouse muscle). Storage time at -80°C after treatment and before DNA extraction varied slightly between samples, but such variation is [...] DMSO was tested on all species and tissue types except sea turtle RBCs. No-preservative treatments were tested on bullfrog blood, bird blood, painted turtle blood, and sea turtle RBCs.
Allprotect was tested on mouse spleen and muscle and fish body. RNAlater was tested on fish ovary and muscle samples.
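The Fig. 1 legend gives the DMSO buffer as a mix of 20-25% dimethyl sulfoxide, 25% 0.5 M EDTA, and 50-55% H2O. As a hedged illustration of the arithmetic for preparing a given volume, the sketch below assumes these percentages are by volume (v/v), which the text does not state explicitly; the function name `dmso_edta_recipe` and its parameters are hypothetical.

```python
def dmso_edta_recipe(total_ml, dmso_fraction=0.20, edta_fraction=0.25,
                     edta_stock_molar=0.5):
    """Component volumes for the DMSO-EDTA buffer, assuming v/v percentages.

    dmso_fraction must lie in the 20-25% range given in the Fig. 1 legend;
    water makes up the remainder (50-55%).
    """
    if not 0.20 <= dmso_fraction <= 0.25:
        raise ValueError("DMSO fraction should be between 0.20 and 0.25")
    water_fraction = 1.0 - dmso_fraction - edta_fraction
    return {
        "dmso_ml": total_ml * dmso_fraction,
        "edta_stock_ml": total_ml * edta_fraction,
        "water_ml": total_ml * water_fraction,
        # Dilution of the 0.5 M EDTA stock to its final concentration.
        "final_edta_molar": edta_fraction * edta_stock_molar,
    }

recipe = dmso_edta_recipe(100.0, dmso_fraction=0.25)
for component, amount in recipe.items():
    print(f"{component}: {amount:g}")
```

Under this assumption, 100 mL of buffer at the 25% DMSO end of the range uses 25 mL DMSO, 25 mL of 0.5 M EDTA stock, and 50 mL water, giving a final EDTA concentration of 0.125 M.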
To gain insights into variation within these treatments, isolated RBCs samples were collected from three different sea turtle individuals and processed separately as biological and technical replicates. The third replicate had insufficient material to test all treatments.

DNA extraction. We extracted DNA from all tissue samples using the agarose plug protocol as below at VGP data production hubs at the Rockefeller University, Wellcome Sanger Institute, and MPGI Max Planck Institute Dresden (Table S1). This method was established, at the time of this experiment, as standard protocol for long-read sequencing in all VGP projects [4].

Hi-C library preparation and sequencing. Because Hi-C methods require intact cell nuclei, we tested a subset of bird samples from our preservation experiments directly using the Arima-HiC platform. We tested blood and muscle samples in three different treatments: without preservatives, in EtOH, and in DNAgard. Each preservation method was subjected to three temperature treatments: immediately flash-frozen, 6 hr at 4°C, and one week at room temperature (20-25°C). After temperature treatment, each sample was moved to -80°C. Blood with no preservative at room temperature for one week was excluded from this set. Two technical replicates of each sample were prepared and sequenced at Arima Genomics following their standard protocol. We measured the performance of Arima-HiC runs by mapping the sequence reads to the zebra finch reference genome (GCA_003957565.1) to determine the proximity of ligated sequence pairs. Assessments were made based on the ratios of cis (intra-chromosome) to trans (inter-chromosome) read pairs as well as the total percentage comprised of long-distance (> 15 kb) cis pairs.
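The cis/trans assessment described above can be sketched as a simple tally over mapped read pairs. This is a minimal illustration, not the Arima Genomics pipeline: the input format (chromosome and position for each mate), the toy pairs, and the function name `summarize_pairs` are assumptions made for the example. Only the 15 kb long-distance threshold comes from the text.

```python
LONG_RANGE_BP = 15_000  # long-distance cis threshold from the text (> 15 kb)

def summarize_pairs(pairs):
    """Tally mapped read pairs given as (chrom1, pos1, chrom2, pos2) tuples."""
    counts = {"trans": 0, "cis_short": 0, "cis_long": 0}
    for chrom1, pos1, chrom2, pos2 in pairs:
        if chrom1 != chrom2:
            counts["trans"] += 1          # mates on different chromosomes
        elif abs(pos2 - pos1) > LONG_RANGE_BP:
            counts["cis_long"] += 1       # same chromosome, > 15 kb apart
        else:
            counts["cis_short"] += 1      # same chromosome, <= 15 kb apart
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no read pairs supplied")
    cis = counts["cis_short"] + counts["cis_long"]
    return {
        "pct_cis": 100.0 * cis / total,
        "pct_long_cis_of_total": 100.0 * counts["cis_long"] / total,
        "cis_to_trans": cis / counts["trans"] if counts["trans"] else float("inf"),
    }

# Hypothetical read pairs, invented for illustration only.
toy_pairs = [
    ("chr1", 1_000, "chr1", 50_000),
    ("chr1", 1_000, "chr1", 5_000),
    ("chr1", 2_000, "chr2", 3_000),
    ("chr2", 10, "chr2", 100_000),
]
summary = summarize_pairs(toy_pairs)
print(summary)
```

A sample with largely random ligations would show a low `pct_cis` and a low `cis_to_trans` ratio, which is the pattern the Results report for DNAgard-preserved samples.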

Data availability
Sample information, PFGE measurements, FEMTO measurements, and DNA yield data can be found in the supplemental materials. Raw FEMTO outputs, PFGE gel images, and Hi-C readpairs are available on Dryad (doi:10.5061/dryad.000000041).
Graphical visualization of samples and treatments used in this study. Rows denote preservative treatments and columns temperature treatments. Colors indicate different types of tissue samples (see legend at top right). All samples were transferred to -80°C after the specified temperature treatment, e.g. '6 hr 4C' means stored at 4°C for 6 hours before transfer to -80°C. Abbreviations are as follows: RBCs, isolated red blood cells; EtOH, 95% ethanol; DMSO, a mix of 20-25% dimethyl sulfoxide, 25% 0.5 M EDTA, and 50-55% H2O; DNAgard, DNAgard tissue and cells cat. [...]

PFGE traces are visualized as overlapping ridgeline plots. Each ridgeline plot corresponds to a gel lane and a single DNA extract with brightness converted to a plot profile. The x-axis denotes molecule length scaled via piecewise linear scaling to match across gels of different lengths with a common size standard (Lambda PFG Ladder, New England Biolabs). The x-axis is the same in both columns. The y-axis of each plot is a proportional signal in that particular gel lane from just below the well to just beyond the 48.5 kb ladder peak such that the relatively intense brightness of the well itself is excluded. Colors represent different sample preservation methods, as indicated in the legend at bottom right. All samples were transferred to -80°C after the specified temperature treatment, e.g. '6hr 4C' means stored at 4°C for 6 hours before transfer to -80°C.

DNA yield per input mass was log-transformed and modeled with temperature (d), preservative (e), tissue type (f), and vertebrate group as predictors. Significant relationships from post-hoc comparisons are shown as connecting bars with significance levels: **** p < 0.0001, *** p < 0.001, ** p < 0.01, * p < 0.05. Sample sizes for each factor are given along the x-axis.

[...] Hi-C reads being long-range cis pairs, which reflects an efficient capture of long-range interactions needed for genome scaffolding and haplotype phasing. Hi-C data was generated by Arima Genomics following their standard protocol.

Compiled here are guidelines based on the best-performing protocols tested in this study and broadly in the Phase 1 VGP genomes.

Tissue selection
Tissues listed in decreasing preference. Multiple tissue types should be collected when possible.