Abstract

The tuatara (Sphenodon punctatus) is of “extraordinary biological interest” as the most distinctive surviving reptilian lineage (Rhynchocephalia) in the world. To provide a genomic resource for an understanding of genome evolution in reptiles, and as part of a larger project to produce genomic resources for various reptiles (http://evogen.jgi.doe.gov/second_levels/BACs/Our_libraries.html), a large-insert bacterial artificial chromosome (BAC) library from a male tuatara was constructed. The library consists of 215 424 individual clones whose average insert size was empirically determined to be 145 kb, yielding a genomic coverage of approximately 6.3×. A BAC-end sequencing analysis of 121 420 bp of sequence revealed a genomic GC content of 46.8%, among the highest observed thus far for vertebrates, and identified several short interspersed repetitive elements (mammalian interspersed repeat–type repeats) and long interspersed repetitive elements, including chicken repeat 1 (CR1) elements. Finally, as a quality control measure, the arrayed library was screened with probes corresponding to 2 conserved noncoding regions of the candidate sex-determining gene DMRT1 and the DM domain of the related DMRT2 gene. A contig spanning nearly 300 kb was generated, supporting the deep coverage and utility of the library for exploring tuatara genomics.

The tuataras of New Zealand are the last representatives of a reptilian lineage known as sphenodontids that was contemporaneous with early dinosaurs, around 220 million years ago, justifying their designation as “living fossils.” Two closely related tuatara species (Sphenodon punctatus and Sphenodon guntheri) are the only surviving members of the order Rhynchocephalia (Rest et al. 2003) and are usually placed phylogenetically as a sister taxon to lizards and snakes (Squamata). This placement is supported by a number of morphological traits, particularly skull osteology (Wu 1994). In contrast, molecular analyses vary as to the position of tuataras. Phylogenetic analysis of 6 nuclear protein–coding loci suggested clustering of tuataras with archosaurs (crocodilians and birds) or turtles, rather than with squamates (Hedges and Poling 1999), whereas analysis of small and large mitochondrial ribosomal genes united tuatara and a lizard representative (Zardoya and Meyer 1998). The validity of these molecular analyses, however, has been questioned (Lee 2001). A recent analysis of complete mitochondrial genomes placed the tuatara as sister group to squamate representatives, consistent with morphological data (Rest et al. 2003). Overall, sphenodontids are probably one of the most phylogenetically distinct vertebrate lineages (Daugherty et al. 1990).

Bacterial artificial chromosome (BAC)/P1 artificial chromosome libraries have been constructed for many taxa in the tree of life, such as human (Ioannou et al. 1994), mammals (Cai et al. 1995), birds (Zimmer and Gibbin 1997), fishes (Amemiya et al. 2001), insects (Hong et al. 2003), plants (Woo et al. 1994), and bacteria (Zhu et al. 1997). However, there are no published BAC libraries available from reptiles other than birds so far (although a library from a Western Fence Lizard, Uta stansburiana, is listed on the website of the National Human Genome Research Institute). We have generated a BAC library from a tuatara as part of a larger National Science Foundation project to construct BAC libraries from various key reptile lineages (Couzin 2002; Modi and Crews 2005). These libraries, which include a garter snake (Thamnophis sirtalis), painted turtle (Chrysemys picta), American alligator (Alligator mississippiensis), Gila monster (Heloderma suspectum), and emu (Dromaius novaehollandiae), are available for use by the scientific community via the Joint Genome Institute (http://evogen.jgi.doe.gov/second_levels/BACs/Our_libraries.html).

BAC libraries from the Reptilia will provide inroads to a large number of fundamental questions in phylogeny, genome evolution, and development. With the sequencing of the draft chicken genome (Hillier et al. 2004), reptiles, and increasingly birds, are models for the evolution of sex chromosomes, genetic and temperature-dependent sex determination mechanisms, and genomes as a whole (Axelsson et al. 2005). Reptiles exhibit a wide array of sex determination mechanisms and sex chromosome systems (Sarre et al. 2004). Tuataras, like many reptiles, exhibit temperature-dependent sex determination (TSD) (Cree, Thompson, and Daugherty 1995). Tuatara eggs develop into 100% females at 18 °C or below, and embryos tend to become males if they are exposed to warmer regimes during the temperature-sensitive period of gonadogenesis. The molecular mechanisms by which TSD is regulated in tuataras are unknown, although in crocodiles, which are also characterized by TSD, specific genes known to be involved in mammalian sex determination also appear to be involved (Western et al. 2000). In general, however, TSD is poorly characterized compared with the much better understood genetic mechanisms of sex determination of mammals and, increasingly, birds (Smith and Sinclair 2004). Thus, a tuatara BAC library will also facilitate the cloning of sex-determining and -differentiating genes and elucidation of their functions, homologies, and involvement in sex chromosome evolution.

DMRT1 (doublesex male abnormal-3–related transcription factor-1), a novel member of a family of genes possessing highly conserved sequences with an unusual zinc finger DNA binding (DM) domain, was found to exhibit sex-specific dimorphism in expression in amniote species (Smith et al. 1999) and to have a male sex-determining role in both vertebrates and invertebrates (Smith et al. 1999; Matsuda et al. 2002). In eutherian and marsupial mammals, a dominant Y-specific sex-determining gene called Sry has been identified in directing sexual differentiation toward the male pathway (Marshall Graves 2002). Before the discovery of the DMRT1 gene, no equivalent to Sry had been identified in nonmammalian vertebrates, including reptiles with TSD. Seven DM domain genes have been identified in the human and mouse genomes (Ottolenghi et al. 2002), between 6 and 8 in the platyfish genome (Veith et al. 2003), and at least 12 in Caenorhabditis elegans (Zarkower 2002), the individual functions of which are not clear. Because of its conservation throughout animals, DMRT1 provides both an appropriate probe for examining the quality of our large-insert library and an entry point into examining the basis of sex determination in the tuatara. For this reason, we have conducted a preliminary analysis of DMRT genes, the family to which DMRT1, a candidate sex-determining gene, belongs (Haag and Doty 2005). The construction of BAC libraries from Reptilia will fill a major gap in the resources available for understanding vertebrate genome evolution and will pave the way for transferring many of the large-scale approaches to biology and bioinformatics to these nonmodel species. In this paper, we describe the BAC library we have constructed from S. punctatus and the preliminary efforts to characterize its genome and gene content.

Materials and Methods

BAC Library Construction

A blood sample from a male tuatara (ID number 927477) was kindly provided by the Dallas Zoo. BAC library construction procedures were performed following methods previously described (Amemiya et al. 1996) with some modification. Briefly, nucleated erythrocytes were washed and collected by centrifugation. Based on cell numbers observed under low magnification and the tuatara genome size estimate from the literature (Olmo 1981), multiple Bio-Rad plug molds were embedded with cells containing an estimated 10 μg genomic DNA per 80 μl plug using 1% Incert agarose (FMC corporation). All processing of high molecular weight DNA was done in situ in the agarose plugs. Before starting partial digestion, a brief pulsed-field gel electrophoresis (PFGE) step was carried out using a CHEF-Mapper (Bio-Rad, Hercules, CA) so as to remove small, presumably sheared DNA molecules and impurities from the plugs. A series of EcoRI partial digestions were carried out on the embedded DNA using EcoRI methylase as competitor. Once the optimal conditions for restriction digestion were determined, scaled-up partial digests were set up in 500 μl volumes. Partially digested DNAs were electrophoresed using preparative PFGE in a CHEF-Mapper, and appropriate size fractions (100–350 kb) were taken and further analyzed by analytical PFGE in a CHEF-DRIII apparatus (Bio-Rad, Hercules, CA). DNAs were recovered from the gel slices into dialysis bags via electroelution (Strong et al. 1997). Quantified eluted DNAs were used immediately for subsequent BAC ligation procedures.

Linearized, dephosphorylated, and highly purified CopyControl pCC1BAC™ cloning-ready vector (EcoRI) (Epicentre Inc., Madison, WI) was used for ligations. Pilot-scale tests for ligation, drop-dialysis, and electrotransformation of Escherichia coli cells were performed based on previously published protocols (Amemiya et al. 1996; Osoegawa et al. 1998). Prior to transformation, the ligation volumes were reduced to about 20 μl by a drop-dialysis approach. Once the optimal combination of transformation efficiency and insert clone size was empirically determined, large-scale ligations and transformations were performed following the same conditions. Picking of colonies from agar plates to the 384-well microtiter plates was carried out by a robotics colony picker (CP 1000, Norgren Systems, Fairlea, WV). BAC library replication and high-density colony filter spotting were performed with a TAS-BioGrid replicator/spotter (BioRobotics).

Characterization of the Library

To analyze the sizes of cloned insert DNA fragments, 189 randomly selected BAC clone DNAs were isolated by a manual miniprep method, digested with the rare cleaving endonuclease, NotI, and analyzed by PFGE. The insert sizes were estimated by comparison with low-range pulsed-field gel markers (New England Biolabs, Ipswich, MA) run in parallel on the same gels. We calculated the probability (P) of finding any sequence in the library by the formula N = ln(1 − P)/ln(1 − I/GS) (Clarke and Carbon 1976), where N is the number of clones, I is the average insert size, and GS is the haploid genome size in base pairs. The level of genome coverage was estimated by the formula W = NI/GS (Paterson 1996).
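
The two formulas above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' own calculation: it plugs in the clone count (215 424), average insert size (145 kb), and 1C genome size (5.0 pg) reported in this paper, and it assumes the common conversion of roughly 978 Mb per picogram of DNA, which the paper does not state. The function names are illustrative only.

```python
import math

# Assumption (not stated in the paper): 1 pg of DNA is roughly 0.978e9 bp.
BP_PER_PG = 0.978e9


def genome_coverage(n_clones, insert_bp, genome_bp):
    """Coverage W = N * I / GS, in genome equivalents."""
    return n_clones * insert_bp / genome_bp


def probability_of_recovery(n_clones, insert_bp, genome_bp):
    """Clarke-Carbon probability P = 1 - (1 - I/GS)^N of finding any given sequence."""
    return 1.0 - (1.0 - insert_bp / genome_bp) ** n_clones


def clones_needed(p, insert_bp, genome_bp):
    """Clarke-Carbon clone number N = ln(1 - P) / ln(1 - I/GS), rounded up."""
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - insert_bp / genome_bp))


# Values reported in this paper.
n_clones = 215_424
insert_bp = 145_000
genome_bp = 5.0 * BP_PER_PG

coverage = genome_coverage(n_clones, insert_bp, genome_bp)
p_found = probability_of_recovery(n_clones, insert_bp, genome_bp)

print(f"coverage: {coverage:.2f} genome equivalents")
print(f"probability of recovering a given sequence: {p_found:.4f}")
print(f"clones needed for P = 0.998: {clones_needed(0.998, insert_bp, genome_bp)}")
```

Under these assumptions the unadjusted coverage comes out slightly above 6.3; discounting the small fraction of insert-free clones, or using a slightly different pg-to-bp conversion, brings it closer to the reported figure.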

Sequence Survey

As a check for library quality, 96 clones were chosen at random and subjected to fluorescent dideoxy sequencing of both insert ends (Zhao et al. 2001). Raw sequence reads were edited to remove cloning vector sequence and to retain only high-quality bases as judged by analysis using Phred/Phrap/Consed, TraceTuner from Paracel, and locally written Perl scripts as described in Zhao et al. (2001). The sequences were analyzed using RepeatMasker version 3.0.16 (Smit et al. 2004).

Analysis of DMRT Genes

The conserved DM domains of DMRT genes were amplified from tuatara genomic DNAs by polymerase chain reaction (PCR) using the following degenerate oligonucleotide primers, employing previously published PCR conditions (Raymond et al. 1999; Huang et al. 2002). CR65 F: TGC GC(ACG) (CA)G(AG) TGC (CA)G(AG) AAC CAC GG; CR67 F: TGC GC(ACG) (CA)G(AG) TGC (CA)G(AG) AAT CAC GG; CR69 R: C(GT)(CG) AG(CG) GC(CG) ACC TG(CG) GCA GCC AT; CR70 R: C(GT)(CG) AG(CG) GC(CG) ACC TG(CG) GCC GCC AT; CR71 R: C(GT)(CG) AG(CG) GC(CG) ACC TG(CG) GCG GCC AT; CR72 R: C(GT)(CG) AG(CG) GC(CG) ACC TG(CG) GCT GCC AT. PCR products were purified with the QIAquick Gel Extraction Kit (Qiagen, CA) and cloned into a TA vector using a TOPO cloning kit (Invitrogen, Carlsbad, CA). Their sequences were determined with the Big Dye Terminator Cycle Sequencing Ready Reaction Kit (Perkin Elmer, Foster City, CA) on an ABI 377 automated sequencer. After database searching against GenBank using BLAST (Altschul et al. 1997), 3 clones were identified as DMRT2 DM domains, which were used as probes to screen tuatara BAC filters. The highly conserved DMRT1 noncoding regions B and C were also amplified from tuatara genomic DNAs as described using primers B and C (Brunner et al. 2001) and characterized as above to prepare probes. The DNA sequences have been deposited in GenBank (DQ984183–DQ984186).
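
The parenthesized positions in the primers above each stand for a set of alternative bases, so every primer is really a pool of related oligonucleotides. As an illustration only (this helper is hypothetical and not part of the published protocol), the sketch below parses the notation used in this section and enumerates the pool for the forward primer CR65.

```python
import re
from itertools import product


def parse_degenerate(primer):
    """Turn 'TGC GC(ACG) ...' into a list of per-position base choices."""
    compact = primer.replace(" ", "")
    # Each token is either a parenthesized group of bases or a single base.
    tokens = re.findall(r"\(([ACGT]+)\)|([ACGT])", compact)
    return [group or single for group, single in tokens]


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for choices in parse_degenerate(primer):
        count *= len(choices)
    return count


def expand(primer):
    """Enumerate every concrete sequence in the degenerate pool."""
    return ["".join(combo) for combo in product(*parse_degenerate(primer))]


# Forward primer CR65 exactly as written in the text.
CR65 = "TGC GC(ACG) (CA)G(AG) TGC (CA)G(AG) AAC CAC GG"

pool = expand(CR65)
print(f"CR65 length: {len(parse_degenerate(CR65))} nt")
print(f"CR65 degeneracy: {degeneracy(CR65)} sequences")
print("first sequence in pool:", pool[0])
```

The same functions apply unchanged to the reverse primers CR69 to CR72, which differ from one another at a single fixed position.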

Probes were labeled with 32P by using a Redi-Prime kit (Amersham Life Science, Piscataway, NJ). Hybridization and washing were performed at 65 °C as described in Woo et al. (1994). Positive clones identified by colony hybridization were digested with EcoRI and analyzed on 1% agarose gels prior to blotting on Hybond N+ membrane. DMRT domain–containing clones were confirmed by hybridization with the same DMRT probes used in library screening and by PCR/sequence analysis using the primers above. A modification of an agarose gel–based fingerprinting methodology was employed to construct BAC clone contigs using HindIII and BamHI (Marra et al. 1997). Contigs were assembled using the FingerPrinted Contigs (FPC) program (Marra et al. 1997) with a tolerance of 3 and a cutoff of e−22. The contig was visualized with Internet Contig Explorer software (Fjell et al. 2003).

DNA sequences of the noncoding B and C regions of DMRT1 and the DM domain of DMRT2 amplified from tuatara genomic DNA and isolated BAC clones were aligned with those from additional species (Volff et al. 2003) using ClustalW (Pillai et al. 2005). To determine the phylogenetic relationships of these sequences, a Bayesian analysis was performed using MrBayes version 3.0b4 (Huelsenbeck and Ronquist 2001). For each of the 3 regions, a 2-parameter model of nucleotide substitution, in which transitions and transversions occur at different rates (Kimura 1980), was used in combination with a gamma distribution of rates among sites. Two million cycles were conducted with a burn-in of 100 000 cycles.
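
To make the 2-parameter model concrete, the sketch below computes the classical Kimura (1980) pairwise distance, which treats transitions (purine to purine or pyrimidine to pyrimidine) and transversions separately. This is an illustration of the substitution model only: it is not the Bayesian analysis run in MrBayes, it ignores the gamma rate variation used there, and the toy sequences are invented for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Uses d = -0.5 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the
    proportions of sites differing by a transition and a transversion.
    Sites with gaps or ambiguity codes are skipped. Very divergent pairs can
    make the log argument non-positive, in which case math.log raises ValueError.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy alignment (invented): one transition (A<->G) in ten sites.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
print(f"K2P distance: {d:.4f}")
```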

Results

BAC Library Construction and Characterization

The Sphenodon BAC library consists of 215 424 clones, which have been arrayed into 561 microtiter plates of 384 wells each. Replicas of the library were subsequently made from the original copy, and the entire library was gridded onto 12 high-density nylon filters (22 × 22 cm, 48 plates per filter) for hybridization screening. The average insert size determined by PFGE analysis of 189 randomly selected clones digested with the 8-bp recognition endonuclease NotI was estimated to be 145 kb, with 88% of inserts larger than 100 kb (Figure 1a). A subset of 24 digested clones is presented in Figure 1b. The percentage of clones without inserts was empirically determined to be 1%. Based on the tuatara's 1C genome size of 5.0 pg (Olmo 1981), we estimated the coverage of the library to be around 6.3 genome equivalents. This coverage theoretically provides a 99.8% probability of obtaining any unique sequence in this library, assuming random cloning (Clarke and Carbon 1976). Interestingly, internal NotI sites were observed in 90.4% of clones, and on average each insert had ∼2.8 NotI sites, or about 1 NotI site for every 52 kb in the tuatara genome. The frequency of NotI sites (GC^GGCCGC) observed in this library is higher than would occur by chance for an 8-bp recognition enzyme (once every 64 kb) and may be a consequence of the high G + C content observed in this species and/or a high density of CpG islands (see below) (Amemiya et al. 2001).
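
The NotI comparison in this paragraph rests on two small calculations: the observed spacing (average insert size divided by sites per insert) and the chance expectation for an 8-bp site. A minimal sketch follows, assuming independent positions and equal base frequencies for the chance expectation; the function name and the `base_freqs` parameter are illustrative. Note that 4^8 is 65 536 bp, which the text expresses as 64 kb.

```python
def expected_site_spacing_bp(site, base_freqs):
    """Expected distance between occurrences of a site under an
    independent-base model: 1 / product of per-base frequencies."""
    probability = 1.0
    for base in site:
        probability *= base_freqs[base]
    return 1.0 / probability


# Assumption for the chance expectation: all four bases equally frequent.
UNIFORM = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

NOTI_SITE = "GCGGCCGC"
expected_bp = expected_site_spacing_bp(NOTI_SITE, UNIFORM)

# Values reported in the text: 145 kb average insert, ~2.8 NotI sites per insert.
observed_kb = 145 / 2.8

print(f"expected spacing by chance: {expected_bp:.0f} bp")
print(f"observed spacing: {observed_kb:.1f} kb")
```

Other base compositions can be passed as `base_freqs` to see how the chance expectation shifts with G + C content.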

Figure 1

Characterization of the tuatara BAC library. (a) Insert size distribution of clones from the tuatara BAC library. (b) PFGE analysis of 24 randomly selected tuatara BAC clones. The DNA inserts were released by digesting with NotI enzyme, which cleaves just outside of the EcoRI cloning site. Flanking lanes are pulsed-field gel low-range molecular weight markers (New England Biolabs; sizes are given in kilobase pairs). The BAC vector size is 8.7 kb.

Sequence Survey

Our single-pass BAC-end sequencing survey of 96 clones yielded a total of 169 sequence reads and 121 420 bp of edited, high-quality sequence. This survey corroborated the results of the NotI analysis by uncovering an overall GC content of 46.8%, practically the highest reported for any vertebrate to date (Belle et al. 2004). Analysis of the collected sequences by RepeatMasker (Table 1) revealed several retroelements, including 5 chicken repeat 1 (CR1)-like long interspersed repetitive elements (LINEs) and 1 mammalian interspersed repeat. Novel short interspersed repetitive elements (SINEs) or LINEs undetected by RepeatMasker may also exist. Altogether, the retroelements detected comprised 3283 bp or 2.7% of the sequence. A total of 714 bp (∼0.6%) was comprised of simple sequence repeats (microsatellites). No leucine-rich repeat elements, DNA elements, unclassified, small RNA, or satellites were detected in our sequence survey. Using the protein similarity module in RepeatMasker, we discovered a total of 48 LINE L1, L2, or RTE elements (e−10) comprising 6340 bp (5.2%) of the sequence, indicating somewhat higher sensitivity in retroelement detection using amino acids over nucleotides.
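
The percentages quoted in this paragraph follow directly from the base-pair totals. As a quick consistency check (a sketch using only numbers stated in the text; the dictionary labels are shorthand introduced here), each class can be expressed as a share of the 121 420 bp surveyed:

```python
TOTAL_SURVEYED_BP = 121_420  # edited, high-quality BAC-end sequence

# Base pairs occupied by each class, as reported in the text.
occupied_bp = {
    "interspersed repeats (nucleotide search)": 3283,
    "simple sequence repeats": 714,
    "LINEs (protein similarity search)": 6340,
}


def percent_of_survey(bp, total_bp=TOTAL_SURVEYED_BP):
    """Share of the surveyed sequence occupied by a repeat class, in percent."""
    return 100.0 * bp / total_bp


percentages = {name: percent_of_survey(bp) for name, bp in occupied_bp.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}%")
```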

Table 1

Summary of repeated DNA elements found in ∼121 kb of tuatara BAC-end sequences

Repeat class                          Number of elements (a)   Total length occupied (bp)   Percentage of sequence
SINE                                  1                        135                          0.11
    Mammalian interspersed repeats    1                        135                          0.11
LINEs                                 8                        3148                         2.59
    LINE1                             1                        817                          0.67
    LINE2                             2                        246                          0.20
    L3/CR1                            5                        2085                         1.72
Total interspersed repeats                                     3283                         2.7
Simple repeats                        16                       714                          0.59
Low complexity                        19                       984                          0.81

(a) Most repeats fragmented by insertions or deletions have been counted as one element.

BAC-Based Contig of DM-Containing Region

Several lines of evidence suggest that DMRT genes are candidate sex determination genes in vertebrates (Volff et al. 2003). Because of its conservation throughout animals, DMRT provides both an appropriate probe for examining the quality of our large-insert library and an entry point into examining the basis of sex determination in the tuatara. The DM domain of the DMRT2 gene, as well as the noncoding regions B and C of DMRT1, were amplified with tuatara genomic DNA as template, subcloned, and verified by sequencing. The BAC library was screened by a probe mix of noncoding B and C regions of DMRT1 and the DM domain of DMRT2. Potential positive clones were identified. DNAs from these clones were prepared and digested to completion with EcoRI. Three identical filters of electrophoresed clones were blotted and hybridized with the 3 respective probes (Figure 2). Twenty-one clones were positive for the B probe, 2 clones were positive for the C probe, and 11 clones were positive for the DM probe. Two clones (150B6 and 497H14) were positive for all 3 probes. To further validate the identity of these clones, PCR was carried out for these regions using the recombinant BAC clones as templates, and products were subsequently sequenced. All the DMRT1 noncoding B and C positive clones were confirmed in this way. However, for reasons that are unclear, the PCR amplifications from DM domain–positive clones were problematic, and we were unable to verify the identity via DNA sequencing. Nonetheless, restriction fingerprinting (Figure 3) showed that 10 of these BAC clones encompassed the same genomic region, suggesting that these DM-positive clones are linked to those bearing B and C regions. Because DMRT genes are linked as DMRT1-DMRT3-DMRT2 in all vertebrates examined thus far, it is likely that all 3 DMRT genes are present on this contig, which spans about 300 kb. Alternatively, we may have identified a duplicated cassette of B and C regions in the tuatara genome. 
Phylogenetic analysis of the B and C regions and DM region amplified from genomic DNA (Figure 4) confirms the homology of the probe and genomic sequences to the DM family, and, in the case of the DM region, to DMRT2. The tuatara DMRT1 DM domain does not appear in the DM tree because it was not successfully amplified from tuatara. A similar picture was obtained via analysis of the DM region using amino acid sequences, the substitution model of Jones et al. (1992), and a gamma distribution of rates among sites (not shown).

Figure 2

An example of Southern blot hybridization analysis of positive BAC clones with the noncoding B region of DMRT1 as a probe. The BAC DNAs were digested to completion with EcoRI. The arrow indicates the BAC vector. M is a 1-kb DNA marker. Asterisks indicate clones that were also positive for the probes for the C region of DMRT1 and the DM domain of DMRT2.

Figure 3

A BAC contig encompassing the noncoding B region of DMRT1 gene based on fingerprint results by Internet Contig Explorer software. Ten BAC clones (515D6, 289C19, 269F20, 11L23, 58J18, 81B4, 82P24, 58N23, 557I22, and 162F5) were assembled in this contig with noncoding B region of DMRT1 as probe.

Figure 4

Top, Bayesian phylogenetic tree of B region sequences (DMRT1) from tuatara and other vertebrates. The tree was rooted with the fish orthologs serving as out-group. The numbers are posterior probability values. The sequence length for B region of Sphenodon is 212 bp. The tuatara DMRT1 DM domain does not appear in the DM tree because it was not successfully amplified from tuatara. Bottom, phylogenetic trees of DM domains from tuatara and other vertebrates. Numbers indicate posterior probability values higher than 70%. The DNA sequence length is 170 bp for Sphenodon TM8 and 141 bp for Sphenodon TM38.

Discussion

We have described the construction and characterization of a BAC library from a tuatara, a species of “extraordinary biological interest” as the most distinctive surviving reptilian genus in the world (Groombridge 1982). Although the tuatara's genome size is more than 40% larger than that of the human (Olmo 1981), the quality of this library appears comparable to that of BAC libraries made from many other vertebrates. Our sequence survey and DMRT hybridization/contig analysis are the first of their kind for a nonavian reptile. Draft genome sequences are now available for several vertebrates: human, chimp, monkey, mouse, rat, cow, dog, zebrafish, 2 puffer fishes (Fugu rubripes and Tetraodon nigroviridis), Xenopus, and chicken (see http://www.ensembl.org/). The availability of this and other reptile BAC libraries will provide a foundation for future comparative genomics of nonavian reptiles and amniotes generally.

The physiology and natural history of tuataras suggest that their genome may possess a number of derived features. Their basal metabolic rates are extraordinarily low for a reptile (Gans 1983), and this may underlie the large genome size of this species because metabolic rate is known to be inversely correlated with genome size in diverse vertebrates (Waltari and Edwards 2002). Our sequence survey of ∼121 kb suggests that the level of detection of SINEs at the nucleotide level is almost certainly a gross underestimate, whereas our encounter rate (5.2%) of LINEs at the protein level at high stringency is very similar to that of a 1.5 Mb survey of tuatara sequence (5.8%) performed by Shedlock (2006). We are surprised that the encounter rate for retroelements is not greater, given that roughly 5–15% of the chicken genome is comprised of interspersed repeats (Consortium ICG 2004). Our finding of a CR1-like LINE element represents a phylogenetic extension of this repeat class from what is already known in birds, turtles, lizards, and snakes (Vandergon and Reitman 1994; Kajikawa et al. 1997; Wicker et al. 2005). We predict that more in-depth searches for repetitive elements and comparisons with a phylogenetically wider sample of retroelements will uncover considerably more diversity.

The high GC content that was revealed by our sequence survey and by the NotI analysis of individual clones may also represent a derived condition. Hughes et al. (1999) conducted a comprehensive survey of GC contents in reptiles using the technique of CsCl buoyant density gradients and found that crocodilians, turtles, and some snakes have relatively high GC contents (41–43%), whereas many more snakes have lower GC contents (38–41%). Birds also seem to have relatively low GC contents, except in the most GC-rich isochores, which are more extreme than in many ectotherms (Kadi et al. 1993). Furthermore, the GC content of DNA sequence is chromosomal size dependent, a phenomenon unique to sauropsids, whose karyotypes consist of macrochromosomes and microchromosomes. Turtle cDNA mapping revealed that microchromosomes tend to contain more GC-rich genes than GC-poor genes (Kuraku et al. 2006). Like genome size, GC content in amniotes seems to have undergone multiple convergent trends, both increases and decreases.

We have constructed a BAC contig in tuatara that encompasses the DMRT1 and DMRT2 region. DMRT genes are linked in the order, DMRT1-DMRT3-DMRT2, in all vertebrates so far examined, with the possible exception of the chicken where DMRT2 has been difficult to identify. Our preliminary analysis indicates that all 3 DMRT genes are localized to the region encompassed by the BAC contig. Reptiles, including the tuatara, are ideal GSD/TSD models, and their genomic analyses will shed light on sex determination mechanisms and their evolution in all vertebrates (Sarre et al. 2004).

In addition to their applications in genome sequencing, physical mapping, and positional cloning, BAC libraries may also have important roles in conservation genetics. This has special importance for endangered species such as the tuatara, which is in danger of becoming extirpated in its native New Zealand (Cree, Daugherty, and Hay 1995). Despite absolute protection of the species and its island habitats, 25% of known populations have become extinct in the past century (Daugherty et al. 1992). Although the total number of surviving tuataras is estimated at around 55 000 (Gaze 2001), many populations are still considered highly susceptible to disturbance and environmental change. It will be a sorry day when we must resort to a BAC library to save species such as tuatara; nonetheless, such genomic resources may well play a role in maintaining species numbers and developing molecular markers for species management.

We thank Dr Ruston Hartegen of the Dallas Zoo for help in procuring the blood sample from which the tuatara BAC library was made. Members of Chris Amemiya's lab assisted in a variety of ways during the research. Hilary Miller provided helpful discussion and background about tuatara biology and genetics. Nisrine El-Mogharbel provided helpful comments on this manuscript. BAC-end sequences were generated by Shaying Zhao of The Institute for Genomic Research. Andrew Shedlock provided useful advice and advance access to LINE analyses in tuatara. This work was supported by the National Science Foundation grant IBN-0431717 to S.V.E., C.T.A., and J. Robert Macey. C.T.A. and T.M. were supported, in part, by a grant from the National Institutes of Health (UO1 HG02526).

References

Altschul
S
Madden
T
Schaffer
A
Zhang
J
Zhang
Z
Miller
W
Lipman
D
Gapped BLAST and PSI-BLAST: a new generation of protein database search program
Nucleic Acids Res
1997
, vol. 
25
 (pg. 
3389
-
3402
)
Amemiya
CT
Amores
A
Ota
T
Mueller
GD
Postlethwait
JH
Litman
GW
Generation of a P1 artificial chromosome library of the Southern pufferfish
Gene
2001
, vol. 
272
 (pg. 
283
-
289
)
Amemiya
CT
Ota
T
Litman
GW
Lai
E
Birren
B
Construction of P1 artificial chromosome (PAC) libraries from lower vertebrates
Analysis of nonmammalian genomes
1996
San Diego (CA)
Academic Press
(pg. 
223
-
256
)
Axelsson
E
Webster
MT
Smith
NG
Burt
DW
Ellegren
H
Comparison of the chicken and turkey genomes reveals a higher rate of nucleotide divergence on microchromosomes than macrochromosomes
Genome Res
2005
, vol. 
15
 (pg. 
120
-
125
)
Belle
EM
Duret
L
Galtier
N
Eyre-Walker
A
The decline of isochores in mammals: an assessment of the GC content variation along the mammalian phylogeny
J Mol Evol
2004
, vol. 
58
 (pg. 
653
-
660
)
Brunner
B
Hornung
U
Shan
Z
Nanda
I
Kondo
M
Zend-Ajusch
E
Haaf
T
Roper
HH
Shima
A
Schmid
M
Kalscheuer
VM
Schartl
M
Genomic organization and expression of the double-sex-related gene cluster in vertebrates and detection of putative regulatory regions for DMRT1
Genomics
2001
, vol. 
77
 (pg. 
8
-
17
)
Cai
L
Taylor
JF
Wing
RA
Gallagher
DS
Woo
SS
Davis
SK
Construction and characterization of a bovine bacterial artificial chromosome library
Genomics
1995
, vol. 
29
 (pg. 
413
-
425
)
Clarke
L
Carbon
J
A colony bank containing synthetic ColE1 hybrid plasmids representative of the entire E. coli genome
Cell
1976
, vol. 
9
 (pg. 
91
-
99
)
Consortium
ICG
Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution
Nature
2004
, vol. 
432
 (pg. 
695
-
716
)
Couzin
J
NSF's ark draws alligators, algae, and wasps
Science
2002
, vol. 
297
 (pg. 
1638
-
1639
)
Cree
A
Daugherty
CH
Hay
JM
Reproduction of a rare New Zealand reptile, the tuatara Sphenodon punctatus, on rat-free and rat-inhabited islands
Conserv Biol
1995
, vol. 
9
 (pg. 
373
-
383
)
Cree
A
Thompson
MB
Daugherty
CH
Tuatara sex determination
Nature
1995
, vol. 
375
 pg. 
543
 
Daugherty
CH
Cree
A
Hay
JM
Thompson
MB
Neglected taxonomy and continuing extinctions of tuatara (Sphenodon)
Nature
1990
, vol. 
347
 (pg. 
177
-
179
)
Daugherty
CH
Towns
DR
Cree
A
Hay
JM
The roles of legal protection versus intervention in conserving the New Zealand tuatara, Sphenodon
Dev Landsc Manage Urban Plann
1992
, vol. 
7
 (pg. 
247
-
259
)
Fjell
CD
Bosdet
I
Schein
JE
Jones
SJ
Marra
MA
Internet Contig Explorer (iCE)—a tool for visualizing clone fingerprint maps
Genome Res
2003
, vol. 
13
 (pg. 
1244
-
1249
)
Gans
C
Rhodin
AGJ
Miyata
K
Is Sphenodon punctatus a maladapted relict?
Advances in herpetology and evolutionary biology
1983
Cambridge (MA)
Museum of Comparative Zoology, Harvard University
(pg. 
613
-
620
)
Gaze
P
Tuatara recovery plan 2001–2011
2001
Wellington (New Zealand)
New Zealand Department of Conservation
 
(Threatened species recovery plan series no. 47)
Groombridge
B
The IUCN amphibia-reptilia red data book. Part 1
Testudines Crocodylia Rhynchocephalia
1982
Gland (Switzerland)
World Conservation Union
Haag
E
Doty
A
Sex determination across evolution: connecting the dots
PLoS Biol
2005
, vol. 
3
 pg. 
e21
 
Hedges
SB
Poling
LL
A molecular phylogeny of reptiles
Science
1999
, vol. 
283
 (pg. 
998
-
1001
)
Hillier
LW
Miller
W
Birney
E
Warren
W
Hardison
RC
Ponting
CP
, et al. 
Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution
Nature
2004
, vol. 
432
 (pg. 
695
-
716
)
Hong
YS
Hogan
JR
Wang
X
Sarkar
A
Sim
C
Loftus
BJ
Ren
C
Huff
ER
Carlile
JL
Black
K
Zhang
HB
Gardner
MJ
Collins
FH
Construction of a BAC library and generation of BAC end sequence-tagged connectors for genome sequencing of the African malaria mosquito Anopheles gambiae
Mol Genet Genomics
2003
, vol. 
268
 (pg. 
720
-
728
)
Huang X, Cheng H, Guo Y, Liu L, Gui J, Zhou R. A conserved family of doublesex-related genes from fishes. J Exp Zool. 2002, vol. 294 (pg. 63-67).
Huelsenbeck JP, Ronquist F. MrBayes: Bayesian inference of phylogeny. Bioinformatics. 2001, vol. 17 (pg. 754-755).
Hughes S, Zelus D, Mouchiroud D. Warm-blooded isochore structure in Nile crocodile and turtle. Mol Biol Evol. 1999, vol. 16 (pg. 1521-1527).
Ioannou PA, Amemiya CT, Garnes J, Kroisel PM, Shizuya H, Chen C, Batzer MA, de Jong PJ. A new bacteriophage P1-derived vector for the propagation of large human DNA fragments. Nat Genet. 1994, vol. 6 (pg. 84-89).
Jones DT, Taylor WR, Thornton JR. The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci. 1992, vol. 8 (pg. 275-282).
Kadi F, Mouchiroud D, Sabeur G, Bernardi G. The compositional patterns of the avian genomes and their evolutionary implications. J Mol Evol. 1993, vol. 37 (pg. 544-551).
Kajikawa M, Ohshima K, Okada N. Determination of the entire sequence of turtle CR1: the first open reading frame of the turtle CR1 element encodes a protein with a novel zinc finger motif. Mol Biol Evol. 1997, vol. 14 (pg. 1206-1217).
Kimura M. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol. 1980, vol. 16 (pg. 111-120).
Kuraku S, Ishijima J, Nishida-Umehara C, Agata K, Kuratani S, Matsuda Y. cDNA-based gene mapping and GC3 profiling in the soft-shelled turtle suggest a chromosomal size-dependent GC bias shared by sauropsids. Chromosome Res. 2006, vol. 14 (pg. 187-202).
Lee MSY. Molecules, morphology and the monophyly of diapsid reptiles. Contrib Zool. 2001, vol. 70 (pg. 1-22).
Marra MA, Kucaba TA, Dietrich NL, Green ED, Brownstein B, Wilson RK, McDonald KM, Hillier LW, McPherson JD, Waterston RH. High-throughput fingerprint analysis of large-insert clones. Genome Res. 1997, vol. 7 (pg. 1072-1084).
Marshall Graves JA. The rise and fall of SRY. Trends Genet. 2002, vol. 18 (pg. 259-264).
Matsuda M, Nagahama Y, Shinomiya A, Sato T, Matsuda C, Kobayashi T, Morrey CE, Shibata N, Asakawa S, Shimizu N, Hori H, Hamaguchi S, Sakaizumi M. DMY is a Y-specific DM-domain gene required for male development in the medaka fish. Nature. 2002, vol. 417 (pg. 559-563).
Modi WS, Crews D. Sex chromosome and sex determination in reptiles. Curr Opin Genet Dev. 2005, vol. 15 (pg. 660-665).
Olmo E. Evolution of genome size and DNA base composition in reptiles. Genetica. 1981, vol. 57 (pg. 39-50).
Osoegawa K, Woon PY, Zhao B, Frengen E, Tateno M, Catanese JJ, de Jong PJ. An improved approach for construction of bacterial artificial chromosome libraries. Genomics. 1998, vol. 52 (pg. 1-8).
Ottolenghi C, Fellous M, Barbieri M, McElreavey K. Novel paralogy relations among human chromosomes support a link between the phylogeny of doublesex-related genes and evolution of sex determination. Genomics. 2002, vol. 79 (pg. 333-343).
Paterson AH. The DNA revolution. 1996. San Diego (CA): Academic Press.
Pillai S, Silventoinen V, Kallio K, Senger M, Sobhang S, Tate J, Velankar S, Golovin A, Henrick K, Rice P, Stoehr P, Lopez R. SOAP-based services provided by the European Bioinformatics Institute. Nucleic Acids Res. 2005, vol. 33 (pg. W25-W28).
Raymond C, Kettlewell J, Hirsch B, Bardwell V, Zarkower D. Expression of Dmrt1 in the genital ridge of mouse and chicken embryos suggests a role in vertebrate sexual development. Dev Biol. 1999, vol. 215 (pg. 208-220).
Rest JS, Ast JC, Austin CC, Waddell PJ, Tibbetts EA, Hay JM, Mindell DP. Molecular systematics of primary reptilian lineages and the tuatara mitochondrial genome. Mol Phylogenet Evol. 2003, vol. 29 (pg. 289-297).
Sarre S, Georges A, Quinn A. The ends of a continuum: genetic and temperature-dependent sex determination in reptiles. Bioessays. 2004, vol. 26 (pg. 639-645).
Shedlock AM. Phylogenomic diversity of CR1 LINE elements in reptiles. Syst Biol. 2006. Forthcoming.
Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0.5 [Internet]. 2004. Available from: www.repeatmasker.org.
Smith CA, McClive P, Western P, Reed K, Sinclair AH. Conservation of a sex-determining gene. Nature. 1999, vol. 402 (pg. 601-602).
Smith CA, Sinclair AH. Sex determination: insights from the chicken. Bioessays. 2004, vol. 26 (pg. 120-132).
Strong SJ, Ohta Y, Litman GW, Amemiya CT. Marked improvement of PAC and BAC cloning is achieved using electroelution of pulsed-field gel-separated partial digests of genomic DNA. Nucleic Acids Res. 1997, vol. 25 (pg. 3959-3961).
Vandergon TL, Reitman M. Evolution of chicken repeat 1 (CR1) elements: evidence for ancient subfamilies and multiple progenitors. Mol Biol Evol. 1994, vol. 11 (pg. 886-898).
Veith AM, Froschauer A, Korting C, Nanda I, Hanel R, Schmid M, Schartl M, Volff JN. Cloning of the dmrt1 gene of Xiphophorus maculatus: dmY/dmrt1Y is not the master sex-determining gene in the platyfish. Gene. 2003, vol. 317 (pg. 59-66).
Volff JN, Zarkower D, Bardwell VJ, Schartl M. Evolutionary dynamics of the DM domain gene family in metazoans. J Mol Evol. 2003, vol. 57 (pg. S241-S249).
Waltari E, Edwards SV. The evolutionary dynamics of intron size, genome size, and physiological correlates in archosaurs. Am Nat. 2002, vol. 160 (pg. 539-552).
Western P, Hary JL, Marshall Graves JA, Sinclair AH. Temperature-dependent sex determination in the American alligator: expression of SF1, WT1 and DAX1 during gonadogenesis. Gene. 2000, vol. 241 (pg. 223-232).
Wicker T, Robertson JS, Schulze SR, Feltus FA, Magrini V, Morrison JA, Mardis ER, Wilson RK, Peterson DG, Paterson AH, Ivarie R. The repetitive landscape of the chicken genome. Genome Res. 2005, vol. 15 (pg. 126-136).
Woo SS, Jiang J, Gill BS, Paterson AH, Wing RA. Construction and characterization of a bacterial artificial chromosome library of Sorghum bicolor. Nucleic Acids Res. 1994, vol. 22 (pg. 4922-4931).
Wu XC. Late Triassic-Early Jurassic sphenodontians from China and the phylogeny of the Sphenodontia. 1994. Cambridge (UK): Cambridge University Press.
Zardoya R, Meyer A. Complete mitochondrial genome suggests diapsid affinities of turtles. Proc Natl Acad Sci USA. 1998, vol. 95 (pg. 14226-14231).
Zarkower D. Invertebrates may not be so different after all. Novartis Found Symp. 2002, vol. 244 (pg. 115-126).
Zhao S, Shatsman S, Ayodeji B, Geer K, Tsegaye G, Krol M, Gebregeorgis E, Shvartsbeyn A, Russell D, Overton L, Jiang L, Dimitrov G, Tran K, Shetty J, Malek JA, Feldblyum T, Nierman WC, Fraser CM. Mouse BAC ends quality assessment and sequence analyses. Genome Res. 2001, vol. 11 (pg. 1736-1745).
Zhu H, Choi S, Johnston AK, Wing RA, Dean RA. A large-insert (130 kbp) bacterial artificial chromosome library of the rice blast fungus Magnaporthe grisea: genome analysis, contig assembly, and gene cloning. Fungal Genet Biol. 1997, vol. 21 (pg. 337-347).
Zimmer R, Gibbin AMV. Construction and characterization of a large-fragment chicken bacterial artificial chromosome library. Genomics. 1997, vol. 42 (pg. 217-226).

Author notes

Corresponding Editor: William Modi