Abstract

Mitochondrial DNA sequences are frequently transferred into the nuclear genome, giving rise to numts (nuclear DNA sequences of mitochondrial origin). So far, the evolutionary history of numts has largely been studied by using single genomes. Here, we present the first attempt to study numt evolution in a comparative manner by using a pairwise genomic alignment. The total number of numts was estimated to be 452 in human and 469 in chimpanzee. Numts that were found in both genomes at identical loci were deemed to be orthologous; 391 numts (>80%) were classified as such. The preponderance of orthologous numts is due to the very short divergence time between the 2 hominoids. The remaining numts were deemed to be nonorthologous. Nonorthologous numts were subdivided into 1) ancestral numts that have lost an ortholog in one species through deletion (12 in human and 11 in chimpanzee), 2) new numts acquired by the insertion of a mitochondrial sequence after the divergence of the 2 species (34 in human and 46 in chimpanzee), and 3) paralogous numts created by the tandem duplication of a preexisting numt (2 in human). This approach also enabled us to reconstruct the numt repertoire in the common ancestor of humans and chimpanzees (409 numts). Our comparative approach is also useful in identifying the exact boundaries of numts.

Mitochondrial DNA sequences are frequently transferred into the nuclear genome, giving rise to numts (nuclear DNA sequences of mitochondrial origin, Lopez et al. 1994). numts have been described in more than 80 species (Bensasson et al. 2001). For most species, the estimate of numt content and abundance is still incomplete. However, with fully sequenced genomes, it is possible to obtain an accurate estimate of numt abundance (Richly and Leister 2004). There is no correlation between the fraction of noncoding DNA and numt abundance (Richly and Leister 2004). The reason for the variation in numt abundance among genomes is not known. Conceptually, the differences might be due to 1) different rates of numt insertion, 2) different rates of numt deletion, and 3) different rates of numt postinsertional duplication.

All mammalian numts studied to date were found to be functionless, and it is thought that they became pseudogenized on arrival into the nucleus because of the differences between the nuclear and mitochondrial genetic codes (Gellissen and Michaelis 1987; Perna and Kocher 1996). In yeast, numts are transferred under natural conditions during the repair of double-strand breaks (Ricchetti et al. 1999), and it was suggested that this is the cause of the ongoing colonization of different genomes by numts. The continuing process of numt integration into the nuclear genome is evidenced by the finding of numts that have been inserted into the human genome after the human–chimpanzee divergence (Ricchetti et al. 2004). Some of these numts are variable with respect to genomic presence or absence, indicating that they have only arisen recently in the human population. Transposition of numts into genes has also been associated with human diseases (Willett-Brozick et al. 2001; Turner et al. 2003; Goldin et al. 2004).

From human genome data, different estimates of the number of numts have been put forward in the literature (Mourier et al. 2001; Tourmen et al. 2002; Woischnik and Moraes 2002; Bensasson et al. 2003; Richly and Leister 2004). Additionally, phylogenetic methods have been suggested for dating the insertion of numts into the nuclear genome (Mourier et al. 2001; Woischnik and Moraes 2002). Initial results indicated a fairly rapid process of numt insertion; however, some studies ignored the possibility of postinsertional nuclear duplication (e.g., Bensasson et al. 2000), resulting in overestimation of numt insertion rates. Hazkani-Covo et al. (2003) suggested a methodology for dating the insertion of numts into the nuclear genome by using a single nuclear genome sequence and a mitochondrial phylogenetic tree. This methodology had the advantage of being able to detect numt duplication events. We discovered that the rate of numt insertion on the branch leading to humans was much lower than previously reported (Mourier et al. 2001; Woischnik and Moraes 2002). Most numts turned out to be paralogs of preexisting numts, rather than new insertions.

Two numts are defined as orthologous if they are derived from a speciation event, but as paralogous if they are derived from a duplication event. So far, the evolutionary history of numts has largely been studied by means of paralogous comparisons within single genomes (Mourier et al. 2001; Woischnik and Moraes 2002; Hazkani-Covo et al. 2003). The availability of closely related completely sequenced genomes has enabled us to use comparative methods to study orthologous numt evolution directly. We note that by using the methodology of Hazkani-Covo et al. (2003), the existence of orthologous numts in species other than humans was inferred indirectly. That inference, however, yielded a testable prediction. Thus, for example, a numt that was inferred to have been inserted in the common ancestor of human and chimpanzee should possess orthologs in both species. However, this prediction could be wrong if the mitochondrial phylogenetic tree is not the true tree. In addition, this methodology is only applicable to long numts that have sufficient phylogenetic signal. With 2 or more genomes, the presence of orthologous numts can be inferred directly, even when the numts are short.

In the following, we suggest a protocol based on genome alignment to estimate the number of numts in closely related species. We apply this approach to the genomes of human (Lander et al. 2001) and chimpanzee (Pan troglodytes; Mikkelsen et al. 2005), and use the alignment to identify evolutionary events that may have affected numt composition in each genome, as well as to reconstruct the numt makeup in the common ancestor of human and chimpanzee.

Because there are no hot spots for numt insertion (Zischler 2000), the presence of a numt at a particular locus in both genomes was taken to imply orthology (fig. 1). Nonorthologous numts that are present in only one genome are further classified into insertions, partial or total deletions, or tandem duplications (fig. 1). Each such event can take place in either lineage. Nonorthologous numts are identified by a gap in the alignment. The distinction between insertions and deletions is based on the fact that there exists no known mechanism for the precise excision of numts. Thus, if the gap coincides precisely with the boundaries of the numt, an insertion is inferred. If the gap is smaller or larger than the numt in the other genome, we infer the occurrence of a partial or total deletion, respectively. Tandem numt duplications are characterized by adjacent homologous numts and a gap coinciding perfectly with the boundaries of the homolog from the other species. The assumptions used for numt classifications here were also used in PCR-based numt recognition (e.g., Lopez et al. 1994; Zischler et al. 1998; Herrnstadt et al. 1999).
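The classification rules above can be illustrated with a short Python sketch. It is not the authors' pipeline: the `Interval` type, the function name, the `adjacent_homolog` flag, and the "unclassified" fallback are hypothetical names introduced here, and the tandem-duplication rule is reduced to a single flag for brevity. Coordinates are assumed to be inclusive and expressed in the genome that carries the numt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    start: int  # inclusive coordinate in the genome that carries the numt
    end: int    # inclusive coordinate


def classify_nonorthologous(numt: Interval, gap: Interval,
                            adjacent_homolog: bool = False) -> str:
    """Classify a numt found in one genome from the alignment gap in the other.

    adjacent_homolog is a simplifying assumption: it is True when a
    homologous numt lies directly next to the one being classified.
    """
    # Gap matches the numt boundaries exactly: no precise excision mechanism
    # is known, so the sequence was gained in the genome that carries it.
    if gap.start == numt.start and gap.end == numt.end:
        return "tandem duplication" if adjacent_homolog else "insertion"
    # Gap lies inside the numt (smaller than the numt): part of it was lost.
    if numt.start <= gap.start and gap.end <= numt.end:
        return "partial deletion"
    # Gap extends beyond the numt (larger than the numt): all of it was lost.
    if gap.start <= numt.start and numt.end <= gap.end:
        return "total deletion"
    # Gap and numt only partly overlap: not covered by the stated rules.
    return "unclassified"
```

For example, a numt at positions 100 to 199 whose gap in the other genome spans 50 to 400 would be read as a total deletion in the other lineage, whereas a gap spanning exactly 100 to 199 would be read as an insertion.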

FIG. 1. (A) numt classification based on genome alignment of homologous loci between human and chimpanzee. (B) Each evolutionary event is positioned on the inferred branch on the phylogenetic tree.

Our analyses were based on genomic sequences and annotations from the University of California at Santa Cruz (UCSC) Genome Center (Karolchik et al. 2004). First, Blast was used to search each of the human and chimpanzee genomes for regions of similarity with conspecific mitochondrial sequences (fig. 2, frame 1). Closely spaced mitochondrial hits were concatenated (fig. 2, frame 2). The distinction between orthologous and nonorthologous numts (as described in fig. 1) was made by comparing the preliminary human and chimpanzee numt datasets. The comparison was based on the UCSC genome alignment between human and chimpanzee. The analysis was performed reciprocally: the human genome was compared with the chimpanzee genome, and the chimpanzee genome with the human genome. For a detailed description of the methodology, see Supplementary Material online.
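The concatenation of closely spaced hits can be sketched as a standard interval merge. This is an illustration only: the function name and the `max_gap` threshold are hypothetical, and the distance actually used is described in the Supplementary Material, not here. Hits are assumed to be (start, end) pairs on the same chromosome.

```python
def merge_hits(hits, max_gap):
    """Concatenate hits whose separation is at most max_gap bases.

    hits: iterable of (start, end) tuples on one chromosome.
    max_gap: hypothetical threshold; the value used in the study is not
    stated in this text.
    """
    merged = []
    for start, end in sorted(hits):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous hit: extend that hit.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # Too far away (or first hit): start a new candidate numt.
            merged.append([start, end])
    return [tuple(m) for m in merged]
```

With a threshold of 100, the hits (100, 200) and (250, 300) would be joined into one candidate numt, while a hit at (5000, 5100) would remain separate.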

FIG. 2. Flowchart of data collection and numt classification in human. Two types of UCSC files were used in the analysis: the nucleotide pairwise alignment file and the alignment net file. The final numt classification is determined after comparison with the chimpanzee genome (for details see Supplementary Methods, Supplementary Material online).

We found a similar number of numts in both genomes: 452 numts in human and 469 numts in chimpanzee (table 1). The total number of numts in the 2 genomes was found to be similar to previous estimates in the literature. Unsurprisingly, because of the short time that has passed since the divergence of the 2 hominoids, 391 numts (87% in human and 84% in chimpanzee) were classified as orthologous, that is, were inserted into the nuclear genome before the divergence between the 2 lineages (table S1 in Supplementary Material online).

Table 1

Numbers and Total Sizes (in Parentheses) of Different numt Types within the Genomes of Human and Chimpanzee

| Species | Orthologous numts | Nonorthologous numts: new insertions | Nonorthologous numts: tandem duplications | Nonorthologous numts: evidence for numt deletions (b) | Ignored numts | Total number of numts |
| --- | --- | --- | --- | --- | --- | --- |
| Human | 391 (395,530–437,048 bp) | 34 (10,536 bp) | 1 (1) (846 bp) | 7 (4) (1,005 bp) | 20 (25,834 bp) | 452 (433,751–475,269 bp) |
| Chimpanzee | 391 (395,530–437,048 bp) (a) | 46 (8,442 bp) | 0 | 11 (1) (2,620 bp) | 21 (11,691 bp) | 469 (418,283–459,801 bp) |

(a) The total size of orthologous numts in chimpanzee was calculated according to human coordinates (see Supplementary Methods, Supplementary Material online). In order not to run into the risk of classifying the same numt twice, the size of partially deleted numts and very small tandem duplications (whose number appears in parentheses) was added to orthologous size. In addition, tandem duplications and evidence for partial deletion are not counted in the total number of numts. Underlined numts were used to estimate the repertoire of the common ancestor.

(b) Human numts are listed as evidence for deletions in the chimpanzee genome; chimpanzee numts are listed as evidence for deletions in the human genome.

We identified 46 previously undescribed postspeciation numts in the chimpanzee, ranging in size from 37 to 3,076 bp. In addition, we identified 34 postspeciation numts in human. Our study thus increases the number of known human-specific numts (Ricchetti et al. 2004) by 26%, and identifies the shortest (29 vs. 47 bp) and the longest (5,219 vs. 1,323 bp) new numts. Human and chimpanzee postspeciation numts that were found in this study are listed in table 2. The common ancestor of human and chimpanzee is estimated to have lived about 6 Myr ago (Goodman et al. 1998). Thus, the average rates of numt insertion are 5.7 insertions per 1 Myr in human and 7.7 insertions per 1 Myr in chimpanzee. The difference is not statistically significant (P < 0.179).
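The rate arithmetic can be checked with a few lines of Python. The text does not name the statistical test behind the reported P value, so the chi-square goodness-of-fit test below (equal expected counts in the 2 lineages, 1 degree of freedom) is an assumption made for illustration; it gives a value of about 0.18. All variable and function names are hypothetical.

```python
import math

DIVERGENCE_MYR = 6.0  # human-chimpanzee divergence time used in the text
new_numts = {"human": 34, "chimpanzee": 46}

# Average insertion rate per lineage, in insertions per Myr.
rates = {species: n / DIVERGENCE_MYR for species, n in new_numts.items()}


def chi_square_equal_counts(a, b):
    """Goodness-of-fit test of two counts against equal expectation (1 df).

    Assumed test, not necessarily the one used in the study.
    """
    expected = (a + b) / 2
    stat = (a - expected) ** 2 / expected + (b - expected) ** 2 / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p


stat, p_value = chi_square_equal_counts(new_numts["human"], new_numts["chimpanzee"])
print(f"human: {rates['human']:.1f}/Myr, chimpanzee: {rates['chimpanzee']:.1f}/Myr")
print(f"chi-square = {stat:.2f}, P = {p_value:.3f}")
```

Only the standard library is used, so the sketch runs without SciPy; the closed-form tail probability holds only for 1 degree of freedom.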

Table 2

Postspeciation (New) numts in the Human and Chimpanzee Genomes. Coordinates of the numts within the Chromosomes and the Mitochondria Are Shown as well as the numt size. Chromosome Names That Contain “Random” Include Unmapped Sequences from the Chromosome

Human

| No. | Chromosome | Numt start | Numt end | Mitochondria start | Mitochondria end | Size (bp) |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 1 | 37505010 | 37505083 | 8935 | 9008 | 74 |
| 2 | 1 | 212729637 | 212729675 | 9564 | 9602 | 39 |
| 3 | 2 | 33967073 | 33967125 | 1768 | 1820 | 53 |
| 4 | 2 | 81868148 | 81868389 | 7863 | 8104 | 242 |
| 5 | 2 | 149850064 | 149850195 | 613 | 744 | 132 |
| 6 | 3 | 25483960 | 25483998 | 10986 | 11024 | 39 |
| 7 | 3 | 68652747 | 68652775 | 12613 | 12641 | 29 |
| 8 | 3 | 97656933 | 97658255 | 1398 | 2720 | 1323 |
| 9 | 4 | 12392801 | 12393142 | 9339 | 9680 | 342 |
| 10 | 4 | 47689831 | 47689923 | 14982 | 15074 | 93 |
| 11 | 4 | 56109869 | 56109999 | 964 | 1094 | 131 |
| 12 | 4 | 79388079 | 79388310 | 2227 | 2458 | 232 |
| 13 | 4 | 163920153 | 163920320 | 12251 | 12418 | 168 |
| 14 | 5 | 73155790 | 73155830 | 10803 | 10843 | 41 |
| 15 | 5 | 134335215 | 134340433 | 10270 | 15488 | 5219 |
| 16 | 5 | 165938322 | 165938361 | 12148 | 12187 | 40 |
| 17 | 7 | 67613632 | 67613737 | 12962 | 13067 | 106 |
| 18 | 7 | 145086167 | 145086262 | 1615 | 1710 | 96 |
| 19 | 8 | 100464681 | 100464764 | 14862 | 14945 | 84 |
| 20 | 11 | 72948014 | 72948176 | 6643 | 6805 | 163 |
| 21 | 11 | 122411966 | 122412037 | 14661 | 14732 | 72 |
| 22 | 12 | 40043704 | 40043792 | 3792 | 3880 | 89 |
| 23 | 13 | 39140488 | 39140558 | 9524 | 9594 | 71 |
| 24 | 13 | 54343769 | 54343891 | 5109 | 5231 | 123 |
| 25 | 13 | 107774473 | 107774728 | 984 | 1239 | 256 |
| 26 | 17 | 42550249 | 42550316 | 10144 | 10211 | 68 |
| 27 | 17 | 51657732 | 51658384 | 6819 | 7471 | 653 |
| 28 | 17 | 79291501 | 79291541 | 6904 | 6944 | 41 |
| 29 | 18 | 2832230 | 2832352 | 14382 | 14504 | 123 |
| 30 | 18 | 43631604 | 43631795 | 7976 | 8167 | 192 |
| 31 | 20 | 9144571 | 9144612 | 2182 | 2223 | 42 |
| 32 | 20 | 13142959 | 13143001 | 13501 | 13543 | 43 |
| 33 | 20 | 56324532 | 56324601 | 12963 | 13032 | 70 |
| 34 | 22 | 34553532 | 34553578 | 6182 | 6228 | 47 |

Chimpanzee

| No. | Chromosome | Numt start | Numt end | Mitochondria start | Mitochondria end | Size (bp) |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 1 | 94875580 | 94875730 | 2368 | 2518 | 151 |
| 2 | 1 | 167557915 | 167557996 | 14438 | 14519 | 82 |
| 3 | 1 | 178753526 | 178753594 | 309 | 377 | 69 |
| 4 | 1 | 212351158 | 212351245 | 8419 | 8506 | 88 |
| 5 | 2 | 53678271 | 53678368 | 3042 | 3139 | 98 |
| 6 | 2 | 82799067 | 82799512 | 15690 | 16132 | 446 |
| 7 | 2 | 146514792 | 146514841 | 6710 | 6759 | 50 |
| 8 | 2 | 168762978 | 168763014 | 15223 | 15259 | 37 |
| 9 | 2 | 193113054 | 193113149 | 8920 | 9015 | 96 |
| 10 | 2 | 198849924 | 198849997 | 1753 | 1826 | 74 |
| 11 | 2_random | 50499996 | 50500065 | 11120 | 11189 | 70 |
| 12 | 3 | 113480884 | 113483978 | 7168 | 14547 | 3095 |
| 13 | 3 | 186259921 | 186259958 | 13880 | 13917 | 38 |
| 14 | 4 | 88229437 | 88229473 | 13473 | 13509 | 37 |
| 15 | 4_random | 36403746 | 36403863 | 2142 | 2259 | 118 |
| 16 | 5 | 28122132 | 28122213 | 1268 | 1349 | 82 |
| 17 | 5 | 68511847 | 68511970 | 14670 | 14793 | 124 |
| 18 | 6_random | 21154919 | 21154985 | 15098 | 15164 | 67 |
| 19 | 6_random | 28204787 | 28204866 | 2175 | 2254 | 80 |
| 20 | 7 | 126455472 | 126455640 | 8260 | 8428 | 169 |
| 21 | 7 | 137974486 | 137974549 | 15611 | 15674 | 64 |
| 22 | 8 | 138224221 | 138224364 | 10786 | 10929 | 144 |
| 23 | 9 | 42240545 | 42240626 | 7950 | 8031 | 82 |
| 24 | 9 | 93119352 | 93119472 | 1828 | 1948 | 121 |
| 25 | 10 | 72885336 | 72885405 | 10088 | 10157 | 70 |
| 26 | 10 | 104435941 | 104436168 | 3540 | 3767 | 228 |
| 27 | 10_random | 22373731 | 22374046 | 6709 | 7020 | 316 |
| 28 | 12 | 103361584 | 103361668 | 14582 | 14666 | 85 |
| 29 | 12_random | 30969217 | 30969432 | 7824 | 8039 | 216 |
| 30 | 13 | 15766920 | 15767029 | 7433 | 7542 | 110 |
| 31 | 13 | 17844524 | 17844652 | 10950 | 11078 | 129 |
| 32 | 13 | 103738537 | 103738696 | 9982 | 10141 | 160 |
| 33 | 13 | 109897352 | 109897539 | 12157 | 12344 | 188 |
| 34 | 13_random | 14479481 | 14479533 | 13246 | 13298 | 53 |
| 35 | 14 | 18106476 | 18106545 | 11189 | 11258 | 70 |
| 36 | 14 | 95292747 | 95292931 | 3795 | 3979 | 185 |
| 37 | 15 | 23943932 | 23944001 | 1328 | 1397 | 70 |
| 38 | 15 | 42692145 | 42692448 | 4778 | 5081 | 304 |
| 39 | 16 | 62752970 | 62753052 | 14596 | 14678 | 83 |
| 40 | 17 | 65338553 | 65338608 | 15361 | 15416 | 56 |
| 41 | 18 | 14965901 | 14966085 | 14448 | 6415 | 185 |
| 42 | 18_random | 29158839 | 29158872 | 11943 | 11976 | 34 |
| 43 | 18_random | 32760196 | 32760238 | 3393 | 3435 | 43 |
| 44 | 19_random | 23568693 | 23568762 | 16025 | 16094 | 70 |
| 45 | 20 | 15505542 | 15505603 | 9864 | 9925 | 62 |
| 46 | 23 | 32703341 | 32703583 | 14372 | 14614 | 243 |

From among the postspeciation numts, only 2 cases of tandem duplication were found (both in the human genome). In the first case, an internal segment of 30 bp within a numt located on chromosome 10 was duplicated once. The second case, in chromosome 12, includes 18 tandem duplications of a 47-bp sequence (fig. 3).

FIG. 3. Multiple sequence alignment of 18 tandemly repeated numts in human chromosome 12 (positions 125,420,954–125,422,037) and the homologous locus on chimpanzee chromosome 10. The alignment to human and chimpanzee mitochondria is also shown. Each repeat is 47 bp in length and aligns to mitochondrial coordinates 4418–4464 (box). The flanking regions of the human internally repeated numt align to human mitochondrial coordinates 4478–4382 and can also be aligned to a single chimpanzee numt. Duplications (Dup_) are numbered in order of their appearance from 5′ to 3′. Identical nucleotides in the alignment columns are indicated by a dot; dashes indicate gaps. Hs, Homo sapiens; Pt, Pan troglodytes.

The number of events in which numts were deleted from the genome is fairly similar between the 2 species. There are 12 deletion events in human, of which 11 are total deletions and 1 is a partial one. In chimpanzee, there are 11 deletion events, of which 7 are total deletions and 4 are partial. As far as the total deletions are concerned, one can distinguish between 2 separate groups: most of the numts seem to have been deleted from the genome as part of a much larger segment. However, in a few cases, the numt deletion included only a limited flanking region.

The number of nonorthologous numts is not large enough to detect differences in numt evolutionary dynamics (insertion, deletion, or tandem duplication) between the 2 lineages. Still, we are now able to reconstruct the numt constitution in the common ancestor of the 2 hominoids. The number of numts in the common ancestor of human and chimpanzee is estimated at 409. This number includes 391 numts that are still found in the 2 genomes, and a total of 18 numts that were lost from 1 of the 2 genomes. Given the very low rate of numt deletion, the possibility that a numt has been lost in both genomes seems negligible.
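The ancestral estimate can be reproduced as simple bookkeeping. The split of the 18 lost numts into 11 total deletions in human and 7 in chimpanzee is our reading of the deletion counts reported above, and the present-day check that adds new insertions and ignored numts from table 1 is an illustrative consistency test, not a step described by the authors. Variable names are hypothetical.

```python
orthologous = 391  # numts still present at the same locus in both genomes

# Total deletions per lineage (partial deletions leave a detectable remnant).
totally_deleted = {"human": 11, "chimpanzee": 7}
new_insertions = {"human": 34, "chimpanzee": 46}
ignored = {"human": 20, "chimpanzee": 21}

# Ancestral repertoire: shared numts plus those lost entirely from one lineage.
ancestral = orthologous + sum(totally_deleted.values())

# Consistency check against the present-day totals in table 1.
present_day = {
    species: ancestral - totally_deleted[species]
    + new_insertions[species] + ignored[species]
    for species in totally_deleted
}
print(ancestral, present_day)
```

Under these assumptions the bookkeeping returns the totals of 452 and 469 numts reported for human and chimpanzee.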

We suggest that in comparison to single genome analyses, our methodology resulted not only in a more accurate estimate of the number of numts but also in a more precise identification of their boundaries. First, this protocol distinguishes between orthologous and nonorthologous numts. Second, by using genome alignment, we identified orthologous numts that escaped detection by the usual Blasting of mitochondrial sequences against the nuclear genome. In 145 out of 391 cases, numts were identified in only one of the genomes when the Blast analysis was used. However, in the majority of cases, alignment of those numts to the corresponding fragment in the second genome revealed a cryptic or quasi-cryptic ortholog. In 15 cases, the existence of orthologous numts in chimpanzee was inferred on the basis of a small stretch of Ns similar in size to the human numt in the homologous position. Finally, our protocol enables a more precise identification of the genomic coordinates of numts. The comparative method allows concatenation of fragments that may otherwise be identified as independent numts.

We thank Shay Covo and Tal Dagan for their help. This work was supported in part by a grant (DBI-0543342) from the National Science Foundation.

References

Bensasson D, Feldman MW, Petrov DA. Rates of DNA duplication and mitochondrial DNA insertion in the human genome. J Mol Evol, 2003, vol. 57 (pg. 343-354)
Bensasson D, Zhang D, Hartl DL, Hewitt GM. Mitochondrial pseudogenes: evolution's misplaced witnesses. Trends Ecol Evol, 2001, vol. 16 (pg. 314-321)
Bensasson D, Zhang DX, Hewitt GM. Frequent assimilation of mitochondrial DNA by grasshopper nuclear genomes. Mol Biol Evol, 2000, vol. 17 (pg. 406-415)
Gellissen G, Michaelis G. Gene transfer. Mitochondria to nucleus. Ann N Y Acad Sci, 1987, vol. 503 (pg. 391-401)
Goldin E, Stahl S, Cooney AM, Kaneski CR, Gupta S, Brady RO, Ellis JR, Schiffmann R. Transfer of a mitochondrial DNA fragment to MCOLN1 causes an inherited case of mucolipidosis IV. Hum Mutat, 2004, vol. 24 (pg. 460-465)
Goodman M, Porter CA, Czelusniak J, Page SL, Schneider H, Shoshani J, Gunnell G, Groves CP. Toward a phylogenetic classification of primates based on DNA evidence complemented by fossil evidence. Mol Phylogenet Evol, 1998, vol. 9 (pg. 585-598)
Hazkani-Covo E, Sorek R, Graur D. Evolutionary dynamics of large numts in the human genome: rarity of independent insertions and abundance of post-insertion duplications. J Mol Evol, 2003, vol. 56 (pg. 169-174)
Herrnstadt C, Clevenger W, Ghosh SS, Anderson C, Fahy E, Miller S, Howell N, Davis RE. A novel mitochondrial DNA-like sequence in the human nuclear genome. Genomics, 1999, vol. 60 (pg. 67-77)
Karolchik D, Hinrichs AS, Furey TS, Roskin KM, Sugnet CW, Haussler D, Kent WJ. The UCSC Table Browser data retrieval tool. Nucleic Acids Res, 2004, vol. 32 (pg. 493-496)
Lander ES, Linton LM, Birren B, et al. (256 co-authors). Initial sequencing and analysis of the human genome. Nature, 2001, vol. 409 (pg. 860-921)
Lopez JV, Yuhki N, Masuda R, Modi W, O'Brien SJ. Numt, a recent transfer and tandem amplification of mitochondrial DNA to the nuclear genome of the domestic cat. J Mol Evol, 1994, vol. 39 (pg. 174-190)
Mikkelsen TS, Hillier LW, Eichler EE, et al. (67 co-authors). Initial sequence of the chimpanzee genome and comparison with the human genome. Nature, 2005, vol. 437 (pg. 69-87)
Mourier T, Hansen AJ, Willerslev E, Arctander P. The human genome project reveals a continuous transfer of large mitochondrial fragments to the nucleus. Mol Biol Evol, 2001, vol. 18 (pg. 1833-1837)
Perna NT, Kocher TD. Mitochondrial DNA: molecular fossils in the nucleus. Curr Biol, 1996, vol. 6 (pg. 128-129)
Ricchetti M, Fairhead C, Dujon B. Mitochondrial DNA repairs double-strand breaks in yeast chromosomes. Nature, 1999, vol. 402 (pg. 96-100)
Ricchetti M, Tekaia F, Dujon B. Continued colonization of the human genome by mitochondrial DNA. PLoS Biol, 2004, vol. 2 (pg. E273)
Richly E, Leister D. NUMTs in sequenced eukaryotic genomes. Mol Biol Evol, 2004, vol. 21 (pg. 1081-1084)
Tourmen Y, Baris O, Dessen P, Jacques C, Malthiery Y, Reynier P. Structure and chromosomal distribution of human mitochondrial pseudogenes. Genomics, 2002, vol. 80 (pg. 71-77)
Turner C, Killoran C, Thomas NS, Rosenberg M, Chuzhanova NA, Johnston J, Kemel Y, Cooper DN, Biesecker LG. Human genetic disease caused by de novo mitochondrial-nuclear DNA transfer. Hum Genet, 2003, vol. 112 (pg. 303-309)
Willett-Brozick JE, Savul SA, Richey LE, Baysal BE. Germ line insertion of mtDNA at the breakpoint junction of a reciprocal constitutional translocation. Hum Genet, 2001, vol. 109 (pg. 216-223)
Woischnik M, Moraes CT. Pattern of organization of human mitochondrial pseudogenes in the nuclear genome. Genome Res, 2002, vol. 12 (pg. 885-893)
Zischler H. Nuclear integrations of mitochondrial DNA in primates: inference of associated mutational events. Electrophoresis, 2000, vol. 21 (pg. 531-536)
Zischler H, Geisert H, Castresana J. A hominoid-specific nuclear insertion of the mitochondrial D-loop: implications for reconstructing ancestral mitochondrial sequences. Mol Biol Evol, 1998, vol. 15 (pg. 463-469)

Author notes

1 Present address: National Evolutionary Synthesis Center, Durham, North Carolina, USA

William Martin, Associate Editor
