The Complete Chloroplast and Mitochondrial DNA Sequence of Ostreococcus tauri: Organelle Genomes of the Smallest Eukaryote Are Examples of Compaction

The complete nucleotide sequence of the mt (mitochondrial) and cp (chloroplast) genomes of the unicellular green alga Ostreococcus tauri has been determined. The mt genome assembles as a circle of 44,237 bp and contains 65 genes. With an overall average length of only 42 bp for the intergenic regions, this is the most gene-dense mt genome of all Chlorophyta. Furthermore, it is characterized by a unique segmental duplication, encompassing 22 genes and covering 44% of the genome. Such a duplication has not been observed before in green algae, although it is also present in the mt genomes of higher plants. The quadripartite cp genome forms a circle of 71,666 bp, containing 86 genes divided over a larger and a smaller single-copy region, separated by 2 inverted repeat sequences. Based on genome size and number of genes, the Ostreococcus cp genome is the smallest known among the green algae. Phylogenetic analyses based on a concatenated alignment of cp, mt, and nuclear genes confirm the position of O. tauri within the Prasinophyceae, an early branch of the Chlorophyta.


Introduction
The so-called green lineage (Viridiplantae) is divided into 2 major divisions, namely, Streptophyta and Chlorophyta. Streptophyta contain all known land plants and their immediate ancestors, a group of algae known as "charophyte green algae" (e.g., Chaetosphaeridium globosum), whereas Chlorophyta contain the other green algae (e.g., Chlamydomonas reinhardtii) that form a monophyletic assemblage and are a sister group to the Streptophyta (Graham and Wilcox 2000). So far, only 25 complete mt (mitochondrial) genomes have been sequenced for representatives of the green lineage, 17 from Streptophyta and 8 from Chlorophyta. Regarding plastid genomes, 68 genome sequences are available in public databases, of which 60 are from Streptophyta and 8 from Chlorophyta.
The mt genomes of chlorophytes are usually small (25-90 kb), whereas in general a bigger genome size is observed for the streptophytes (from 68 kb for Chara vulgaris to around 400 kb for higher plants). The great majority of these genomes are circular, except for some species of Chlamydomonales that have a linear genome (Vahrenholz et al. 1993). The increase of the genome size observed within Streptophyta does not necessarily reflect an increase in coding capacity. Indeed, the transfer of mt genes to the nucleus over evolutionary time (Brennicke et al. 1993), the enlargement and incorporation of new sequences within the mt intergenic spacers, the loss of genes, the increase of intron size, and the resulting decrease of the coding density are all characteristic of the mt genomes of higher land plants. In angiosperms, the most striking feature is the presence of a multipartite genome structure, which results in high-frequency recombination via repeated sequences in the genome (Fauron et al. 1995), altering the genome copy number, which can result in different phenotypes (Kanazawa et al. 1994; Janska et al. 1998).
All cp (chloroplast) genomes that have been described for land plants have a very conserved genome size, usually around 150 kb covering about 70-80 genes. In contrast, the cp genomes of green algae, although having a rather similar genome size between 150 and 200 kb, show a tremendous variation in gene content, due to massive gene loss, genome erosion, and gene transfer to the nucleus (Grzebyk and Schofield 2003). All cp genomes described so far are circular. Previous studies have shown that, although in green algae (e.g., C. reinhardtii) more genes have been transferred to the nucleus compared with land plants (e.g., tobacco), the rate of gene flow has subsequently slowed down dramatically and the transfer of DNA from cp to the nucleus is now very rare (Lister et al. 2003). However, until very recently (Derelle et al. 2006; this study), there was no chlorophyte for which the nuclear, cp, and mt genomes had all been published, and it therefore remained difficult to quantify precisely the extent of gene transfer from the organelles to the nucleus.
Ostreococcus tauri is a unicellular green alga that was discovered in the Mediterranean Thau lagoon (France) in 1994. With a size less than 1 µm, comparable to that of a bacterium, it is the smallest eukaryotic organism currently described (Courties et al. 1994). Its cellular organization is rather simple with a relatively large nucleus with only 1 nuclear pore, a single chloroplast, 1 mitochondrion, 1 Golgi body, and a highly reduced cytoplasm compartment (Chrétiennot-Dinet et al. 1995). A membrane surrounds the cells, but no cell wall can be observed. Apart from this simple cellular structure, the O. tauri nuclear genome is small (12.56 Mb) and is fragmented into 20 chromosomes (Derelle et al. 2006). Phylogenetically, O. tauri belongs to the Prasinophyceae, an early branch of the Chlorophyta (Courties et al. 1998). The presence of only 1 chloroplast and 1 mitochondrion and its basal position in the green lineage make this alga interesting for studying the structure and evolution of both genomes, whereas comparison with other members of the green lineage sheds light on the evolution of organelle genomes.

Materials and Methods

Sequencing
For the sequencing of the nuclear genome, cellular DNA was used for the preparation of the shotgun libraries (Derelle et al. 2006). Consequently, mt and cp sequences were also obtained and identified by their high similarity with genes of other green algae or green plants. Purified DNA was broken by sonication, and after filling ends, DNA fragments ranging from 1 kb to 5 kb were separated in an agarose gel. Blunt-end fragments were inserted into pBluescript II KS (Stratagene, The Netherlands), digested with EcoRV, and dephosphorylated. Plasmid DNA from recombinant Escherichia coli strains was extracted according to the TempliPhi method (Amersham, GE Healthcare, France), and inserts were sequenced on both strands using universal forward and reverse M13 primers and the ET DYEnamic terminator kit (Amersham). Sequences were obtained with MegaBace 1000 automated sequencers (Amersham). Data were analyzed and contigs were assembled using Phred-Phrap (Ewing et al. 1998) and Consed software packages (http://bozeman.mbt.washington.edu/consed/consed.html). Gaps were filled through primer-directed sequencing using custom-made primers.

Gene Prediction and Annotation
All genes were annotated based on their similarity with cp and mt genes that were available in public databases and, if necessary, manually corrected using Artemis (Rutherford et al. 2000). Homologous relationships between publicly available genes and the O. tauri genes were identified through Blast (Altschul et al. 1990). Small and large ribosomal subunit RNA genes were also identified by Blast. Alignment and secondary structure annotation were done using the DCSE alignment editor (De Rijk and De Wachter 1993). The secondary structure drawings were made using RnaViz (De Rijk et al. 2003). tRNA genes were identified by tRNAscan-SE (Lowe and Eddy 1997) using the option "search for organellar tRNAs (-O)". The 5S rRNA gene of the cp genome was identified using the CMSEARCH program from the INFERNAL package (Eddy 2002) with the 5S rRNA covariance model (RF00001) from the RFAM database (Griffiths-Jones et al. 2005).

Sequence Analyses
Pairwise comparison of gene permutations by inversions between different mt and cp genomes was obtained using the GRIMM web server (Tesler 2002). The data sets used contained, respectively, 54 conserved mt and 82 conserved cp genes. As this tool cannot deal with duplicated genes, genes located in the inverted repeats (IRs) were counted only once.
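To illustrate what an inversion-based comparison of gene orders measures, the following Python sketch counts breakpoints between two signed gene orders. It is not the algorithm implemented in GRIMM, which computes exact inversion distances; the breakpoint count only yields a lower bound, because a single inversion can remove at most 2 breakpoints. The function names and the toy gene orders are hypothetical, and genomes are treated as linear for simplicity.

```python
def _frame(order):
    """Add artificial start and end markers to a signed gene order."""
    return [("START", 1)] + list(order) + [("END", 1)]


def _adjacencies(order):
    """Return all adjacencies of a signed gene order, readable from either strand."""
    framed = _frame(order)
    adj = set()
    for left, right in zip(framed, framed[1:]):
        adj.add((left, right))
        # The same adjacency as seen when reading the opposite strand.
        adj.add(((right[0], -right[1]), (left[0], -left[1])))
    return adj


def count_breakpoints(order_a, order_b):
    """Count adjacencies of order_a that are not preserved in order_b.

    Each gene is a (name, strand) tuple with strand +1 or -1.
    Both orders must contain the same genes, each exactly once.
    """
    adj_b = _adjacencies(order_b)
    framed_a = _frame(order_a)
    return sum(1 for pair in zip(framed_a, framed_a[1:]) if pair not in adj_b)


def inversion_lower_bound(order_a, order_b):
    """Lower bound on the number of inversions: ceil(breakpoints / 2)."""
    return (count_breakpoints(order_a, order_b) + 1) // 2


# Toy example (hypothetical data): genome_b is genome_a with the
# segment B-C inverted, so one inversion is sufficient.
genome_a = [("A", 1), ("B", 1), ("C", 1), ("D", 1)]
genome_b = [("A", 1), ("C", -1), ("B", -1), ("D", 1)]
print(count_breakpoints(genome_a, genome_b), inversion_lower_bound(genome_a, genome_b))
```

Because duplicated genes would make the adjacencies ambiguous, this sketch shares the restriction mentioned above: each gene may occur only once per genome.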
Duplicated sequences within both genomes were identified using DOTTER (Sonnhammer and Durbin 1995). For both genomes (but including only one of the IR sequences), short repeated sequences were identified with REPUTER 3.1 (Kurtz et al. 2001), using the -p (palindromic), -f (forward), -l (minimum length), and -allmax parameters; and MUMMER 3.0 (Kurtz et al. 2004), using the -l (minimum length) and -b (forward and reverse complement matches) options. PIPMAKER (Schwartz et al. 2000) was used to visualize the location of the repeated sequences.
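The idea behind searching for forward and palindromic (reverse-complement) repeats above a minimum length can be sketched in a few lines of Python. This is a naive k-mer index, not the suffix-tree or suffix-array methods used by REPUTER and MUMMER, and it reports fixed-length seeds rather than maximal repeats. The function names and the toy sequence are hypothetical.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_seed_repeats(seq, min_len=15):
    """Find exact repeated words of length min_len.

    Returns two dicts:
      forward      maps a word to the list of its start positions (>= 2 hits)
      palindromic  maps a word to (its positions, positions of its reverse complement)
    Words that are their own reverse complement are not reported as palindromic.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - min_len + 1):
        positions[seq[i:i + min_len]].append(i)

    forward = {word: pos for word, pos in positions.items() if len(pos) > 1}

    palindromic = {}
    for word, pos in positions.items():
        rc = reverse_complement(word)
        # "word < rc" reports each word/reverse-complement pair only once.
        if rc in positions and word < rc:
            palindromic[word] = (pos, positions[rc])
    return forward, palindromic


# Toy sequence (hypothetical): GATTACA occurs twice in forward orientation
# and once as its reverse complement TGTAATC.
toy = "GATTACACCGATTACATTTGTAATC"
forward, palindromic = find_seed_repeats(toy, min_len=7)
print(forward)
print(palindromic)
```

For genome-scale input, dedicated tools remain preferable, since they extend seeds to maximal repeats and handle both strands efficiently.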
After manual improvement of the alignments using BIOEDIT (Hall 1999), only unambiguously aligned positions were taken into account for tree construction. TREEVIEW was used to visualize the trees (Page 1996).
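Restricting an alignment to unambiguously aligned positions can be approximated programmatically. The Python sketch below keeps only those columns in which every sequence carries an unambiguous residue; it assumes that "unambiguous" means free of gaps and ambiguity codes, which is a simplification and not necessarily the criterion applied during manual curation. The function names and the toy alignment are hypothetical.

```python
def unambiguous_columns(alignment, allowed="ACGT"):
    """Return indices of columns where every sequence has an allowed residue.

    alignment maps sequence names to aligned strings of equal length.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    allowed_set = set(allowed)
    rows = [seq.upper() for seq in alignment.values()]
    return [i for i, column in enumerate(zip(*rows)) if set(column) <= allowed_set]


def filter_alignment(alignment, allowed="ACGT"):
    """Keep only the unambiguous columns of an alignment."""
    keep = unambiguous_columns(alignment, allowed)
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Toy alignment (hypothetical): column 2 contains an ambiguity code (N)
# and column 3 contains a gap, so both are discarded.
toy_alignment = {"taxon1": "ACG-T", "taxon2": "ACNTT", "taxon3": "ACGTT"}
print(unambiguous_columns(toy_alignment))
print(filter_alignment(toy_alignment))
```

For protein alignments, the `allowed` argument would be set to the 20 amino acid letters instead of the 4 nucleotides.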

Phylogenetic Analyses
Previous phylogenetic analyses based on the 18S rDNA sequence of different Chlorophyta suggested that O. tauri belongs to the Prasinophyceae, an early diverging group within the green plant lineage (Courties et al. 1998). Now, with the availability of the cp, mt, and even nuclear (Derelle et al. 2006) genomes of O. tauri, a more extensive phylogenetic analysis could be performed. To this end, we prepared 2 different data sets, that is, 1 consisting of concatenated cp genes and 1 consisting of a mix of concatenated cp, mt, and nuclear genes (see Materials and Methods). Using these different data sets and different methods of phylogenetic tree construction, O. tauri always clustered with other members of the Chlorophyta, clearly confirming its Chlorophycean heritage (fig. 1; see supplementary fig. S1, Supplementary Material online). Furthermore, the different classes within the Chlorophyta, namely, Chlorophyceae, Trebouxiophyceae, Ulvophyceae, and Prasinophyceae, formed monophyletic groups, well supported by bootstrap analyses. O. tauri and N. olivacea were always grouped together within an early diverging group referred to as the Prasinophyceae, thereby confirming the previous analyses done by Courties et al. (1998).
In our phylogenetic analyses, we have also included the unicellular freshwater alga M. viride, whose phylogenetic position is still being discussed (previously referred to as the "enigma of Mesostigma" [McCourt et al. 2004]). After M. viride had been classified as a primitive chlorophyte (Mattox and Stewart 1984; Grzebyk and Schofield 2003; Nozaki et al. 2003), a charophyte (Melkonian 1989; Bhattacharya et al. 1998; Karol et al. 2001; Martin et al. 2002), or as a species branching off prior to the divergence of the Streptophyta and Chlorophyta (Turmel et al. 2002b), Petersen et al. (2006) provided unequivocal support for its Streptophycean affiliation, based on the presence of a land plant-specific gapB gene and the absence of this gene in the different orders of chlorophyte green algae. However, our tree based on concatenated cp genes (fig. 1a) clearly supports M. viride branching off before the divergence of Chlorophyta and Streptophyta. On the other hand, trees based on a combination of concatenated plastid, mt, and nuclear genes did group M. viride with the other streptophytes (fig. 1b; see supplementary fig. S1, Supplementary Material online). In addition, we recently showed (Robbens et al. 2007) that 2 Ostreococcus strains (O. tauri and Ostreococcus lucimarinus) also contain the gapB gene (DQ649078 and DQ649079), making this gene no longer land plant specific as postulated by Petersen et al. (2006). As a matter of fact, all this adds some more mystery to the phylogenetic position of M. viride.

Structure and Gene Content of the mt Genome
The O. tauri mt genome assembles as a circle of 44,237 bp (fig. 2), with an overall GC content of 38%. This size is similar to that of the mt genome of another early branching chlorophyte, N. olivacea (45,223 bp) (Turmel et al. 1999b). However, in contrast to the N. olivacea genome, the O. tauri mt sequence contains a duplicated region, containing 22 genes and covering 44% (19,542 bp) of the genome (see further). Sixty-five genes (unique open reading frames [ORFs] were not taken into account, and duplicated genes were counted only once) are encoded on both strands, encompassing 93% of the genome, which makes the mt genome of O. tauri the most gene dense among the Chlorophyta. For comparison, both M. viride (Turmel et al. 2002b) and N. olivacea also have 65 genes, but these cover only 87% and 81% of their genomes, respectively (table 1). Among the 65 genes, 36 are protein-encoding genes, 26 are transfer RNAs, and 3 are rRNAs (see supplementary table S1, Supplementary Material online). Two predicted proteins (orf129 and orf153) coding for 129 and 153 amino acids, respectively, did not show any clear similarity to other known genes. The compactness of the O. tauri mt genome is further illustrated by the shortness of the intergenic regions, ranging from 1 to 475 bp, with an average of 42 bp. Only 5 intergenic regions exceed 100 bp, and these are all located in the duplicated region. In addition, there are 3 cases of overlapping genes (trnR1-rnpB, rps14-rpl5, and orf153-trnH).
FIG. 2.-Ostreococcus tauri mt genome. Genes located in the unique duplicated region are colored in gray; single-copy genes are in black. The length of the boxes is proportional to their amino acid length. tRNA genes are represented by the 1-letter amino acid code, and the unique ORFs are indicated by orf followed by their amino acid length.
Lastly, in contrast to other members of the green lineage, neither group I nor group II type introns are present in any of the genes. All 26 tRNAs fold into the conventional cloverleaf secondary structure and are able to decode all codons. The small subunit rRNA (SSU rRNA, rns in fig. 2) gene is fragmented into 2 parts, but retains its ability to fold into the normal secondary structure model (see supplementary fig. S2, Supplementary Material online). The fragmentation site is located near the hairpin loop of helix 29 (indicated by gray area) of the secondary structure model (Wuyts et al. 2004), and the location of the 2 fragments has been rearranged in the genome such that both fragments are located on the forward strand but their order is reversed. However, this fragmentation site does not correspond to one of the several fragmentation sites that have been previously identified in the small subunit rRNA genes of chlorophyte mt genomes (Nedelcu et al. 2000).
The most striking feature of the O. tauri mt genome is the presence of a large duplicated segment (19,542 bp; shaded box in fig. 2). This duplication is also observed in the partially sequenced mt genome of another Ostreococcus strain (O. lucimarinus; Palenik B, personal communication), thereby excluding erroneous genome assembly. The presence of such a duplicated sequence has not been observed in any other member of the Chlorophyta, except for C. reinhardtii, in whose mt genome, which is linear instead of circular, terminal IRs of approximately 500 bp have been described (Vahrenholz et al. 1993; table 1). No duplication is present in the mt genome of the charophyte Chara vulgaris (Turmel et al. 2003). The only large repeated sequences previously reported are present in higher land plants (e.g., Arabidopsis thaliana: 366,924 bp, containing repeat sequences of 6.5 and 4.5 kb, and Beta vulgaris: 368,799 bp, containing a repeat sequence of 6.2 kb) (Unseld et al. 1997; Kubo et al. 2000).
These repeated regions in the mt genome of angiosperms give rise to a multipartite genome structure (Fauron et al. 1995) and lead to high-frequency intramolecular recombination. Indeed, a master circle, containing the complete genetic information, can lead to different subgenomic circles by homologous recombination via a repeated sequence motif (e.g., the tobacco mt genome can provide 6 different subgenomic circles by homologous recombination between the different repeated sequences) (Knoop 2004; Sugiyama et al. 2005). The presence of this multipartite genome structure enables angiosperms to change their gene and genome copy number, resulting in an altered plant phenotype (Kanazawa et al. 1994; Janska et al. 1998).
Short dispersed repeats (SDRs) are also thought to play an important role in mt genome rearrangements, thereby altering the gene content and genome size. This is true not only for members of the Chlorophyta, but also for land plants, yeasts, and even animals, where they serve as hot spots for recombination (Pombert et al. 2004). SDRs have been described in all known members of the Chlorophyta, although their abundance is highly variable. All Chlorophyta members hold SDRs of at least 15 bp in their genome, but their number is as low as 52 in N. olivacea (Pombert et al. 2006b) and only 11 in O. tauri. The largest repeats found in O. tauri and N. olivacea are rather short, being 34 bp and 42 bp, respectively. The GC content of the SDRs present in O. tauri does not differ much from the overall GC content of the mt genome (36% for the SDRs vs. 38% for the complete genome). In general, more derived lineages show an increase in the number of SDRs: O. viridis contains 1,206 (Pombert et al. 2006b), Scenedesmus obliquus 4,086 (Nedelcu et al. 2000), and P. akinetum 8,002 (Pombert et al. 2004) SDRs of at least 15 bp. It seems that after the split of the Prasinophyceae, an increase of SDRs took place (with the exception of the Chlamydomonadales), and it is tempting to correlate this increase with the gene rearrangements that took place within the other members of the Chlorophyta (see further).

Comparison with Other mt Genomes
Comparison of the O. tauri mt genome with 9 other species of the Viridiplantae lineage (Cr: C. reinhardtii, No: N. olivacea, Ov: O. viridis, Pa: P. akinetum, So: S. obliquus, At: A. thaliana, Mp: M. polymorpha, Cg: C. globosum, and Mv: M. viride) unveiled only 9 genes (not including tRNAs) that are common to all these species (table 2). However, when removing C. reinhardtii (Michaelis et al. 1990) and S. obliquus, 2 members of the Chlorophyceae, from this comparison, this number increases to 25 shared genes. When further removing the 2 ulvophyte green algae (O. viridis and P. akinetum), the number of conserved genes increases to 30, thus representing the gene content conservation between the 2 prasinophytes and the land plants. However, when only considering the protein-coding genes of O. tauri, N. olivacea, and M. viride, 36 genes are shared, which represents 95% of the O. tauri and 92% of the M. viride protein-coding gene content. Apparently, the gene content conservation between these genomes, which are assumed to represent a more ancestral state, is still very high. One of the 7 protein-coding genes that are absent in the O. tauri mt genome, namely rpl2, could be uncovered in the nuclear genome (see supplementary  (table 2). Furthermore, there is a high degree of synteny between these 2 algae, with 5 gene clusters of at least 5 genes and 1 of 2 genes, which are almost identical in both mt genomes (genes denoted in black in fig. 3). However, when one considers gene polarities, synteny is limited to only 2 gene clusters (12 genes extending from rps11 to rps10 and 5 genes extending from atp6 to cox3). The major difference between both mt genomes is the duplication in O. tauri and the presence of 4 group I introns in N. olivacea (3 within the rnl and 1 in the cob gene) (Turmel et al. 1999b).
A certain degree of synteny can still be detected when adding C. globosum (charophyte) (Turmel et al. 2002a) and Marchantia polymorpha (streptophyte) (Oda et al. 1992) to the 2 previous species (genes in black and gray in fig. 3), except for cluster 3, where no clear synteny could be detected among the 4 organisms, and for cluster 2 (fig. 3), where the genes of M. polymorpha are divided into 2 parts ([atp6, nad6, and trnN] and [cox2 and cox3]). In contrast, synteny conservation still exists in cluster 5, where 9 genes are present in the 4 organisms, all oriented in the same direction, indicating that although the genome size increases from green algae to higher land plants, the gene organization of some clusters is extremely well conserved in evolution.
Additionally, we estimated the number of gene inversions needed to transform the gene organization of one genome into another, thereby providing quantitative measurement of their evolutionary distances. Fifty-four conserved genes (duplicated genes were used only once) of 3 Chlorophyta (O. tauri, N. olivacea, and P. akinetum) and M. viride were used, showing that a minimum of 29 inversions are needed to transform the gene organization of O. tauri into that of N. olivacea. When comparing O. tauri with the other mt genomes, almost twice as many inversions are needed (50 for both P. akinetum and M. viride), again indicating the close relationship between the 2 Prasinophyceae.

Comparison with Other cp Genomes
The gene repertoires of the cp genomes of 7 Chlorophyta (Cr: C. reinhardtii, Cv: C. vulgaris, No: N. olivacea, Ov: O. viridis, Pa: P. akinetum, So: S. obliquus, and Ot: O. tauri), 2 Streptophyta (At: A. thaliana and Nt: N. tabacum), and M. viride (Mv) were compared, and the results are shown in table 4. Fifty-three core genes are shared between both Chlorophyta and Streptophyta (bold gene names), whereas 4 additional core genes (ycf12, tufA, rpl5, and rps9) are present when only considering the Chlorophyta lineage. The 53 core cp genes are involved either in photosynthesis, energy metabolism, or some housekeeping functions. Gene loss and gene transfer to the nucleus are common features of cp genomes (Stegemann et al. 2003), and Grzebyk and Schofield (2003) reported the loss of 7 genes (rpl21, rpl22, rpl33, rps15, rps16, odpB, and ndhJ) at the base of the Chlorophyta lineage. These genes were also not detected in the O. tauri cp genome, but 5 of them are present in the nuclear genome (see supplementary table S3, Supplementary Material online). In O. tauri, 34 genes are lost in the cp genome compared with other Chlorophyta: 1) the 10 homologs of the mt ndh genes, subunits of the NADH:ubiquinone oxidoreductase. None of these genes were present in the nuclear genome; 2) the genes chlB, chlI, chlL, and chlN involved in chlorophyll synthesis in the dark. In almost all known green algal cp genomes, these 4 genes are present, but not in O. tauri, where only chlI was found in the nuclear genome (on chromosome 2). The absence of chlB, chlL, and chlN in the cp or nuclear genome of O. tauri confirms the inability of this organism to produce chlorophyll in the dark (Derelle et al. 2006); 3) both the petL and petD genes are absent in O. tauri, whereas they are present in all other studied organisms where they encode a small subunit of the cytochrome b6f complex. The petL gene has not been transferred to the nucleus, whereas petD could be located on chromosome 7. However, it has been shown in C. reinhardtii that a free petL N-terminus is not required for the b6f complex function (Zito et al. 2002); 4) psbM, a part of the photosystem II reaction center, is absent in the cp genome, but is present in the nucleus (chromosome 12); and 5) at least 13 additional genes (petN, minE, minD, ftsI, ftsW, ftsH, rpl12, rpl19, accD, cemA, ccsA, cysA, and cysT) and 3 unknown conserved genes (ycf4, ycf6, and ycf10) have been lost in O. tauri. However, 5 of them are present in the nuclear genome (ycf4, minD, rpl12, rpl19, and cemA/ycf10) (see supplementary table S3, Supplementary Material online).
Despite these differences in gene content, 10 conserved blocks, ranging from 2 to 12 genes, are shared between O. tauri and N. olivacea, 11 between O. tauri and C. vulgaris, and 12 between O. tauri and M. viride. When aligning the 4 genomes together, 9 conserved blocks of at least 2 genes can be unveiled. However, when adding the cp genome of C. reinhardtii, whose genome is structurally the most comparable to that of O. tauri (see below), almost no conserved blocks shared by all species can be detected. Comparison of the cp genome of O. tauri with that of O. viridis, a member of the Ulvophyceae, also showed shared gene clusters. In general, without considering C. reinhardtii, 9 conserved blocks of at least 2 genes can be unveiled between different members of the Chlorophyta, representing 33 genes (for O. tauri 37% of its gene content), indicating the importance of maintaining certain gene clusters throughout evolution. However, if we compare the gene order of the O. tauri cp genome with the 24 "ancestral" gene clusters present in N. olivacea and M. viride (de Cambiaire et al. 2006), only 7 of them are completely present in O. tauri, indicating the loss of its ancestral characteristics.
FIG. 4.-Ostreococcus tauri plastid genome. Genes located in the inverted repeat sequences are colored in gray; genes in the single copy regions are black. The single intron, located in atpB, is shown as a white box. The length of the boxes is proportional to their amino acid length. tRNA genes are represented by the 1-letter amino acid code and the unique ORFs are indicated by orf followed by their amino acid length.
The number of gene inversions necessary to transform the gene organization of one genome into another has been estimated for 4 Chlorophyta (O. tauri, N. olivacea, O. viridis, and C. vulgaris) and for M. viride. An average of 50 inversions is needed to transform the gene organization of O. tauri into that of any other of these cp genomes.
Although some genes and gene clusters are well conserved among green algae, the overall structure of the cp genomes can show remarkable differences. First, both the LSC and the SSC region of the O. tauri cp genome contain 41 genes, in contrast to the cp genomes of other green algae (N. olivacea, M. viride, O. viridis, and P. akinetum), where most of the genes are located in the LSC region (Pombert et al. 2006a). Second, the difference in length between the 2 single-copy regions is much smaller than in other Chlorophyta (e.g., in N. olivacea, the LSC region is 5.6 times larger than its SSC region) or even Streptophyta (e.g., in A. thaliana, the LSC region is 4.7 times larger than its SSC region) (table 3). In this respect, the cp genome of O. tauri is more similar to the cp genome of C. reinhardtii (Maul et al. 2002) for 2 reasons: 1) the single-copy regions have almost identical lengths and both contain an almost identical number of genes (81 and 78, respectively) and 2) the IRs, which in both cases cover almost 20% of the genome, contain exactly the same genes, oriented in the same direction.
The distribution of different genes over the LSC and SSC regions is highly conserved, not only in the entire streptophyte lineage (M. viride and land plant genomes share essentially the same gene partitioning), but also in the early diverging N. olivacea, indicating that the last common ancestor of all chlorophytes featured a gene partitioning very similar to that observed in land plants. In this respect, Pombert et al. (2006a) created an ancestral cp genome based on the genomes of O. viridis and P. akinetum (both Chlorophyta, belonging to the Ulvophyceae) and compared that with the genome of N. olivacea, which is a prasinophyte and can be considered as ancestral to the 2 ulvophytes. They concluded that the LSC region of the ancestral genome of both Ulvophyceae contained only genes characteristic of the LSC region of N. olivacea and that the SSC region contained genes usually found in the SSC and LSC region of N. olivacea. However, in the O. tauri cp genome, the genes are scattered across the LSC and SSC region, and the previous assumption made by Pombert et al. (2006a) no longer holds true for O. tauri. Because the Prasinophyceae are not a monophyletic group, it is not surprising that the O. tauri cp genome differs significantly from the N. olivacea cp genome and that changes in gene partitioning have occurred independently in O. tauri from those observed in ulvophycean and chlorophycean algae. With the availability of more cp genomes it will become clearer whether O. tauri is an exception to the rule and has undergone specific genome reshuffling or whether different species all have their own independent evolutionary history regarding their cp genome structure.
We also looked for the presence of SDRs in the cp genome. Sixty-four repeats larger than 15 bp are present, but none of the detected repeats exceeds 25 bp in length. Almost all these SDRs are located in the coding region of 5 protein-coding genes (rpl23, psbD, psaB, psaA, and psbA) and 5 tRNAs (see supplementary fig. S5, Supplementary Material online). The GC content of the SDRs is comparable to the overall GC content of the cp genome (38% for the SDRs vs. 39.9% for the cp genome). The number of SDRs in N. olivacea is similar, but differs substantially from that in C. reinhardtii, whose cp genome is more similar to the O. tauri cp genome regarding its structure (see above). In the O. tauri cp genome, no direct link can be made between the major reshuffling that took place and the abundance of SDRs, whereas for C. reinhardtii the major rearrangements could be explained by the huge collection of SDRs present in its cp genome. Consequently, another mechanism is probably responsible for the large number of rearrangements present in the cp genome of O. tauri.
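A comparison between the GC content of a set of repeats and that of a whole genome can be reproduced with a short Python sketch. How the published percentages were computed is not stated here, so the sketch assumes a pooled GC content over all repeat sequences and ignores ambiguous bases; the function names and the toy sequences are hypothetical.

```python
def gc_content(seq):
    """Return the GC percentage of a sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def pooled_gc_content(sequences):
    """GC percentage over a collection of sequences, pooled by base count."""
    return gc_content("".join(sequences))


# Toy data (hypothetical), standing in for a genome and its repeats.
toy_genome = "ATGCGCATTAAGCCGTATAT"
toy_repeats = ["GCGC", "ATAT", "GCCG"]
print(round(gc_content(toy_genome), 1), round(pooled_gc_content(toy_repeats), 1))
```

Pooling by base count weights long repeats more heavily than short ones; averaging the per-repeat percentages instead would be an equally defensible choice and may give slightly different values.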

Conclusion
Ostreococcus tauri is the smallest eukaryotic organism known to date, and recently, its small (12.56 Mb) but gene-dense nuclear genome has been described (Derelle et al. 2006). Here, we present its mt and cp genomes, which makes O. tauri one of the very few green lineage organisms for which the 3 genome sequences are available. The 2 O. tauri organellar genomes are small and display both common and special features compared with their closest relatives.
The main difference between the O. tauri and the other Chlorophyta mt genomes is the presence of a unique duplication, previously unobserved in the Chlorophyta. On the other hand, the mt genome of O. tauri, which is the most gene dense among all known green algae, closely resembles that of Nephroselmis olivacea, another member of the Prasinophyceae. This is illustrated by a number of common characteristics: 1) the gene content is almost identical in both genomes; 2) there is a high degree of synteny between the 2 genomes, which is illustrated by the presence of a number of conserved gene blocks and by a low number of gene inversions necessary to transform the O. tauri gene structure into that of N. olivacea; and finally 3) Pombert et al. (2006b) showed that there is an increase in the number of short dispersed repeats (SDRs) when moving in the tree from N. olivacea to the more derived lineages within the Chlorophyta. These analyses were confirmed by O. tauri, which contains even fewer SDRs than N. olivacea. All these data clearly show that the mt genome of O. tauri shares the "ancestral" pattern of evolution typified by the N. olivacea genome. The conclusion that N. olivacea represents an ancestral state (Turmel et al. 1999b) was based on its basal phylogenetic position in the chlorophyte lineage, on the presence of 3 genes (nad10, rpl14, and rnpB) that had not been identified at that time in any other mt genome (today, rpl14 is also identified in P. akinetum), and on its ancestral organizational pattern. These arguments also hold for the O. tauri mt genome, and most likely, both the O. tauri and N. olivacea mt genomes represent the most ancestral form known to date for the green lineage.
Whether the unique duplication seen in Ostreococcus is restricted to this organism will hopefully become clear with the availability of more mt genomes of basal green algae (e.g., that of Micromonas pusilla, another prasinophyte, which is currently being sequenced; Worden A, personal communication).
The O. tauri cp genome is very compact, and both the genome size and the gene number are the smallest known among the green plants and green algae. Looking at the gene content, the O. tauri cp genome lost many genes compared with other prasinophyte green algae or to M. viride. This is well illustrated by the small number of ancestral gene clusters still present in the O. tauri cp genome where only 7 of the 24 Mesostigma/Nephroselmis gene clusters (de Cambiaire et al. 2006) could be uncovered. Finally, although gene partitioning among LSC and SSC regions is well conserved in all Streptophyta and early-diverging Chlorophyta, the genes in the O. tauri cp genome are randomly distributed between both regions. All these data strongly suggest that, in contrast to its mt genome, the O. tauri cp genome seems to have lost most of the ancestral features observed in the M. viride and N. olivacea genomes.

Supplementary Material
The genome data have been submitted to the European Molecular Biology Laboratory, www.embl.org (accession numbers CR954200 [mt genome] and CR954199 [cp genome]) or can be found at http://bioinformatics.psb.ugent.be/. Supplementary tables S1-S5 and figures S1-S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).