The Chloroplast Genomes of the Green Algae Pedinomonas minor, Parachlorella kessleri, and Oocystis solitaria Reveal a Shared Ancestry between the Pedinomonadales and Chlorellales

The green algae belonging to the Chlorophyta, the lineage sister to that comprising the land plants and their charophycean green algal relatives (Streptophyta), have been subdivided into four classes (Prasinophyceae, Ulvophyceae, Trebouxiophyceae, and Chlorophyceae). Yet the Pedinomonadales, an assemblage consisting of tiny, naked uniflagellates with a second basal body, has no clear affiliation with these classes and the branching order of the crown chlorophytes remains unknown. To gain an insight into the phylogenetic position of the Pedinomonadales and the relationships among the recognized chlorophyte classes, we have sequenced the chloroplast genomes of Pedinomonas minor (Pedinomonadales) and of two trebouxiophyceans belonging to the Chlorellales, Parachlorella kessleri (Chlorellaceae) and Oocystis solitaria (Oocystaceae), and compared these genomes with those of previously examined streptophytes and chlorophytes, including Chlorella vulgaris (Chlorellaceae). Unlike their Chlorella homolog, the three newly investigated chloroplast DNAs (cpDNAs) carry a large rRNA-encoding inverted repeat (IR) that divides the genome into large and small single-copy regions. In contrast to the situation found for ulvophycean and chlorophycean cpDNAs, the gene contents of the IR and single-copy regions are strikingly similar to that inferred for the common ancestor of chlorophytes and streptophytes. The intronless 98,340-bp Pedinomonas genome is among the chlorophyte cpDNAs featuring the smallest size and most ancestral gene organization. All 105 conserved genes encoded by this genome are included in the gene repertoires of Oocystis (111 genes) and Chlorella (113 genes), with just trnR(ccg) missing from Parachlorella cpDNA. Trees inferred from 71 cpDNA-encoded genes/proteins of 16 chlorophytes and nine streptophytes showed that Pedinomonas is nested in the Chlorellales, a group of algae lacking flagella. This phylogenetic conclusion is independently supported by uniquely shared gene linkages. We hypothesize that chlorellalean and pedinomonadalean green algae are reduced forms of a distant biflagellate ancestor that might have also given rise to the other known trebouxiophycean lineages. Our structural cpDNA data suggest that the Chlorellales and Pedinomonadales represent a deep branch of core chlorophytes, strengthening the notion that the Trebouxiophyceae emerged before the Ulvophyceae and Chlorophyceae. Our results further emphasize the importance of secondary reduction at both the cellular and genome levels during chlorophyte evolution.


Introduction
The Chlorophyta constitutes a morphologically and ecologically diverse assemblage of green algae. This major lineage, together with that including all land plants and their charophycean green algal relatives (Streptophyta), forms the Viridiplantae (Lewis and McCourt 2004; Pröschold and Leliaert 2007). Most chlorophytes have been divided into four classes (Prasinophyceae, Ulvophyceae, Trebouxiophyceae, and Chlorophyceae) based on the ultrastructure of the flagellar apparatus and cytokinesis during mitosis (Mattox and Stewart 1984). The influx of molecular data, in particular nuclear-encoded 18S rRNA gene sequences, has considerably improved our understanding of the relationships among chlorophytes and prompted taxonomic revision. Trees inferred from 18S rDNA data have uncovered a number of monophyletic groups within each class but have been unable to unravel the branching orders of these lineages as well as the divergence order of the classes (Lewis and McCourt 2004; Pröschold and Leliaert 2007). Nonetheless, they have robustly resolved the Prasinophyceae as the earliest divergences of the Chlorophyta (Steinkötter et al. 1994; Nakayama et al. 1998; Fawley et al. 2000; Guillou et al. 2004), thus strongly supporting the long-held hypothesis that scaly flagellates are the ancestors of the core chlorophytes (Chlorodendrales, Prasinophyceae + Ulvophyceae + Trebouxiophyceae + Chlorophyceae) (Mattox and Stewart 1984).
Phylogenomic approaches, such as the comparative analysis of whole chloroplast genomes, have proven valuable for inferring relationships among photosynthetic eukaryotes at deep levels (Martin et al. 1998; Qiu et al. 2006; Jansen et al. 2007; Lemieux et al. 2007; Rogers et al. 2007; Turmel et al. 2008, 2009). In addition to providing a large suite of genes for phylogenetic analyses, chloroplast genomes offer genomic structural features for validation of tree topologies (Lemieux et al. 2007). To date, the complete chloroplast genome sequences of 14 chlorophytes and 6 streptophyte green algae have been reported (see references in the Materials and Methods) and their comparisons have revealed that chlorophyte chloroplast DNAs (cpDNAs) display much more variability at the levels of size, gene content, intron content, and gene order than their streptophyte counterparts. The genomes of the prasinophycean green algae Nephroselmis olivacea and Pyramimonas parkeae (Turmel et al. 2009) are those most closely resembling their streptophyte counterparts. Trees inferred from the concatenated sequences of 70 chloroplast protein-coding genes and their predicted proteins have recently uncovered the existence of a sister relationship between the prasinophycean orders Pyramimonadales and Mamiellales and have also shown that the euglenids, a group of photosynthetic eukaryotes belonging to the Excavata, acquired their chloroplasts secondarily from a member of the Pyramimonadales (Turmel et al. 2009). Earlier chloroplast phylogenetic analyses in which representatives of the five main lineages recognized in the Chlorophyceae were sampled had revealed the dichotomy of this class: One major clade comprises the Chlamydomonadales and Sphaeropleales, whereas the other contains the Oedogoniales, Chaetopeltidales, and Chaetophorales. In both investigations, phylogenetic conclusions received strong support from comparisons of chloroplast genomic features.
In the present cpDNA study, we sampled members of the Pedinomonadales (also designated as Pedinophyceae) and of the Chlorellales, a major lineage of the Trebouxiophyceae. Pedinomonadalean green algae have no clear affiliation with other groups of chlorophytes (Pickett-Heaps and Ott 1974; Melkonian 1990; Moestrup 1991). Representing only three genera (Pedinomonas, Resultor, and Marsupiomonas), they consist of small, naked flagellates lacking scales on their single flagellum. The presence of two basal bodies, each associated with two microtubular roots, suggests that they are not primarily uniflagellates (Melkonian 1990). Although the persistent telophase spindle during mitosis and the eye-spot located opposite the flagellar insertion have been considered to be ancient features shared with some prasinophyceans, an affinity with the Ulvophyceae has been proposed based on the configuration of the flagellar apparatus and the structure of the stellate pattern of the flagellar transition region (Melkonian 1990; Moestrup 1991).
Members of the Trebouxiophyceae exhibit a high degree of morphological heterogeneity (Lewis and McCourt 2004) and form at least six distinct monophyletic groups in 18S rDNA trees (Krienitz et al. 2003, 2004; Henley et al. 2004; Karsten et al. 2005; Eliáš et al. 2008; Sluiman et al. 2008). Just three trebouxiophycean chloroplast genomes, all lacking a large inverted repeat (IR) encoding the rRNA operon, have been fully sequenced thus far: the miniature genome of the parasitic Helicosporidium sp. (Chlorellales, 37,454 bp) (de Koning and Keeling 2006) and those of the photosynthetic algae Chlorella vulgaris (Chlorellales, 150,613 bp) (Wakasugi et al. 1997) and Leptosira terrestris (incertae sedis, 195,081 bp) (de Cambiaire et al. 2007). Although the latter genomes have similar gene contents, they differ considerably in gene density, gene order, and intron content (de Cambiaire et al. 2007). The Chlorella genome has retained many of the gene clusters found in streptophytes, prasinophyceans, and ulvophyceans, but the Leptosira genome shares little similarity in gene order with the other green algal chloroplast genomes examined to date.
We report here the chloroplast genome sequences of Pedinomonas minor and of two trebouxiophyceans belonging to the Chlorellales, Parachlorella kessleri and Oocystis solitaria. The latter green algae represent the two major monophyletic groups recognized in the Chlorellales, that is, the families Chlorellaceae and Oocystaceae (Krienitz et al. 2003). Parachlorella and Chlorella occupy distinct clades within the Chlorellaceae (Krienitz et al. 2004). We found that the three newly investigated genomes carry an IR and form a coherent group with respect to their architecture. The tightly packed Pedinomonas genome has retained the highest degree of ancestral gene linkages, a trait also demarcating this alga from all other core chlorophytes examined thus far. In congruence with our analyses of genomic structural features, phylogenies inferred from multiple chloroplast genes/proteins affiliated Pedinomonas with the Chlorellales. This unexpected alliance has important implications for the evolution of chlorophytes.

Strains and Culture Conditions
Pedinomonas minor (UTEX LB 1350) originated from the Culture Collection of Algae at the University of Texas at Austin and was rendered axenic by isolating individual algal cells on an agar plate with the aid of a tungsten needle. This strain was cultured in modified Volvox medium (McCracken et al. 1980). Strains of O. solitaria (SAG 83.80) and P. kessleri (SAG 211-11g) were obtained from the Sammlung von Algenkulturen Göttingen and cultured in C medium (Andersen et al. 2005). All cultures were subjected to alternating 12-h light-dark periods.

Cloning and Sequencing of Chloroplast Genomes
For the procedures used to sequence the Pedinomonas chloroplast genome, the reader should consult the report describing the mitochondrial genome of this chlorophyte. Because these organelle genomes copurified in CsCl-bisbenzimide gradients, they were sequenced in the course of the same sequencing project. The Oocystis and Parachlorella chloroplast genome sequences were determined essentially as reported previously, except for the following modifications concerning the sizes of the fragments selected for cloning, the choice of cloning vector, the preparation of DNA templates, and the software for sequence assembly. Fragments of 2000-4000 bp were cloned into pSMART-HCKan (Lucigen Corporation, Middleton, WI). After hybridization of the resulting clones with the original DNA used for cloning, plasmids from positive clones were amplified with the Illustra TempliPhi Amplification Kit (GE Healthcare, Baie d'Urfé, Canada) to produce templates for sequencing reactions. The DNA sequences generated in these reactions were assembled using SEQUENCHER 4.8 (Gene Codes Corporation, Ann Arbor, MI).

Chloroplast Genome Analyses
Genes and open reading frames (ORFs) were identified as described previously. Secondary structures of group I and group II introns were modeled according to Michel et al. (1989) and Michel and Westhof (1990), respectively. For each genome, the regions containing nonoverlapping repeated elements were mapped with RepeatMasker (http://www.repeatmasker.org/) running under the WU-Blast 2.0 search engine (http://blast.wustl.edu/), using the repeats ≥ 30 bp identified with REPuter 2.74 (Kurtz et al. 2001) as input sequences. Conserved clusters of genes with identical polarities in selected chloroplast genomes were identified using a custom-built program.
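The custom-built program used to identify conserved gene clusters is not described further. As an illustration only, the following minimal Python sketch shows one way such clusters could be detected, assuming that each genome is given as a linear list of (gene, polarity) pairs and that a cluster is a run of consecutive gene adjacencies conserved in both genomes. The function names and the toy gene orders are hypothetical and are not taken from the original program.

```python
from typing import List, Tuple

Gene = Tuple[str, int]  # (gene name, polarity: +1 or -1)
Pair = Tuple[Gene, Gene]


def adjacency_key(a: Gene, b: Gene) -> Pair:
    """Normalise an adjacent gene pair so both strand readings give one key."""
    forward = (a, b)
    # Read from the opposite strand: order reversed, both polarities flipped.
    reverse = ((b[0], -b[1]), (a[0], -a[1]))
    return min(forward, reverse)


def shared_blocks(g1: List[Gene], g2: List[Gene]) -> List[List[str]]:
    """Return maximal runs of g1 genes whose adjacencies also occur in g2.

    Assumption: both gene orders are treated as linear, so a block spanning
    the arbitrary origin of a circular map is reported as two blocks.
    """
    if not g1:
        return []
    in_g2 = {adjacency_key(a, b) for a, b in zip(g2, g2[1:])}
    blocks: List[List[str]] = []
    current = [g1[0][0]]
    for a, b in zip(g1, g1[1:]):
        if adjacency_key(a, b) in in_g2:
            current.append(b[0])  # adjacency conserved: extend the block
        else:
            if len(current) > 1:
                blocks.append(current)
            current = [b[0]]  # adjacency broken: start a new block
    if len(current) > 1:
        blocks.append(current)
    return blocks


# Toy gene orders (hypothetical, for illustration only).
genome_a = [("rps2", 1), ("atpI", 1), ("atpH", 1), ("psbA", -1), ("rbcL", 1)]
genome_b = [("rbcL", 1), ("atpH", -1), ("atpI", -1), ("rps2", -1), ("psbA", 1)]
print(shared_blocks(genome_a, genome_b))
```

In this toy case the rps2-atpI-atpH segment is inverted in the second genome but keeps the same relative polarities, so it is still recovered as a single conserved block.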
The data sets were allowed to contain missing data; however, limitations were imposed to the proportion of missing data by selecting for analysis the protein-coding genes that are shared by at least 16 taxa. Seventy-one genes met this criterion: accD, atpA, B, E, F, H, I, ccsA, cemA, chlB, I, L, N, clpP, ftsH, infA, petA, B, D, G, L, psaA, B, C, I, J, M, psbA, B, C, D, E, F, H, I, J, K, L, M, N, T, Z, rbcL, rpl2, 5, 14, 16, 20, 23, 32, 36, rpoA, B, C1, C2, rps2, 3, 4, 7, 8, 9, 11, 12, 14, 18, 19, tufA, ycf1, 3, 4, 12. The amino acid data set was prepared as follows: The deduced amino acid sequences from individual genes were aligned using MUSCLE 3.7 (Edgar 2004); the ambiguously aligned regions in each alignment were removed using GBLOCKS 0.91b (Castresana 2000) with the -b2 option (minimal number of sequences for a flank position) set to 15, and the protein alignments were concatenated. To obtain the nucleotide data set, the multiple sequence alignment of each protein was converted into a codon alignment, the poorly aligned and divergent regions in each codon alignment were excluded using GBLOCKS 0.91b with the options -b2=15 and -t=c (the latter specifying that selected sequences are complete codons), the individual codon alignments were concatenated, and finally third codon positions were excluded with PAUP* 4.0b10 (Swofford 2003). Missing characters represented 5.2% and 5.1% of the amino acid and nucleotide data sets, respectively.
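The data-set assembly steps described above (gene selection by taxon count, concatenation with missing data, and exclusion of third codon positions) can be sketched in Python as follows. This is an illustrative outline only and not the pipeline actually used, which relied on MUSCLE, GBLOCKS, and PAUP*. All function names, the "?" missing-data symbol, and the toy inputs are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List, Set


def select_shared_genes(presence: Dict[str, Set[str]], min_taxa: int) -> List[str]:
    """Keep the genes present in at least `min_taxa` taxa."""
    counts = Counter(gene for genes in presence.values() for gene in genes)
    return sorted(gene for gene, n in counts.items() if n >= min_taxa)


def drop_third_positions(codon_alignment: str) -> str:
    """Remove every third site from an in-frame codon alignment."""
    if len(codon_alignment) % 3 != 0:
        raise ValueError("alignment length must be a multiple of 3")
    return "".join(c for i, c in enumerate(codon_alignment) if i % 3 != 2)


def concatenate(alignments: Dict[str, Dict[str, str]], taxa: List[str],
                missing: str = "?") -> Dict[str, str]:
    """Concatenate per-gene alignments, padding absent genes with `missing`."""
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    for gene in sorted(alignments):
        aln = alignments[gene]
        length = len(next(iter(aln.values())))
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, missing * length))
    return {taxon: "".join(p) for taxon, p in parts.items()}


def missing_fraction(matrix: Dict[str, str], missing: str = "?") -> float:
    """Fraction of cells in the concatenated matrix that are missing."""
    total = sum(len(seq) for seq in matrix.values())
    return sum(seq.count(missing) for seq in matrix.values()) / total


# Toy input (hypothetical): three taxa, two genes, one gene absent from taxon B.
presence = {"A": {"psbA", "rbcL"}, "B": {"psbA"}, "C": {"psbA", "rbcL"}}
alignments = {"psbA": {"A": "MK", "B": "MR", "C": "MK"},
              "rbcL": {"A": "SP", "C": "SP"}}
matrix = concatenate(alignments, ["A", "B", "C"])
```

In the study itself the threshold was 16 taxa out of 25; a gene absent from a given taxon then contributes only missing characters for that taxon, which is how the reported proportions of missing data arise.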
Phylogenetic inferences were carried out using the maximum likelihood (ML) and Bayesian methods. Treefinder (version of April 2008) (Jobb et al. 2004) was used to perform the ML analyses of both data sets and to identify the best models fitting the data under the Akaike information criterion. ML trees were inferred from the amino acid data set using the LG model with the observed amino acid frequencies + a gamma distribution of rates across sites with eight categories + invariable sites (LG + F + Γ + I). For the analysis of the nucleotide data set, we used the general time reversible model (GTR) with a gamma distribution (eight categories) + invariable sites (GTR + Γ + I). Confidence of branch points was estimated by 100 bootstrap replications.
PhyloBayes 2.3c (http://www.lirmm.fr/mab/article.php3?id_article=329) analysis of the amino acid data set was performed with the category amino acid site-heterogeneous mixture (CAT) model + a gamma distribution (eight categories) (CAT + Γ). Two independent runs were carried out with default PhyloBayes conditions until a maxdiff value < 0.15 was achieved to ensure chain equilibration (30,000 generations). The first 4,500 points were discarded as burn-in, and the posterior consensus was computed on the remaining trees. The nucleotide data were analyzed with MrBayes 3.1.2 (Ronquist and Huelsenbeck 2003) using the GTR + Γ (eight categories) + I model. Two independent Markov chain Monte Carlo runs, each consisting of three heated chains in addition to the cold chain, were carried out using the default parameters. The length of each run was 3 million generations after a burn-in phase of 500,000 generations. Trees were sampled every 100 generations. Convergence of the two independent runs was verified according to the output of the "sump" command; this output was also used to determine the burn-in phase. Posterior probability values were estimated from all the sampled trees using the "sumt" command.
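To make the burn-in and posterior probability steps concrete, the short Python sketch below computes the posterior support of a clade as its frequency among the trees retained after burn-in. It is a simplified illustration, not a reimplementation of PhyloBayes or MrBayes: each sampled tree is assumed to be represented as a set of clades (frozensets of taxon names), the helper names are hypothetical, and the initial state of the chain is not counted as a sample.

```python
from typing import FrozenSet, Sequence, Set

Clade = FrozenSet[str]


def burn_in_samples(burn_in_generations: int, sample_every: int) -> int:
    """Number of sampled trees falling within the burn-in phase
    (assumption: the starting tree is not counted as a sample)."""
    return burn_in_generations // sample_every


def clade_support(samples: Sequence[Set[Clade]], clade: Clade, burn_in: int) -> float:
    """Posterior support: fraction of post-burn-in trees containing `clade`."""
    kept = samples[burn_in:]
    if not kept:
        raise ValueError("no samples left after burn-in")
    return sum(clade in tree for tree in kept) / len(kept)


# Toy chain (hypothetical): five sampled trees, the first discarded as burn-in.
target = frozenset({"Pedinomonas", "Chlorella", "Parachlorella"})
other = frozenset({"Pedinomonas", "Oocystis"})
chain = [{other}, {target}, {target}, {other}, {target}]
support = clade_support(chain, target, burn_in=1)
```

With the settings reported above for MrBayes (sampling every 100 generations and a burn-in of 500,000 generations), `burn_in_samples(500_000, 100)` gives 5,000 trees under this sketch's counting assumption.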

Results
The Pedinomonas, Parachlorella, and Oocystis Chloroplast Genomes Feature a Similar Quadripartite Structure
Circular maps encompassing 98,340 and 123,994 bp were assembled for the cpDNAs of Pedinomonas and Parachlorella, respectively (figs. 1 and 2). With regard to the Oocystis chloroplast genome, the collection of plasmid clones and polymerase chain reaction (PCR) fragments examined yielded a single linear contig of 96,287 bp featuring virtually all of the genes previously identified in Chlorella cpDNA (fig. 3). A segment of 5 kbp at one of the termini of this contig harbors large ORFs sharing no homology with known DNA sequences. As observed for all chloroplast genomes fully sequenced thus far, the map of the Oocystis cpDNA is almost certainly circular. However, PCR analyses using various sets of primers did not enable us to circularize the map shown in figure 3.
When one glances at the maps of the Pedinomonas and Parachlorella genomes (figs. 1 and 2), the most striking similarities that can be discerned include the large rRNA operon-encoding IR that divides each genome into small and large single-copy (SSC and LSC) regions, the relatively high density of coding sequences (table 1), and the paucity of introns (table 1). The genes encoded by the two genomes are similarly partitioned among the three main genomic regions and the direction of transcription of the rRNA operon is identical, that is, oriented toward the SSC region as in virtually all IR-containing cpDNAs of streptophytes (Palmer 1991; Raubeson and Jansen 2005) and those of the prasinophyceans Nephroselmis and Pyramimonas (Turmel et al. 2009). The gene partitioning pattern matches that inferred for the common ancestor of chlorophytes and streptophytes, with the exception of a few contiguous genes that were transferred from the LSC region to a location immediately downstream of the rRNA operon (the genes marked with arrows in figs. 1 and 2).
FIG. 1.-Gene map of Pedinomonas cpDNA. The two copies of the IR sequence are represented by thick lines. Coding sequences (filled boxes) on the outside of the map are transcribed in a clockwise direction. The ORFs containing more than 100 codons are shown in gray; none of these ORFs revealed detectable similarity with known gene sequences. The color code denotes the genomic regions containing the corresponding genes in the cpDNAs of streptophytes and those of Nephroselmis and Pyramimonas: magenta, SSC; cyan, LSC; and yellow, IR. Given the variable gene content of the IR in these ancestral-type genomes, only the genes invariably present in this region (i.e., those encoded by the rRNA operon) were represented in yellow. The positions of the genes marked with arrows do not conform to the ancestral quadripartite structure. The gene clusters denoted by the brackets have been previously observed in streptophytes but not in chlorophytes. The tRNA genes are indicated by the one-letter amino acid code (Me, elongator methionine; Mf, initiator methionine) followed by the anticodon in parentheses.

The sequence data for the Oocystis genome also uncovered the presence of an IR, but only the boundaries of this sequence with the LSC region were localized (fig. 3). None of the clones downstream of the rRNA operon revealed divergent sequences indicative of IR-SSC junctions. The SSC region is likely to contain the missing stretch of the Oocystis genome sequence because this segment lies in the proximity of a suite of 11 genes usually found in the SSC domain in ancestral-type genomes. Even though the IR-SSC boundaries were not identified, the currently available data clearly indicate that the Oocystis genome features an ancestral-type quadripartite architecture, is intron-poor, and is more densely packed than its Parachlorella, Chlorella, and Pedinomonas counterparts (table 1).
In Oocystis, the sole departures from the ancestral quadripartite architecture include the migration of ycf20, a gene typically located in the SSC region, to a locus near one of the LSC-IR boundaries and the finding that one copy of the duplicated trnN(guu), a gene also characteristic of the SSC region, lies in the LSC region (fig. 3). The second copy of trnN(guu) resides immediately downstream of the rRNA operon next to the other genes usually found in the SSC partition. Considering that the relocalized ycf20 and trnN(guu) are far apart in the Oocystis genome, there is little evidence that these genes were cotransferred to the LSC region.
Slight modifications from the ancestral quadripartite structure have also been observed for the prasinophyceans Nephroselmis and Pyramimonas (Turmel et al. 2009). In contrast, the IR-containing genomes of the ulvophyceans (Pombert et al. 2005, 2006) and chlorophyceans (Maul et al. 2002; de Cambiaire et al. 2006; Brouard et al. 2008) differ substantially from the ancestral pattern, implying that numerous genes were exchanged between opposite sides of the IR and that the rRNA operon itself might have undergone inversion during this process. Gene shuffling, however, was more extensive in chlorophycean lineages than in the lineages that led to Ostreococcus and the ulvophyceans Oltmannsiellopsis and Pseudendoclonium. The moderate changes observed for the latter genomes allowed inferences to be made regarding the direction of gene movements (Pombert et al. 2005, 2006).

The Pedinomonas Genome Contains a Subset of the Conserved Genes Found in Oocystis, Parachlorella, and Chlorella cpDNAs
Pedinomonas cpDNA encodes 105 conserved genes and is one of the smallest chloroplast genomes reported for photosynthetic chlorophytes. Only the completely sequenced chloroplast genomes of the prasinophyceans Ostreococcus (88 genes, 71,666 bp) and Pycnococcus (98 genes, 80,211 bp) are known to be smaller than that of Pedinomonas (Robbens et al. 2007; Turmel et al. 2009). On the basis of gene content data alone, it is hard to predict which of the core chlorophyte chloroplast genomes examined to date is the most closely related to Pedinomonas cpDNA.
Although the loosely packed genomes of the trebouxiophycean Leptosira (195,081 bp, 106 conserved genes) and of the ulvophyceans Oltmannsiellopsis (151,933 bp, 104 genes) and Pseudendoclonium (195,867 bp, 105 genes) most closely resemble their Pedinomonas counterpart in terms of gene number, they lack a number of genes that are present in the Pedinomonas chloroplast (e.g., ycf47, cysA, cysT, trnS(gga), and trnT(ggu) in the case of ulvophyceans) and also differ by the presence of genes that are lacking in the pedinomonadalean organelle (table 2). Interestingly, the chloroplast gene repertoire of Pedinomonas overlaps almost entirely with those of Oocystis, Parachlorella, and Chlorella (table 2). With 110 conserved genes in common, the gene complements of the three chlorellalean algae are nearly identical. The 113-gene repertoire of Chlorella differs from that of Parachlorella (112) by the presence of trnR(ccg) and from that of Oocystis (minimum of 111 genes) by the presence of trnL(caa) and trnL(gag).

FIG. 3.-Gene map of Oocystis cpDNA. This genome carries an IR whose boundaries with the SSC region remain unknown; the IR presence is supported by the divergence pattern observed for the sequences mapping downstream of trnMe(cau). Coding sequences (filled boxes) on the top of the linear map are transcribed in a clockwise direction. The ORFs containing more than 100 codons are shown in gray; only the ORF embedded in the psbM intron (open box) revealed detectable similarity with known gene sequences. This ORF, which is located within domain IV of the intron secondary structure, carries the reverse transcriptase (cd01651) and maturase (pfam08388) domains, but not the endonuclease domain, of reverse transcriptases encoded by group II introns. To our knowledge, the insertion site of the intron in the Oocystis psbM has not been documented previously. For other details, consult the legend of figure 1.

All Pedinomonas genes are included in the Chlorella and Oocystis gene complements, with just trnR(ccg) missing from the Parachlorella genome. In contrast, six of the Pedinomonas genes have no homolog in the chloroplast of the trebouxiophycean Leptosira (table 2).

Chloroplast Phylogenomic Analyses Robustly Affiliate Pedinomonas with the Chlorellales
To gain an insight into the positions of the Pedinomonadales and Trebouxiophyceae within the Chlorophyta, we generated data sets of 71 concatenated proteins and genes (first and second codon positions) from the completely sequenced chloroplast genomes of 16 chlorophytes and nine streptophytes and analyzed them using the ML and Bayesian methods (fig. 4). The protein trees showed with strong support that Pedinomonas is embedded within the clade containing the three representatives of the Chlorellales. The representative of the Oocystaceae, Oocystis, represents the first branch of this clade, whereas Pedinomonas is sister to the clade uniting the two representatives of the Chlorellaceae (Chlorella and Parachlorella). Although the gene trees identified the same clade and an identical divergence order for the four lineages, weaker support was observed for the node uniting all four algae as well as that uniting Pedinomonas with Parachlorella and Chlorella. Both the protein and gene trees inferred from the chloroplast data positioned the trebouxiophycean Leptosira as sister to the clade containing the four representatives of the Chlorophyceae (fig. 4), implying that the Trebouxiophyceae are nonmonophyletic. However, given the long branch formed by Leptosira (incertae sedis) and also the fact that this green alga is the sole trebouxiophycean representative outside the Chlorellales, this sister relationship may be the result of phylogenetic reconstruction artifacts. Consistent with the hypothesis that taxon sampling has an influence on the position of Leptosira, a weakly supported clade clustering Leptosira and Chlorella was recovered in recent chloroplast phylogenomic analyses (Turmel et al. 2009).

Analysis of Gene Order Reveals an Ancestral Genome Organization in Pedinomonas and a Close Alliance between the Pedinomonadales and Chlorellales
Because phylogenomic analyses are inherently associated with sparse taxon sampling, they can lead to trees robustly supporting an artifactual clustering of taxa (Brinkmann and Philippe 2008; Heath et al. 2008). However, the use of complete genome sequences in these analyses offers the opportunity to validate topologies by analyzing structural genomic features (Rokas 2006). Considering that gene order analysis has proven to be a promising approach for testing phylogenies inferred from green plant chloroplast genomes (Lemieux et al. 2007), gene orders in the three newly sequenced chlorophyte chloroplast genomes were compared with one another and with those in the previously examined chlorophytes and the streptophytes Mesostigma and Chlorokybus. In addition to providing information on similarity of global gene organization, these comparative analyses allowed us to examine the level of retention of the ancestral gene clusters that predate the divergence of chlorophytes and streptophytes and also to track the gene clusters that arose following the emergence of the Chlorophyta.
Collectively, the Pedinomonas and chlorellalean genomes share 16 blocks of colinear sequences encoding 65% of the genes (67/104) common to these genomes (table 3). The genomes having the highest similarity in overall gene order are those of Parachlorella and Chlorella (Chlorellaceae): Seventeen syntenic blocks encoding up to 20 genes, for a total of 94 genes (i.e., 83% of the genes in each genome) were identified in these genomes. Although the Pedinomonas genome was found to more closely resemble the Oocystis genome, with 17 syntenic blocks encoding 79 of its 105 genes (i.e., 75% of the genome), the extent of similarity with the Parachlorella genome was only marginally different (17 syntenic blocks encoding 77 genes, i.e., 73% of the genome). These relative similarities in gene order are in accordance with the genetic distances separating the Pedinomonas chloroplast from its Oocystis and Parachlorella homologs in the phylogenomic trees reported in this study (fig. 4). The pairwise comparison of the Pedinomonas and Chlorella genomes, however, revealed less similarity in gene organization (68%) than expected, with a deficit of six genes (71 genes) in the 17 observed syntenic blocks. As reported for other IR-lacking chloroplast genomes (Palmer et al. 1987; Strauss et al. 1988; Turmel et al. 2005; Bélanger et al. 2006; de Cambiaire et al. 2007), loss of the IR in Chlorella cpDNA was most likely accompanied by the breakdown and rearrangement of a number of gene blocks (see figs. 5 and 6). Of the four green algae found in the Pedinomonadales + Chlorellales clade (fig. 4), Pedinomonas features the most ancestral chloroplast gene organization, a trait that also demarcates this green alga from all other core chlorophytes examined thus far. Figure 5 illustrates, for a sample of core chlorophytes, the chloroplast gene clusters that were inherited from the common ancestor of chlorophytes and streptophytes.
These data add support to previous studies suggesting that ancestral clusters of this type sustained less erosion in chlorellalean green algae than in ulvophyceans and chlorophyceans (Pombert et al. 2005, 2006; Brouard et al. 2008). Pedinomonas resembles its three chlorellalean counterparts in exhibiting high retention of ancestral clusters; 11 of the 16 conserved blocks of colinear sequences shared by all four core chlorophytes exhibit gene linkages that predate the divergence of chlorophytes and streptophytes (table 3). Remarkably, the level of retention observed for Pedinomonas is higher than those observed for chlorellaleans (fig. 5) and prasinophyceans. For example, the nine-gene cluster containing rpo and atp genes (trnC(gca)-rpoB-rpoC1-rpoC2-rps2-atpI-atpH-atpF-atpA) has remained intact in Pedinomonas but was disrupted at one or more sites in the lineages leading to all previously examined chlorophytes. On the basis of the breakpoints found in prasinophyceans and the euglenid Euglena gracilis (Turmel et al. 2009), preservation of this ancestral suite of genes was predictable in some early-diverging lineages of chlorophytes. Like Pyramimonas (Turmel et al. 2009), Pedinomonas has maintained two additional ancestral arrangements that were found to be fragmented in the lineages leading to all other core chlorophytes: clpP-psbB-psbT-psbN-psbH-petB-petD and trnR(ccg)-rbcL-atpB-atpE (fig. 5). In Pedinomonas, the clpP cluster is contiguous with the ancestral ribosomal protein gene cluster spanning rpl23 to rpl12 (fig. 1), forming a super cluster whose presence has been documented in streptophytes but not in chlorophytes. Numerous gene linkages shared by Pedinomonas and/or chlorellalean chloroplast genomes arose during chlorophyte evolution (fig. 6).
Because of the limited number of chlorophyte lineages investigated so far and the uncertainty regarding the branching pattern of core chlorophytes, it is difficult to infer the timings of emergence of these gene linkages. A subset of the chlorophyte-specific gene linkages found in Pedinomonas and/or chlorellaleans have been observed in Leptosira and/or other major groups of chlorophytes, indicating that they were inherited from a distant ancestor (often a prasinophycean). The set of 11 gene linkages restricted to Pedinomonas and one or more chlorellaleans represents putative synapomorphies that support a shared ancestry of the Chlorellales and Pedinomonadales. On the other hand, the 10 gene linkages unique to Parachlorella and Chlorella are consistent with the notions that these green algae belong to the same monophyletic group (Chlorellaceae) and display the most similar chloroplast gene arrangements among the members of the Chlorellales + Pedinomonadales clade examined so far.
Overall, the gene order analyses reported above are in total agreement with our phylogenomic inferences in showing that the Pedinomonas chloroplast genome and its three chlorellalean homologs form a coherent group featuring gene linkages not previously identified in other core chlorophytes. Besides the specific gene linkages uniting these genomes, it is worth mentioning that all three IR-containing cpDNAs display the trnD(guc) and trnMe(cau) genes in their IR and that both the Pedinomonas and Oocystis IRs include as well the trnG(gcc)-trnS(uga) pair. These tRNA genes appear to be unique to the IRs of these algae and thus are likely to contribute additional synapomorphies supporting the Chlorellales + Pedinomonadales clade.
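The pairwise similarity values quoted in this section for Pedinomonas follow from simple arithmetic: the number of genes found in syntenic blocks divided by the 105 conserved genes of the Pedinomonas genome. The short Python sketch below reproduces these three percentages from the gene counts given in the text; the function name is hypothetical and rounding to the nearest integer is an assumption about how the values were reported.

```python
def block_coverage(genes_in_blocks: int, total_genes: int) -> int:
    """Percentage of a genome's genes that fall within syntenic blocks,
    rounded to the nearest integer (assumed reporting convention)."""
    return round(100 * genes_in_blocks / total_genes)


PEDINOMONAS_GENES = 105  # conserved genes in Pedinomonas cpDNA (from the text)

# Genes of Pedinomonas located in blocks shared with each other genome.
genes_in_shared_blocks = {"Oocystis": 79, "Parachlorella": 77, "Chlorella": 71}

coverage = {
    partner: block_coverage(n, PEDINOMONAS_GENES)
    for partner, n in genes_in_shared_blocks.items()
}
```

The difference between the Parachlorella and Chlorella comparisons (77 versus 71 genes) corresponds to the deficit of six genes noted above for the IR-lacking Chlorella genome.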

Evolutionary Implications of the Inferred Alliance Between the Pedinomonadales and Chlorellales
Our chloroplast phylogenomic analyses suggest that the small group of uniflagellates making up the Pedinomonadales represents a chlorellalean lineage distinct from the Oocystaceae and Chlorellaceae (fig. 4). The concept that the Pedinomonadales are allied with the Chlorellales or more generally with the Trebouxiophyceae was unanticipated. Pedinomonadalean green algae have been regarded to have an affinity to the Prasinophyceae because they share morphological features with some members of this ancient group of algae, namely, a persistent telophase spindle during mitosis and an eye-spot opposite the flagella and in the cleavage plane of the cell during cytokinesis (Melkonian 1990; Moestrup 1991). On the other hand, derived ultrastructural characters of the flagellar apparatus, more specifically the counterclockwise orientation of basal bodies and the cruciate flagellar root system, support the notion that pedinomonadaleans evolved after the divergence of the prasinophycean lineages but do not belong to the Chlorophyceae. Although the combination of the abovementioned morphological characters led Melkonian (1990) to believe that the Pedinomonadales are related to the Ulvophyceae, he proposed that this group be treated as a chlorophyte order of uncertain affinity. It should be noted that the motile cells of trebouxiophyceans also display a counterclockwise orientation of basal bodies and a cruciate flagellar root system; however, in contrast to ulvophyceans, a nonpersistent telophase spindle is generally observed during mitosis (Mattox and Stewart 1984).
Table 3. Conserved Gene Clusters in the Pedinomonas, Oocystis, Parachlorella, and Chlorella Chloroplast Genomes
accD-cysA
cysT-ycf1
petB-petD
psaA-psaB
psbD-psbC
trnF(gaa)/ycf47
petA-petL-petG
psbK-ycf12-psaM
atpB-atpE-rps4-trnS(gga)
psbB-psbT/psbN/psbH
psbE-psbF-psbL-psbJ
trnC(gca)/rpoB-rpoC1-rpoC2
rps2-atpI-atpH-atpF-atpA
rrs-trnI(gau)-trnA(ugc)-rrl-rrf
rpl20-rps18/trnW(cca)-trnP(ugg)/psaJ-rps12-rps7-tufA-rpl19-ycf4
rpl23-rpl2-rps19-rps3-rpl16-rpl14-rpl5-rps8-rpl36-rps11-rpoA-rps9-rpl12
Underlined are the gene linkages that arose during the evolution of chlorophytes (fig. 6); all other gene linkages predate the divergence of the Chlorophyta and Streptophyta (fig. 5). Genes separated by a dash are encoded on the same DNA strand, whereas those separated by a diagonal bar are encoded on different strands.
Independent lines of evidence indicate that the chloroplast phylogenomic trees reported here reflect the true relationships among the members of the Chlorellales and Pedinomonadales. First, the chloroplast phylogenomic analyses are congruent with published phylogenies inferred from 18S rDNA sequences in recovering a clade that comprises representatives of the Oocystaceae and Chlorellaceae and in resolving these two families of the Chlorellales as separate lineages (Krienitz et al. 2003, 2004; Henley et al. 2004; Sluiman et al. 2008). However, considering that no 18S rRNA gene sequence has been reported for a pedinomonadalean taxon, it remains to be seen whether the inclusion of the Pedinomonadales within the Chlorellales is in agreement with nuclear data. Second, consistent with the notion that the Pedinomonadales and Chlorellales descend from a common ancestor, we have uncovered in the present study structural cpDNA features (gene linkages and gene content of IR) that are uniquely shared by these green algae. Third, the possibility that the placement reported here for Pedinomonas results from systematic errors of phylogenetic reconstructions (Delsuc et al. 2005; Brinkmann and Philippe 2008; Verbruggen and Theriot 2008) is unlikely because this taxon remained nested in the Chlorellales in recent chloroplast phylogenomic analyses with a better sampling of trebouxiophycean lineages (Turmel M, Otis C, Lemieux C, unpublished results). In this context, it should be noted that we used the site-heterogeneous CAT model in our Bayesian analyses of the amino acid data set to overcome some of the risks of systematic errors (Lartillot and Philippe 2004). Models accommodating site-specific biochemical constraints in the amino acid replacement pattern at each position of the alignment have been shown to alleviate long-branch attraction artifacts (Rodriguez-Ezpeleta et al. 2007).
FIG. 5. Conserved gene clusters predating the divergence of chlorophytes and streptophytes in Pedinomonas, trebouxiophycean, ulvophycean, and chlorophycean cpDNAs. These clusters were defined as those containing genes in the same order and polarity in at least one streptophyte (either Mesostigma or Chlorokybus) and two chlorophytes. Note that the gene linkages marked with brackets (above the figure) are not observed in any of the five prasinophycean cpDNAs investigated to date, thus leaving open the possibility that they evolved independently in streptophyte and chlorophyte lineages. For each genome, the set of genes making up each of the identified clusters is shown as black boxes connected by a horizontal line. Black boxes that are contiguous but not linked together indicate that the corresponding genes are not adjacent on the genome. Gray boxes denote individual genes that have been relocated elsewhere on the chloroplast genome and empty boxes denote missing genes. The relative polarities of the genes are not shown; for this information, consult the gene maps presented in this study or previous reports.
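The criterion stated in the legend of figure 5 (same order and polarity in at least one streptophyte and two chlorophytes) can be illustrated with a minimal Python sketch. This is not the procedure used in this study: it reduces clusters to pairs of neighboring genes, treats each genome as a linear list, and runs on invented gene names and gene orders (gA to gD, genomes S1 and C1 to C3), all of which are assumptions made for illustration only.

```python
from collections import Counter


def canonical(a, b):
    """Strand-independent key for two neighboring genes.

    Each gene is a (name, strand) tuple with strand +1 or -1. Reading the
    same region from the opposite strand reverses the order and flips both
    signs, so the two readings are collapsed into one key.
    """
    forward = (a, b)
    reverse = ((b[0], -b[1]), (a[0], -a[1]))
    return min(forward, reverse)


def signed_adjacencies(genome):
    """Set of neighboring gene pairs (order and polarity) in a linear gene list."""
    return {canonical(a, b) for a, b in zip(genome, genome[1:])}


def ancestral_adjacencies(streptophytes, chlorophytes, min_chloro=2):
    """Pairs seen in at least one streptophyte and min_chloro chlorophytes."""
    in_strepto = set().union(
        *(signed_adjacencies(g) for g in streptophytes.values())
    )
    counts = Counter()
    for genome in chlorophytes.values():
        counts.update(signed_adjacencies(genome))
    return {adj for adj in in_strepto if counts[adj] >= min_chloro}


# Toy input (hypothetical genes and orders, not taken from any gene map).
streptophytes = {"S1": [("gA", 1), ("gB", 1), ("gC", -1), ("gD", 1)]}
chlorophytes = {
    "C1": [("gA", 1), ("gB", 1), ("gC", -1), ("gD", 1)],
    # Same region read from the opposite strand.
    "C2": [("gD", -1), ("gC", 1), ("gB", -1), ("gA", -1)],
    # gC and gD swapped: only the gA-gB linkage is retained.
    "C3": [("gA", 1), ("gB", 1), ("gD", 1), ("gC", -1)],
}

shared = ancestral_adjacencies(streptophytes, chlorophytes)
strict = ancestral_adjacencies(streptophytes, chlorophytes, min_chloro=3)
```

In this toy case, `shared` holds the three linkages of S1 because C1 and C2 both retain them, whereas `strict` keeps only the gA-gB linkage, the single pair found in all three chlorophyte-like genomes.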
The nature of the flagellate ancestor of the Chlorellales has remained ambiguous because all members of this group lack flagella (Lewis and McCourt 2004). The placement of the uniflagellate Pedinomonas within the Chlorellales indicates that the most recent common ancestor of the Pedinomonadales and Chlorellales was most likely a uniflagellate whose cell architecture was simplified through complete loss of the flagellar apparatus in the lineages leading to the Oocystaceae and Chlorellaceae. In the lineage leading to Pedinomonas, reduction of cell size was probably accompanied by loss of the cell wall. Given the relative positions of the chlorellalean and pedinomonadalean lineages in the chloroplast trees, we infer that the flagellar apparatus was independently lost in the Oocystaceae and Chlorellaceae. A multi-layered cell wall, with the crystalline cellulose fibrils in each layer being perpendicular to the orientation found in adjoining layers, distinguishes the oocystaceans (Hepperle et al. 2000) from the chlorellaceans, suggesting that this trait arose after the divergence of these families. In contrast, as in other trebouxiophycean lineages, chlorellaceans feature a trilaminate cell wall with an outer electron-dense layer indicative of sporopollenin (Krienitz et al. 1999; Henley et al. 2004). Interestingly, it was hypothesized that pedinomonadaleans are not primarily uniflagellates based on the presence of a second nonfunctional basal body in some members of the group (Melkonian 1990), implying that the uniflagellate ancestor of the Pedinomonadales and Chlorellales might have originated from a biflagellate related to the trebouxiophyceans that have retained the flagellar apparatus.

Patterns of Chloroplast Genome Evolution in the Chlorellales + Pedinomonadales Clade
The IR-containing cpDNAs of Pedinomonas, Oocystis, and Parachlorella resemble one another in overall structure, gene content, and intron content, indicating that the chloroplast genome evolved rather conservatively in the Chlorellales + Pedinomonadales clade. However, relative to their homologs in previously investigated core chlorophytes (ulvophyceans and chlorophyceans), they stand apart by their ancestral gene organization. Their quadripartite structures show striking similarities to that inferred for the common ancestor of chlorophytes and streptophytes. Moreover, their gene repertoires exhibit a few genes (ycf47, cysA, cysT, trnS(gga), trnT(ggu)) that are missing from both their ulvophycean and chlorophycean homologs but are present in the ancestral-type genome of the prasinophycean Nephroselmis. The higher degree of ancestral character observed for the gene partitioning pattern and gene content of the Pedinomonas, Oocystis, and Parachlorella genomes relative to those of other core chlorophyte genomes might be indicative of an early origin of the Chlorellales + Pedinomonadales clade, which would be consistent with the view of Moestrup (1991) regarding the Pedinophyceae and with the hypothesis that the Ulvophyceae are sister to the Chlorophyceae. Previously reported analyses of chloroplast gene order and gene content together with mitochondrial phylogenomic analyses with poor taxon sampling also favored the view that the Chlorellales represents an early-diverging lineage among the core chlorophytes (Pombert et al. 2004). However, the chloroplast phylogenomic analyses presented in this study and other reports (Turmel et al. 2008, 2009) are inconclusive with regard to the relative positions of the Trebouxiophyceae, Ulvophyceae, and Chlorophyceae. Phylogenies incorporating taxa representing the main lineages recognized in the Trebouxiophyceae and additional chlorophyte lineages will be required to resolve the issue regarding the monophyletic nature of the Trebouxiophyceae and to unravel the position of this algal group in the radiation of core chlorophytes.
FIG. 6. Conserved gene pairs in core chlorophyte cpDNAs that originated after the emergence of the Chlorophyta. In this category were classified the gene pairs absent in both Mesostigma and Chlorokybus and present in at least two core chlorophytes belonging to the same lineage (Trebouxiophyceae, Ulvophyceae, and Chlorophyceae). For each gene pair, adjoining termini of the genes are indicated. Filled boxes indicate the presence of gene pairs, whereas gray or open boxes indicate the absence of gene pairs. A gray box indicates that the two genes associated with a gene pair are found in the genome but are unlinked. An open box indicates that one or both genes associated with a gene pair are absent from the genome. Gene pairs linked by brackets are contiguous on the genome.
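The scoring scheme described in the legend of figure 6 (filled, gray, or open box for each gene pair, and the rule used to retain a pair as chlorophyte-specific) can be sketched in Python as follows. This is only an illustration under simplifying assumptions: genes are single copy, gene orders are linear lists, strand and adjoining termini are ignored, and all gene names, genome names, and lineage names (g1 to g4, out1, lineageX, and so on) are hypothetical.

```python
def classify_pair(gene_order, a, b):
    """Return the box type of figure 6 for genes a and b in one genome.

    'open'   : one or both genes are absent from the genome
    'filled' : both genes are present and adjacent
    'gray'   : both genes are present but unlinked
    """
    if a not in gene_order or b not in gene_order:
        return "open"
    i, j = gene_order.index(a), gene_order.index(b)
    return "filled" if abs(i - j) == 1 else "gray"


def is_derived_pair(pair, outgroups, lineages, min_hits=2):
    """True if the pair is absent from every outgroup genome and linked in
    at least min_hits genomes belonging to the same lineage."""
    a, b = pair
    if any(classify_pair(o, a, b) == "filled" for o in outgroups.values()):
        return False
    for genomes in lineages.values():
        hits = sum(
            classify_pair(order, a, b) == "filled" for order in genomes.values()
        )
        if hits >= min_hits:
            return True
    return False


# Toy input (hypothetical gene orders, for illustration only).
outgroups = {
    "out1": ["g1", "g2", "g3", "g4"],
    "out2": ["g1", "g3", "g2", "g4"],
}
lineages = {
    "lineageX": {
        "x1": ["g1", "g4", "g2", "g3"],
        "x2": ["g3", "g2", "g4", "g1"],
    },
    "lineageY": {
        "y1": ["g1", "g2", "g4"],
        "y2": ["g4", "g1", "g2", "g3"],
    },
}

derived = is_derived_pair(("g1", "g4"), outgroups, lineages)
ancestral = is_derived_pair(("g1", "g2"), outgroups, lineages)
```

Here the g1-g4 pair is unlinked in both outgroup-like genomes and linked in the two genomes of lineageX, so it qualifies as a derived pair, whereas g1-g2 is already linked in out1 and is rejected.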
The slight deviations from the ancestral quadripartite structure that we observed in the Pedinomonas, Oocystis, and Parachlorella cpDNAs are associated with distinct sets of genes, indicating that independent events led to the transfer of these genes from one side of the rRNA operon to the other. Considering that all relocalized genes, with the exception of the duplicated Oocystis trnN(guu), are present within or near the IR and that the contiguous trnG(gcc) and trnS(uga) genes downstream of the rRNA operon in the Pedinomonas genome are found immediately upstream of the same operon in the Oocystis genome, rearrangements of the IR itself, via expansion-contraction events and inversions of two overlapping internal segments containing the entire rRNA operon, might account for these gene transfers. A model involving complex rearrangements of this type has been recently proposed to explain the considerable differences observed for the quadripartite structure of the 218-kb Pelargonium chloroplast genome relative to other land plant cpDNAs (Chumley et al. 2006). Like the cpDNA of the ulvophycean Pseudendoclonium (Pombert et al. 2005), this unusually large angiosperm genome exhibits an IR in which the rRNA operon is transcribed toward the LSC region; to account for this change in orientation, a single inversion was inferred for the region containing the rRNA operon (Chumley et al. 2006).
The compactness of the Pedinomonas chloroplast and mitochondrial genomes lends credence to the notion that tiny unicellular organisms tend to exhibit small genomes. As previously observed for the lineages leading to the prasinophycean coccoids Ostreococcus and Pycnococcus (Robbens et al. 2007; Turmel et al. 2009), the chloroplast genome was reduced in gene content and constrained to maintain a compact gene organization in the Pedinomonas lineage; however, a smaller proportion of conserved genes are missing from Pedinomonas cpDNA compared with its prasinophycean homologs. With 105 conserved genes, the 98-kb cpDNA of Pedinomonas lacks eight of the 113 genes predicted to have been present in the chloroplast of the common ancestor of the Pedinomonadales and Chlorellales (table 2), whereas the 72-kb IR-containing cpDNA of Ostreococcus and the 80-kb IR-less cpDNA of Pycnococcus encode 88 and 98 conserved genes, respectively. Because the nuclear genome of a pedinomonadalean green alga has not been sequenced, the chloroplast genes missing in Pedinomonas cannot be unequivocally interpreted in terms of gene transfers from the chloroplast to the nucleus. The infA, tilS, and trnI(cau) genes, whose products play a role in protein synthesis, probably sustained such transfers; however, some of the missing genes might have been lost entirely because their presence is dispensable (e.g., trnL(caa) and trnL(gag)) or their requirement is restricted to certain growth and physiological conditions (e.g., the chlB, chlL, and chlN genes associated with chlorophyll synthesis in the dark). Finally, another feature distinguishing the Pedinomonas chloroplast genome from the Ostreococcus and Pycnococcus cpDNAs is the high level of retention of ancestral gene clusters (fig. 5). Of all the chloroplast genomes of core chlorophytes examined thus far, Pedinomonas cpDNA exhibits the most ancestral gene order.
Cell simplification and reduction in cell size had more influence on the mitochondrial genome than on the chloroplast genome in the Pedinomonas lineage, but the reverse situation has been found for the Ostreococcus lineage (Robbens et al. 2007). Streamlining of the Pedinomonas mitochondrial genome was severe and was accompanied by extensive genome rearrangements, the gain of a nonstandard codon (TGA coding for Trp), and acceleration of the rate of sequence evolution. This evolutionary pattern, also characteristic of some chlorophycean mitochondrial DNAs (mtDNAs) (Gray et al. 1998, 2004), contrasts sharply with the ancestral pattern observed for the mitochondrial genomes of Ostreococcus (Robbens et al. 2007), Nephroselmis, and the chlorellalean Prototheca wickerhamii (Wolff et al. 1994). Whereas the latter mtDNAs range from 44 to 55 kb in size and encode 61-65 genes, the 25-kb mtDNA of Pedinomonas specifies only 22 genes, all of which are tightly packed in a 16-kb segment and encoded by the same strand. Unlike the phylogenies reported here, those inferred from mitochondrial genes revealed that Pedinomonas and chlorophyceans form a highly supported clade that is sister to the ulvophycean Pseudendoclonium (Pombert et al. 2004). Given the long branches displayed by Pedinomonas and chlorophyceans in the mitochondrial trees, the incongruence of these trees with the chloroplast phylogenies is attributed to long-branch attraction artifacts.
In conclusion, the comparative chloroplast genome analyses presented in this study have unveiled the phylogenetic position of the Pedinomonadales and illuminated the evolutionary history of the core chlorophytes belonging to the Chlorellales. Our results add support to the notion that the Chlorellales + Pedinomonadales clade emerged early during the radiation of core chlorophytes and further highlight the importance of secondary simplification at both the cellular and genome levels during chlorophyte evolution.

Supplementary Material
The data sets used in phylogenetic analyses are available at Molecular Biology and Evolution online (http://mbe.oxfordjournals.org/). The fully annotated chloroplast genome sequences of Oocystis, Pedinomonas, and Parachlorella have been deposited in the GenBank database under the accession numbers FJ968739, FJ968740, and FJ968741, respectively.