Marcus Lechner, Walter Rossmanith, Roland K. Hartmann, Clemens Thölken, Bernard Gutmann, Philippe Giegé, Anthony Gobert, Distribution of Ribonucleoprotein and Protein-Only RNase P in Eukarya, Molecular Biology and Evolution, Volume 32, Issue 12, December 2015, Pages 3186–3193, https://doi.org/10.1093/molbev/msv187
Abstract
RNase P is the endonuclease that removes 5′ leader sequences from tRNA precursors. In Eukarya, separate RNase P activities exist in the nucleus and mitochondria/plastids. Although all RNase P enzymes catalyze the same reaction, the different architectures found in Eukarya range from ribonucleoprotein (RNP) enzymes with a catalytic RNA and up to 10 protein subunits to single-subunit protein-only RNase P (PRORP) enzymes. Here, analysis of the phylogenetic distribution of RNP and PRORP enzymes in Eukarya revealed 1) a wealth of novel P RNAs in previously unexplored phylogenetic branches and 2) that PRORP enzymes are more widespread than previously appreciated, found in four of the five eukaryal supergroups, in the nuclei and/or organelles. Intriguingly, the occurrence of RNP RNase P and PRORP seems mutually exclusive in genetic compartments of modern Eukarya. Our comparative analysis provides a global picture of the evolution and diversification of RNase P throughout Eukarya.
RNase P is the endonuclease that removes 5′ leader sequences from tRNA precursors, an essential step in tRNA maturation (Lai et al. 2010; Liu and Altman 2010). The virtually ubiquitous enzyme independently originated at least twice in evolution with different architectures. Ribonucleoprotein (RNP) enzymes based on a catalytic RNA molecule (P RNA) represent the more ancient type that is found in all three domains of life. Although their RNA is structurally conserved, their protein partners are highly divergent, with a single protein in Bacteria, 4–5 in Archaea, and up to 10 in eukaryal nuclei (Hartmann et al. 2009; Ellis and Brown 2010; Lai et al. 2010; Liu and Altman 2010; Walker et al. 2010). All known nuclear RNase P RNPs are composed of a P RNA of about 350 nt and a set of proteins, always including RPP21/RPR2, RPP29/POP4, RPP30/RPP1, POP5, POP1, RPP20/POP7, and RPP25/POP6 (Hartmann E and Hartmann RK 2003; Rosenblad et al. 2006; Walker et al. 2010). The reasons for the massive increase in the protein moiety of the enzyme in Eukarya as compared with Archaea or Bacteria are poorly understood and have been speculated to be related to added functionality of the eukaryal enzyme (Marvin and Engelke 2009a, 2009b; Jarrous and Gopalan 2010), although recent RNase P replacement experiments do not support such a notion (Weber et al. 2014). Studies of the prevalence of nuclear RNP RNase P subunits in eukaryal genomes are complicated by the presence of a related RNP, RNase MRP, exclusively found in Eukarya and involved in 5.8S rRNA maturation. This RNP enzyme is composed of a structurally related, but nonetheless distinguishable RNA, and a largely overlapping set of proteins (Jarrous and Gopalan 2010; Walker et al. 2010). In fact, it appears that RPP21 is the only protein not shared by the two RNPs, but consistently specific to RNase P.
A fundamentally different type of RNase P is composed of protein only (PROteinaceous RNase P, PRORP) and appears confined to the eukaryal domain. In its simplest form it consists of a single 60-kDa protein, but requires additional subunits in some cases, for example, two other protein components in human mitochondrial RNase P (Holzmann et al. 2008; Gobert et al. 2010, 2013; Gutmann et al. 2012; Taschner et al. 2012; Pinker et al. 2013). The two kinds of RNase P are highly similar in terms of substrate and cleavage specificity, and they were even found to be functionally exchangeable in Escherichia coli and Saccharomyces cerevisiae (Gobert et al. 2010; Taschner et al. 2012; Weber et al. 2014).
The discovery of protein-only RNase P (PRORP) enzymes in Eukarya pointed out that the evolution of RNase P is more intriguing and complex than previously thought. Questions are raised as to when PRORP appeared during evolution, and if there may still be evolutionary traces of its coexistence with RNP RNase P within the same cellular compartment. Where and how did such a coexistence lead to the divergent specialization and compartmentalization of the different RNase P enzymes? Here, we analyze and compare the prevalence and architectural type of both RNP and PRORP enzymes in Eukarya. We find that PRORP enzymes are widespread among eukaryal lineages and propose reasonable scenarios for the evolution of RNase P in Eukarya.
Results and Discussion
Incidence of Nuclear Ribonucleoprotein RNase P
Here we update the distribution of P RNA and RPP21 (the protein subunit not found in RNase MRP) in eukaryal nuclear genomes, based on previously published studies (Hartmann E and Hartmann RK 2003; Marquez et al. 2005; Piccinelli et al. 2005; Rosenblad et al. 2006) and analyses of newly available genome data, to determine the prevalence of nuclear RNP RNase P in the different branches of Eukarya. The results are summarized in table 1 and the inventory detailed in supplementary table S1, Supplementary Material online (http://bioinf.pharmazie.uni-marburg.de/supplements/rnase_p_2015/, last accessed September 14, 2015). For example, we identified a variety of novel P RNAs, including in hitherto unexplored taxa.
Table 1. Overview of the occurrence of RNP and PRORP RNase P enzymes in nuclei and organelles throughout the eukaryal tree.
| . | . | . | . | . | . | Nucleus Encoded . | . | Organelle Encoded . | . | |||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| . | . | . | . | . | . | Nuclear RNase P . | Organellar RNase P . | . | ||||
| . | . | . | . | . | . | RNP RNase P . | PRORP . | RNP RNase P . | . | |||
| Supergroups . | Subgroups . | . | . | . | Representative species . | P RNA . | P protein . | Nuclear . | Organellar . | P protein . | P RNA . | . |
| Opisthokonta | Holozoa | Metazoa | Animalia | Bilateria | Homo sapiens | na | na | ma | C | |||
| Radiata | Acropora digitifera | n | m | C | ||||||||
| Porifera | Amphimedon queenslandica | n | n | ? | ||||||||
| Placozoa | Trichoplax adhaerens | n | n | m | C | |||||||
| Choanomonada | Monosiga brevicollis | n | m | C | ||||||||
| Filasterea | Capsaspora owczarzaki | n | m | C | ||||||||
| Ichthyosporea | Amoebidium parasiticum | n | n | m | C | |||||||
| Sphaeroforma arctica | n | m | C | |||||||||
| Nucletmycea | Nuclearia | Nuclearia simplex | ||||||||||
| Fungi | Microsporidia | Encephalitozoon cuniculi | n | n | C | |||||||
| Chytridiomycota | Spizellomyces punctatus | n− | n | |||||||||
| Blastocladiales | Allomyces macrogynus | n | ||||||||||
| Mucoromycotina | Rhizopus oryzae | n | n | m | C | |||||||
| Mortierellaceae | Mortierella verticillata | n | n | m | C | |||||||
| Dikarya | Ascomycota | Saccharomyces cerevisiae | na | na | (m)a | (m)a | C | |||||
| Basidiomycota | Postia placenta | n | n | (m)a | C | |||||||
| Amoebozoa | Discosea | Acanthamoeba castellanii | n | n | ||||||||
| Archamoebae | Entamoeba dispar | n | C | |||||||||
| Myxogastria | Physarum polycephalum | n | ||||||||||
| Dictyostelia | Dictyostelium discoideum | na | na | |||||||||
| Archaeplastida | Glaucophyta | Cyanophora paradoxa | n′ | n | p | |||||||
| Rhodophyceae | Bangiales | Porphyra purpurea | p | |||||||||
| Cyanidiales | Cyanidioschyzon merolae | p | ||||||||||
| Florideophycidae | Chondrus crispus | p | ||||||||||
| Porphyridiophyceae | Porphyridium purpureum | n′ | p | |||||||||
| Chloroplastida (Viridiplantae) | Chlorophyta | Trebouxiophyceae | Chlorella variabilis | n | m, p | C | ||||||
| Chlorophyceae | Chlamydomonas reinhardtii | n | m, p | C | ||||||||
| Mamiellophyceae | Ostreococcus tauri | na | m, p | m, p | C | |||||||
| Charophyta | Streptophyta | Arabidopsis thaliana | na | m, pa | C | |||||||
| SAR | Stramenopiles | Blastocystis | Blastocystis hominis | ? | ||||||||
| Labyrinthulomycetes | Schizochytrium aggregatum | m | ||||||||||
| Pelagophyceae | Aureococcus anophagefferens | n | m, p | C | ||||||||
| Eustigmatales | Nannochloropsis gaditana | n | m, p | C | ||||||||
| Peronosporomycetes | Phytophthora sojae | n | m | C | ||||||||
| Phaeophyceae | Ectocarpus siliculosus | n | m, p | C | ||||||||
| Diatomea | Thalassiosira pseudonana | n | m, p | C | ||||||||
| Alveolata | Protalveolata | Perkinsus marinus | ? | |||||||||
| Dinoflagellata | Karenia brevis | ? | ||||||||||
| Apicomplexa | Haemosporidia | Plasmodium falciparum | n | n | a | C | ||||||
| Ciliophora | Tetrahymena thermophila | n | ||||||||||
| Rhizaria | Cercozoa | Bigelowiella natans | n | m, p | C | |||||||
| Retaria | Foraminifera | Reticulomyxa filosa | ? | |||||||||
| Excavata | Metamonada | Fornicata | Diplomonadida | Giardia lamblia | n | n | C | |||||
| Parabasalia | Trichomonas vaginalis | n | C | |||||||||
| Discoba | Jakobida | Reclinomonas americana | m | |||||||||
| Discicristata | Heterolobosea | Naegleria gruberi | n | |||||||||
| Euglenozoa | Euglenea | Euglena mutabilis | ? | |||||||||
| Trypanosomatida | Trypanosoma brucei | na | ma | C | ||||||||
| Relation to supergroups unclear | Apusomonadida | Thecamonas trahens | n | n | m | C | ||||||
| Cryptophyceae | Guillardia theta | n | m, p | C | ||||||||
| Haptophyta | Prymnesiophyceae | Emiliania huxleyi | m | |||||||||
n, m, p, and a indicate the identification of sequences in the respective phylogenetic subgroup and their predicted or experimentally verified localization to either the nucleus, mitochondria, plastids, or apicoplasts, respectively; (m), the corresponding genes are found in some mitochondrial genomes, but not in all species; ?, nuclear-encoded sequences for which localization predictions could not be obtained; a trailing a (a superscript in the original layout, as in na or ma), lineages for which RNase P enzymes were experimentally validated; ′ and −, P RNA candidates with some (′) or more severe (−) deviation from the consensus. Empty cells correspond to lineages where RNase P-related sequences could not be found. Gray cells correspond to lineages in which the organelles do not have a genome. Light gray cells correspond to lineages for which nuclear genome sequencing projects are not complete, although partial sequence information is available. Finally, C indicates the correlation between the predicted occurrence of a given type of enzyme and the absence of the other one (RNP or PRORP RNase P) in a specific lineage and/or compartment.
In brief, a P RNA and RPP21 are prevalent among the Holozoa subgroup of Opisthokonta. Within metazoans, P RNA candidates were newly identified in the more basal Placozoa, Porifera, and in radially symmetric animals (supplementary figs. S1–S5, Supplementary Material online). In Nucletmycea, P RNAs are identifiable in all branches except for Nuclearia. Among Amoebozoa, nuclear RNP RNase P is generally present. Relative to previous analyses (Marquez et al. 2005; Piccinelli et al. 2005), we predicted additional P RNAs and RPP21 homologs in Archamoebae and Dictyostelia. In contrast, within the photosynthetic supergroup Archaeplastida (plants and algae with chloroplasts of primary endosymbiotic origin), RNP RNase P appears absent from the nuclei of Chloroplastida. However, P RNAs are predicted in glaucophytes and in rhodophytes. In the SAR (Stramenopiles, Alveolata, Rhizaria) group, P RNA and RPP21 were not identified in Stramenopiles, consistent with previous studies (Hartmann E and Hartmann RK 2003; Piccinelli et al. 2005; Rosenblad et al. 2006), but were found in Ciliophora and Apicomplexa genomes (Alveolata). In Excavata, the occurrence of nuclear RNP RNase P is widespread, but appears to have been lost in Euglenozoa. In Haptophyta and Cryptophyceae, P RNA or RPP21 could not be identified; yet, genome information is scarce in these clades and it remains unclear whether this is due to the loss of RNP RNase P or to structurally highly deviant P RNA and RPP21 homologs.
Incidence of Organellar Ribonucleoprotein RNase P
Mitochondria (mt) and plastids (pls) possess their own genome coding for a complete or partial set of tRNAs. They originated from primary endosymbiosis with an ancestral α-proteobacterium and a cyanobacterium, respectively, yet pls also derive from secondary or tertiary endosymbiosis in various groups. It is thus not surprising to find bacterial-like P RNAs still encoded in some organellar genomes. Organelle RNP RNase P, however, is particularly diverse (Rossmanith 2012) and P RNAs are highly degenerate in some cases (Seif et al. 2005). We have (re)analyzed the occurrence of mt and pl–P RNAs in organelle genomes throughout Eukarya as well as the occurrence of RnpA and Rpm2, two proteins of organellar RNP RNase P. The comprehensive list of all identified organellar P RNAs and proteins is given in supplementary table S1, Supplementary Material online, and summarized in table 1.
In short, in the supergroup Opisthokonta, no P RNA gene was found in the mitochondrial genomes of Holozoa. Most mitochondrial P RNA genes were found in the fungal lineage, particularly in Saccharomycetaceae species. Among Archaeplastida, a patchy occurrence of P RNAs was found in organellar genomes of phylogenetically basal algae, including Glaucophyta, Rhodophyceae, and Chlorophyta. No P RNA gene was found in Streptophyta. Most, if not all, pl-encoded P RNAs were found in primary photosynthetic Eukarya. In Excavata, P RNAs were only found in jakobid mtDNAs (Seif et al. 2006). Finally, in Amoebozoa and SAR, organellar P RNA appears to be scarce. All in all, organelle P RNA occurrence is patchy. In some phyla, the genes were either lost or their sequences have diverged to an extent that makes them undetectable by the recognition algorithms used here. Protein subunits of these enzymes are even more elusive. The subunits previously identified are bacterial-type RNase P proteins (RnpA) and a pentatricopeptide repeat (PPR) protein called Rpm2, both nuclear encoded and unrelated to PRORP. Within the fungal branch, Rpm2 was shown to be part of mitochondrial RNase P in S. cerevisiae (Morales et al. 1992; Daoud et al. 2012). Close Rpm2 homologs are only found in Saccharomycetales (supplementary fig. S6, Supplementary Material online). In Archaeplastida, no P protein of bacterial origin is encoded in any organellar genome, although rnpA-like genes are encoded in several nuclear genomes in Mamiellophyceae of the Chlorophyta subgroup (Lai et al. 2011), and these RnpA proteins are predicted to localize to organelles (supplementary table S2, Supplementary Material online). Our analysis and three-dimensional structure predictions revealed that these algal RnpAs are characterized by N- and C-terminal extensions not present in bacterial RnpAs (supplementary figs. S7 and Supplementary Data, Supplementary Material online). The function of these extensions is unknown, but they might be involved in specific contacts with algal organellar P RNAs or with yet unidentified proteins.
Incidence of Protein-Only RNase P
Our analyses confirm and substantiate previous observations that a number of eukaryal groups lack RNase P genes for a nuclear and/or organellar RNP enzyme. We thus performed a systematic analysis of the distribution and localization of putative PRORP enzymes to determine whether PRORP could be the RNase P in these lineages/compartments. As a prerequisite, we had to define robust features characterizing PRORP. Candidates were only considered genuine PRORPs when their architecture included three elements: a specific C-terminal NYN (N4BP1, YacP-like Nuclease) metallonuclease domain, presumably originating from the bacterial ribonuclease yacP (Anantharaman and Aravind 2006); an N-terminal α-superhelical domain containing PPR motifs (Small et al. 2004), as revealed by systematic structure predictions; and a bipartite zinc-binding module connecting the two main domains. Further signatures are present in specific phyla. Their occurrence might point to additionally acquired functions or to interactions with phylum-specific proteins that remain to be identified (fig. 1).
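The three architectural criteria above can be read as a simple conjunctive filter. The following is a minimal sketch of that reading, not the pipeline used in this study: the `DomainAnnotation` record and its field names are hypothetical, and the boolean flags stand in for the outcome of upstream domain and structure predictions that are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainAnnotation:
    """Hypothetical summary of domain predictions for one candidate protein."""
    has_c_terminal_nyn: bool       # C-terminal NYN metallonuclease domain
    has_n_terminal_ppr: bool       # N-terminal alpha-superhelical domain with PPR motifs
    has_zinc_binding_module: bool  # bipartite zinc-binding module linking the two domains


def is_genuine_prorp(candidate: DomainAnnotation) -> bool:
    """Return True only if all three defining features are present."""
    return (
        candidate.has_c_terminal_nyn
        and candidate.has_n_terminal_ppr
        and candidate.has_zinc_binding_module
    )


if __name__ == "__main__":
    complete = DomainAnnotation(True, True, True)
    ppr_only = DomainAnnotation(False, True, False)
    print(is_genuine_prorp(complete))  # all three features present
    print(is_genuine_prorp(ppr_only))  # a PPR protein without the NYN domain is rejected
```

Requiring all three features at once is what separates PRORP candidates from other PPR proteins such as Rpm2, which lack the NYN nuclease domain.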
Fig. 1. Description of the conserved features defining PRORP proteins. (A) Schematic representation of the different domains of PRORP. Sequence logos of residue conservation for the subdomains involved in zinc binding, as well as for a plant-specific glycine-rich insertion and for a “hydrophobic domain” conserved in organisms that contain a plastid (or had contained a plastid) were generated with WebLogo 3. The number of sequences analyzed and the percentage of sequences originating from animals (Metazoa), plants (land plants), or other organisms (Chlorophyta, Stramenopiles, Alveolata, Cryptophyceae, Haptophyta, Rhizaria, Choanoflagellates, Filasterea, Ichthyosporea) are as follows from left to right: Plant-specific insertion: 169 sequences (100% land plants); N-terminal ½ Zn binding domain 1: 275 sequences (1/2 land plants, 1/3 Metazoa, 1/6 others); hydrophobic domain: 138 sequences (60% land plants); C-terminal ½ Zn binding domain 2: 249 sequences (1/3 land plants, 1/3 Metazoa, 1/3 others). OTS, organellar targeting signals (to mitochondria, plastids, or apicoplasts); NLS, nuclear localization signal. (B) Conserved residues present in the PRORP-defining NYN domain signatures, specified for different phyla. The positions of the eight residues constituting part 1 of the NYN signature of PRORP have been numbered as indicated above the first logo. Numbers between the conserved motifs indicate the distance range (in amino acids) that separates the motifs in the different PRORPs analyzed. (C) Three-dimensional structure predictions for N-terminal domains of representative PRORP proteins considered in this analysis. All the putative PRORPs have an α-superhelical domain consistent with the conserved fold of PPR proteins. N-terminal extremities are shown on the left, C-terminal ones on the right.
Based on these common features, we searched for putative PRORP genes in the three domains of life. We confirmed that PRORP proteins are Eukarya specific, exclusively encoded in nuclear genomes and widely distributed, that is, found in four of the five eukaryal supergroups. The full set of putative PRORPs is given in supplementary table S1, Supplementary Material online, and summarized in table 1. Briefly, among Opisthokonta, PRORPs are present in Metazoa and all the associated lineages (Choanomonada, Filasterea, and Ichthyosporea), but absent from fungi and associated lineages. No PRORP could be identified in the supergroup of Amoebozoa. Among Archaeplastida, PRORP was not found in the basal groups such as Glaucophyta and Rhodophyta, but was found in all Chlorophyta and Charophyta as single genes, while in Embryophyta, more than two PRORPs were typically found. In Spermatophyta, PRORP sequences can be subdivided into three evolutionary distinct clusters that we term cluster I, II, and III (supplementary fig. S9, Supplementary Material online). Most of the species have three PRORPs with one representative of each cluster. However, the Brassicaceae (e.g., Arabidopsis) make an exception, because Arabidopsis PRORP2 and 3 both belong to cluster III. PRORPs are also found in the supergroup SAR, two to three PRORP proteins are encoded in all Stramenopiles. In Alveolata, no genes coding for PRORPs were found in ciliates, but a single gene could be identified in all Apicomplexa genomes. Among Excavata, PRORP is found in the sequenced genomes of some Discoba organisms but not in Metamonada. Although present in Euglenozoa, it is not identifiable in Heterolobosea.
To gain insight into the origin and distribution of PRORP, a phylogenetic analysis was performed. The results suggest an ancient origin of PRORP (supplementary fig. S10, Supplementary Material online). Still, in some instances PRORP might also have spread through horizontal gene transfer (HGT) events such as secondary and tertiary endosymbiosis. This might have happened, for example, in stramenopiles where, within individual species, multiple PRORPs cluster in evolutionarily distinct groups (supplementary fig. S10, Supplementary Material online).
Although the prevalence of PRORP in Eukarya could be established, understanding the distribution of RNP and PRORP in specific compartments requires knowledge of the precise subcellular localization of PRORPs in the respective lineages. To gain such information, we applied localization prediction tools to full-length PRORP sequences. The results are compiled in supplementary table S2, Supplementary Material online, and summarized in table 1. In short, in Opisthokonta, all animal PRORPs are mitochondrial. In green algae, single PRORP genes might encode both nuclear and organellar PRORPs expressed from alternative translation starts. In land plants, cluster III contains nuclear orthologs of PRORP, while cluster I and II PRORPs are predicted to be organellar. In other groups (SAR, Excavata, Cryptophyceae, or Haptophyta), multiple PRORPs can be targeted to mt and nuclei, or a single PRORP can be found in specific compartments as, for example, in the apicoplast of apicomplexans. Overall, the predicted localizations confirm that PRORP proteins are not restricted to organelles as initially envisaged (Lai et al. 2010), but are also widespread in nuclei.
Conclusions and Possible Scenarios for the Evolution of RNase P Distribution
In most instances our analyses revealed a correlation between the predicted occurrence of a given type of enzyme (RNP RNase P or PRORP) and the absence of the other one in a specific lineage and/or compartment. The most divergent examples are fungi, where RNP enzymes are active in both mt and nuclei while PRORP is absent, and Streptophyta or Trypanosomatida, where PRORPs are found in organelles and nuclei, whereas RNP genes are absent. Similar correlations are summarized in table 1 for all Eukarya groups.
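The mutual-exclusivity argument can be illustrated with a minimal sketch. The occurrence table below encodes only the examples named in the text (fungi, Metazoa, Streptophyta, Trypanosomatida); the data structure and the function name `coexisting` are assumptions made for illustration and are not part of the original analysis.

```python
# Occurrence of RNase P types per (lineage, compartment), restricted to the
# examples stated in the text. The layout is a hypothetical illustration.
occurrence = {
    ("Fungi", "nucleus"): {"RNP"},
    ("Fungi", "mitochondria"): {"RNP"},
    ("Metazoa", "nucleus"): {"RNP"},
    ("Metazoa", "mitochondria"): {"PRORP"},
    ("Streptophyta", "nucleus"): {"PRORP"},
    ("Streptophyta", "organelles"): {"PRORP"},
    ("Trypanosomatida", "nucleus"): {"PRORP"},
    ("Trypanosomatida", "organelles"): {"PRORP"},
}


def coexisting(table):
    """Return the (lineage, compartment) keys where both enzyme types occur."""
    both = {"RNP", "PRORP"}
    return [key for key, enzymes in table.items() if both <= enzymes]


# An empty list means no compartment in the table harbors both enzyme types.
print(coexisting(occurrence))
```

A compartment predicted to contain both types, such as the organelles of some Mamiellophyceae discussed in the text, would appear in the returned list once added to the table.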
Our analysis implies that PRORP might have evolved very early during eukaryal evolution, in an organism at the root of modern Eukarya (fig. 2), although its distribution points to some HGT events as well. It appears likely that the fusion of PPR, NYN, and all the features defining PRORP took place only once during evolution. The RNP and protein-only forms of RNase P thus probably coexisted in an early eukaryote, a functional redundancy that, however, might not have persisted in any organism to the present. We did not find solid evidence for this coexistence within the same compartment, although it cannot be ruled out for some Mamiellophyceae, where isoforms of PRORP might be targeted to both nuclei and organelles while RNP RNase P has been retained in organelles. RNP was kept in some organisms (fungi) or compartments (nucleus of Metazoa), and protein-only enzymes were not retained. In these organisms, RNPs might have gained additional functions that could not be provided by PRORP, for example, as observed in human nuclei with the requirement of RNP RNase P for the formation of RNA polymerase III initiation complexes (Serruya et al. 2015). In contrast, PRORP was kept in other organisms (some chlorophytes, streptophytes, trypanosomids) or in specific compartments (nucleus of other chlorophytes and mt of metazoans) and RNPs were lost. Similarly, PRORPs targeted to organelles might have coexisted with RNP RNases P encoded in organellar genomes. P RNA genes might have been lost in the course of rearrangements of organellar genomes, consolidating PRORP as the RNase P enzyme in this compartment.
Distribution of RNP and PRORP RNase P enzymes in the eukaryal domain of life. Relations between eukaryal groups are schematically indicated according to Petersen et al. (2014). R and P indicate the occurrence of RNP and PRORP RNase P enzymes in the respective groups, based on the study presented here. Crossing out P or R indicates putative evolutionary events associated with the loss of PRORP or (nuclear) RNP RNase P. The question mark indicates an example where limited genomic data prevented conclusions as to the occurrence of the given enzyme type in the respective group. The diagram highlights how the distribution of RNase P seemingly involved multiple losses of either PRORP or RNP RNase P.
In animal and plant lineages, RNase P distribution followed two different routes. Unicellular organisms basal to Metazoa (Ichthyosporea, Filasterea, Choanomonada) seem to have retained PRORP proteins for mitochondrial RNase P function, and this status was also preserved in all metazoan species. In contrast, unicellular organisms basal to Chlorophyta seem to have initially retained PRORP enzymes only for nuclear RNase P activity. Then, in more recent species of the Chloroplastida lineage, PRORP also took over the organellar RNase P function.
In conclusion, looking at the global picture, since its origin PRORP seems to have been an invasive enzyme, taking over the function of ancestral RNP RNase P in several eukaryal groups, in entire organisms, or in given cellular compartments. The evolutionary trend to replace RNP with PRORP becomes plausible if one considers its capability to instantly replace RNP enzymes in tRNA biogenesis, as experimentally demonstrated for the E. coli and yeast systems (Gobert et al. 2010; Taschner et al. 2012; Weber et al. 2014). This evolution may bear witness to a still continuing transitional process from the RNA to the protein world.
Materials and Methods
Identification of Nuclear-Encoded RNase P RNAs
We identified P RNAs using Infernal (Nawrocki and Eddy 2013) with an E-value threshold of 1 × 10⁻⁸ based on the RFAM 12.0 (Nawrocki et al. 2015) models RF00009 (Nuclear RNase P) and RF01577 (Plasmodium RNase P). In addition, we used the tool Bcheck (Yusuf et al. 2010) with default parameters. The predictions were curated and assessed manually for their conserved core. This ensemble of methods also allows discriminating P RNAs from MRP RNAs.
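As an illustration of the E-value filtering step, the following minimal sketch keeps only hits at or below the stated threshold. It assumes whitespace-separated tabular hit lines in which the E-value is the 16th field, as in Infernal's `--tblout` output; this layout should be verified against the Infernal version in use. The function name `filter_hits` and the two example lines are invented for illustration and are not real search results.

```python
E_VALUE_THRESHOLD = 1e-8  # threshold stated in the text


def filter_hits(tblout_lines, threshold=E_VALUE_THRESHOLD, evalue_col=15):
    """Return (target, query, E-value) for hits with E-value <= threshold.

    evalue_col is the 0-based index of the E-value field; 15 is an
    assumption about the tabular layout and may need adjusting.
    """
    kept = []
    for line in tblout_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        if len(fields) <= evalue_col:
            continue  # malformed or truncated line
        evalue = float(fields[evalue_col])
        if evalue <= threshold:
            kept.append((fields[0], fields[2], evalue))
    return kept


# Toy input: made-up lines in the assumed layout, NOT real search output.
example = [
    "# toy header line",
    "contigA - RNaseP_nuc RF00009 cm 1 303 120 455 + no 1 0.55 0.0 98.2 3.1e-21 ! -",
    "contigB - RNaseP_nuc RF00009 cm 1 303 10 300 + no 1 0.40 0.0 21.5 4.0e-03 ? -",
]
hits = filter_hits(example)
print(hits)
```

Hits passing this filter would still require the manual curation of the conserved core described above.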
Search for Homologs of the RNP RNase P-Specific Protein Subunit RPP21
We selected reference sequences from several sources: 1) The Rpr2 alignment provided by Rosenblad et al. (2006), 2) the seed alignment provided for the PFAM family PF04032 (RNase P Rpr2/Rpp21/SNM1 subunit domain) (Finn et al. 2011), and 3) WormBase version WS247 (Harris et al. 2010) gene Y37E11B.6 (rpp21). Reference domains were identified and a scoring algorithm was implemented based on regular expressions.
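The text does not specify the regular expressions or weights used for scoring. The sketch below only illustrates the general idea of a regex-based scoring scheme: each pattern that matches a candidate sequence contributes a weight to the total score. The patterns, weights, and the function name `score_sequence` are hypothetical and are not those of the original study.

```python
import re

# Hypothetical (pattern, weight) pairs; NOT the motifs used in the study.
MOTIFS = [
    (re.compile(r"C..C"), 2),               # a single CxxC pair
    (re.compile(r"C..C.{5,40}C..C"), 5),    # two CxxC pairs with a loose spacer
]


def score_sequence(seq, motifs=MOTIFS):
    """Sum the weights of all motif patterns found in the sequence."""
    seq = seq.upper()
    return sum(weight for pattern, weight in motifs if pattern.search(seq))


# Toy sequences, invented for illustration.
print(score_sequence("MKACAAC" + "G" * 10 + "CTTCK"))  # both patterns match
print(score_sequence("MKLV"))                           # no pattern matches
```

Candidates could then be ranked by score and compared against the reference sequences, with any cutoff chosen empirically.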
Identification of Rpm2p and Mitochondrial P RNAs in Fungi
The HMMER algorithm (Finn et al. 2011) as well as BLAST searches (Altschul et al. 1990) were used to retrieve proteins with homology to the Rpm2 domain as defined in PFAM (Finn et al. 2014). Putative rpm1 was retrieved from unannotated fungal mitochondrial genomes with RNAweasel (Gautheret and Lambert 2001).
PRORP Sequence Analysis and Structural Predictions
PRORP sequences were retrieved using the BLAST tool in NCBI (National Center for Biotechnology Information), Ensembl, Bogas, Phytozome, JGI, and Broad. The proteins were aligned using MUSCLE (Edgar 2004). The sequences of the conserved domains were then retrieved and realigned with MUSCLE before using WebLogo 3 (Crooks et al. 2004) to highlight the conserved residues. Protein structures were predicted using the Phyre2 algorithm in the intensive modeling mode (Kelley and Sternberg 2009).
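Sequence logos display per-column conservation as information content in bits. The minimal sketch below computes this quantity for protein alignment columns as log2(20) minus the Shannon entropy of the observed residues. It deliberately omits the small-sample correction that logo tools may apply, so values will differ slightly from those in a rendered logo; the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column, gaps ignored."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return math.log2(alphabet_size) - entropy


def alignment_information(aligned_seqs):
    """Information content for every column of equal-length aligned sequences."""
    return [column_information(col) for col in zip(*aligned_seqs)]


# Toy alignment: first column invariant, second column split between two residues.
print(alignment_information(["CA", "CG"]))
```

A fully conserved column reaches the maximum of log2(20), about 4.32 bits, which is why invariant zinc-coordinating residues appear as the tallest letters in a logo.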
Subcellular Localization Predictions
Subcellular localization predictions were determined for most proteins with TargetP, Predotar, and MultiLoc2 when applicable (Small et al. 2004; Emanuelsson et al. 2007; Blum et al. 2009). PredAlgo was used for PRORP sequences of green algae (Chlorophyta) (Tardif et al. 2012). PlasmoAP and PATS were used for Apicomplexa PRORP in order to determine if they possess an apicoplast targeting peptide (Zuegge et al. 2001; Foth et al. 2003).
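The text does not state how predictions from the different tools were reconciled. One simple possibility, shown here purely as an assumption, is a majority vote across the tools that were applicable to a given sequence. The compartment labels, the `min_agree` parameter, and the function name are hypothetical; the real tools each have their own output formats, which would need to be mapped onto common labels first.

```python
from collections import Counter


def consensus_localization(predictions, min_agree=2):
    """Majority vote over per-tool predictions.

    predictions maps a tool name to a compartment label, or to None when
    the tool was not applicable. Returns the winning label, "ambiguous"
    when no label reaches min_agree votes or the top vote is tied, and
    "unknown" when no tool produced a prediction.
    """
    votes = Counter(label for label in predictions.values() if label is not None)
    if not votes:
        return "unknown"
    top_label, top_count = votes.most_common(1)[0]
    tied = [label for label, count in votes.items() if count == top_count]
    if top_count < min_agree or len(tied) > 1:
        return "ambiguous"
    return top_label


# Toy predictions, invented for illustration.
print(consensus_localization(
    {"TargetP": "mitochondrion", "Predotar": "mitochondrion", "MultiLoc2": "nucleus"}
))
```

Sequences flagged as ambiguous by such a rule would be natural candidates for manual inspection or experimental confirmation.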
Phylogenetic Analyses of PRORP
Phylogenetic analyses of PRORP protein sequences were performed with the maximum-likelihood method with 100 bootstrap replicates (Dereeper et al. 2008).
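Bootstrap support in phylogenetics is commonly obtained by resampling alignment columns with replacement and rebuilding a tree from each pseudo-alignment. The sketch below illustrates only the resampling step for 100 replicates; tree inference itself is left to dedicated software. The function name, the dictionary layout, and the toy alignment are assumptions made for illustration.

```python
import random


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Resample alignment columns with replacement.

    alignment maps sequence names to aligned sequences of equal length.
    Returns a list of pseudo-alignments with the same names and length.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")
    replicates = []
    for _ in range(n_replicates):
        # Draw column indices with replacement, one per original column.
        cols = [rng.randrange(length) for _ in range(length)]
        replicates.append(
            {name: "".join(alignment[name][i] for i in cols) for name in names}
        )
    return replicates


# Toy alignment, invented for illustration.
toy = {"seqA": "ACDE", "seqB": "ACDF", "seqC": "GCDE"}
reps = bootstrap_replicates(toy, n_replicates=100, seed=1)
print(len(reps), reps[0])
```

The fraction of replicate trees that recover a given clade is then reported as the bootstrap support for that clade.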
Acknowledgments
This work was supported by the “Centre National de la Recherche Scientifique,” the University of Strasbourg, the Medical University of Vienna, and the Philipps-University of Marburg. We thank Prof. B.F. Lang for critical discussion on the evolution of RNase P. This work was supported by Agence Nationale de la Recherche (grant PRO-RNase P, ANR 11 BSV8 008 01 to P.G.), LabEx consortium “MitoCross,” the German Research Foundation (grants HA 1672/17-1 and IRTG 1384 to R.K.H.), and the Austrian Science Fund (grant I299 to W.R.).
References
Author notes
Associate editor: Claus Wilke

