Abstract

We recently determined the genome sequence of the Coccolithoviridae strain Emiliania huxleyi virus 86 (EhV-86), a giant double-stranded DNA (dsDNA) algal virus from the family Phycodnaviridae that infects the marine coccolithophorid E. huxleyi. Here, we determine the phylogenetic relationship between EhV-86 and other large dsDNA viruses. Twenty-five core genes common to nuclear-cytoplasmic large dsDNA virus genomes were identified in the EhV-86 genome; sequences from eight of these genes were used to create a phylogenetic tree in which EhV-86 was placed firmly with the two other members of the Phycodnaviridae. We have also identified a 100-kb region of the EhV-86 genome that appears to have been transferred into this genome from an unknown source. Furthermore, the presence of six RNA polymerase subunits (unique among the Phycodnaviridae) suggests both a unique evolutionary history and a unique lifestyle for this intriguing virus.

Introduction

Viruses are traditionally categorized by biological characteristics rather than by evolutionary roots. Classification was originally based on host range and morphology; however, it is now common to classify viruses primarily according to the type of genome (single- or double-stranded [ds] RNA or DNA) (Bamford, Burnett, and Stuart 2002). The advent of the genomic era, with the increased availability of gene and genome sequence data, has allowed the evolutionary relationships within and between families of DNA viruses to begin to be established (Shackelton and Holmes 2004). High evolutionary rates, horizontal gene transfer, and nonorthologous gene displacement make accurate phylogenetic resolution difficult over greater periods of time, especially when using single-gene phylogenetic trees (Filee, Forterre, and Laurent 2003). Consequently, whole-genome studies are becoming increasingly popular for reconstructing evolutionary relationships between virus families.

Recently, whole genome comparisons have identified a group of large dsDNA viruses that are likely to have shared a common ancestor (Iyer, Aravind, and Koonin 2001). The nuclear-cytoplasmic large double-stranded DNA virus (NCLDV) group is composed of at least five families that replicate in the nucleus and/or cytoplasm of eukaryotic cells (Poxviridae, Iridoviridae, Asfarviridae, Phycodnaviridae, and a newly proposed member, Mimiviridae). The common ancestor of these diverse families is likely to have encoded complex systems for DNA replication and transcription, a redox protein, and a possible inhibitor of apoptosis. Nine genes are shared by genomes from all family members (Group I), and a further 22 are found in at least three of the four families analyzed in the original study (Groups II and III) (Iyer, Aravind, and Koonin 2001). The ancestral NCLDV is thought to have had both nuclear and cytoplasmic phases in its life cycle (Iyer, Aravind, and Koonin 2001). Lineage-specific gene loss and gain within the NCLDV families are thought to contribute to the highly diverse characteristics of present-day forms.

Poxviruses, asfarviruses, and iridoviruses encode their own transcription and replication machinery and undergo their replication cycle entirely in the cytoplasm (poxviruses) or start in the nucleus and complete in the cytoplasm (asfarviruses and iridoviruses). Less is known about the members of the highly diverse Phycodnaviridae family (which contains the genera Chlorovirus, Coccolithovirus, Prasinovirus, Prymnesiovirus, Phaeovirus, and Raphidovirus [Wilson et al. 2005b]). Preliminary analysis of a limited number of genomes (the phaeovirus, Ectocarpus siliculosus virus 1 [ESV-1], and the chlorovirus, Paramecium bursaria chlorella virus 1 [PBCV-1]) suggested that members of the Phycodnaviridae were characterized by the loss of genes encoding RNA polymerases, leading to the hypothesis that they have predominantly nuclear life phases (Iyer, Aravind, and Koonin 2001). The recent sequencing of the Emiliania huxleyi virus 86 (EhV-86) genome (a coccolithovirus that infects the haptophyte E. huxleyi) has cast doubt on this assertion (Wilson et al. 2005a). EhV-86 encodes its own RNA polymerase; hence, the Phycodnaviridae family must be even more diverse than previously thought, and the EhV-86 lineage is likely to have diverged from the ancestral Phycodnaviridae earlier than the Phaeovirus and Chlorovirus genera. Furthermore, the recent sequencing of the mimivirus (Raoult et al. 2004) and its putative placement on a branch diverging prior to divergence of the Phycodnaviridae (based upon the absence of RNA polymerase genes) add further complexity to the history of NCLDV evolution. Here, we determine the evolutionary relationships between members of the NCLDV group including, for the first time, data from the recently sequenced EhV-86 genome. The phylogenetic analysis is based on six members of the Group I core proteins and the two large RNA polymerase subunits from Group III identified as being in the ancestral NCLDV genome (Iyer, Aravind, and Koonin 2001).

Materials and Methods

Viral Genome and Protein Sequence

Nucleotide sequences of the complete genomes of large dsDNA viruses and the corresponding predicted protein sequences were downloaded from the Virus Genomes division of the Entrez system (National Center for Biotechnology Information, http://www.ncbi.nlm.nih.gov/genomes/VIRUSES/viruses.html). The complete genomes of 19 viruses, representing the five diverse NCLDV families, were included in this analysis: Asfarviridae: African swine fever virus (Yanez et al. 1995); Poxviridae: Amsacta moorei entomopoxvirus (AMEV) (Bawden et al. 2000), Melanoplus sanguinipes entomopoxvirus (MSEV) (Afonso et al. 1999), bovine papular stomatitis virus (Delhon et al. 2004), fowlpox virus (Afonso et al. 2000), sheeppox virus (Tulman et al. 2002), swinepox virus (Afonso et al. 2002), vaccinia virus (VACV) (Goebel et al. 1990), Molluscum contagiosum virus (MOCV) (Senkevich et al. 1996), myxoma virus (Cameron et al. 1999), Yaba monkey tumor virus (Brunetti et al. 2003); Phycodnaviridae: PBCV-1 (Li et al. 1997), ESV-1 (Delaroque et al. 2001), EhV-86 (Wilson et al. 2005a); Iridoviridae: frog virus 3 (Tan et al. 2004), invertebrate iridescent virus 6 (IIV-6) (Jakob et al. 2001), Regina ranavirus (Jancovich et al. 2003), lymphocystis disease virus 1 (LCDV) (Tidona and Darai 1997); and Mimiviridae: Mimivirus (Raoult et al. 2004).

Sequence and Phylogenetic Analysis

Protein sequences were compared using the BlastP and PSI-Blast programs (http://www.ncbi.nlm.nih.gov/BLAST). Conserved domains (www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml) within six of the Group I proteins (D5-like ATPase, Pfam PF03288; DNA polymerase, Pfam PF00136; A32-like ATPase, SMART SM00382; A18-like helicase, Pfam PF00270; thiol-oxidoreductase; and D6R-like helicase, Pfam PF00176) and the two large RNA polymerase subunits from Group III (rpb1, SMART SM00663; rpb2, Pfam PF00562) were identified in the 19 viral genomes, and these were concatenated for phylogenetic analysis. Multiple alignments were performed using ClustalW (http://clustalw.genome.jp). Because only highly conserved domains were used, no further editing was required after alignment. Phylogenetic analyses of the concatenated alignments were performed using programs from PHYLIP (phylogeny inference package) version 3.6b (Felsenstein 1989), and robustness was tested with the bootstrapping option (SeqBoot). Genetic distances, applicable for distance matrix phylogenetic inference, were calculated using the Protdist program in the PHYLIP package. Phylogenetic inferences based on the distance matrix (Neighbor) and parsimony (Protpars) algorithms were applied to the alignments. For both methods, the best tree or majority rule consensus tree was selected using the consensus program (Consense). The trees were visualized and drawn using the TREEVIEW software version 2.1 (Page 1996).
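The concatenation and bootstrap resampling steps described above can be illustrated with a minimal Python sketch. It is not the PHYLIP implementation: the taxon names and sequences are made-up toy data, the function names are hypothetical, and the simple uncorrected p-distance stands in for the model-based distances that Protdist calculates.

```python
import random

def concatenate(alignments, taxa):
    """Join per-gene alignments (dict: taxon -> aligned sequence) end to end."""
    return {t: "".join(aln[t] for aln in alignments) for t in taxa}

def bootstrap_replicate(supermatrix, rng):
    """Resample alignment columns with replacement, as a bootstrap pseudo-replicate."""
    length = len(next(iter(supermatrix.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {t: "".join(seq[c] for c in cols) for t, seq in supermatrix.items()}

def p_distance(a, b):
    """Fraction of differing sites, ignoring columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)

# Toy data (assumption): two short "conserved domain" alignments for three taxa.
taxa = ["virusA", "virusB", "virusC"]
gene1 = {"virusA": "MKV-L", "virusB": "MKVAL", "virusC": "MRVAL"}
gene2 = {"virusA": "GDTW", "virusB": "GDSW", "virusC": "GESW"}

supermatrix = concatenate([gene1, gene2], taxa)
rng = random.Random(0)  # fixed seed so the resampling is repeatable
replicates = [bootstrap_replicate(supermatrix, rng) for _ in range(100)]
dist_ab = p_distance(supermatrix["virusA"], supermatrix["virusB"])
```

Each replicate would then be passed to the distance and tree-building steps, and the resulting trees summarized as a consensus.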

Results and Discussion

We have previously used phylogenetic analysis of the DNA polymerase gene to propose that EhV-86 belongs to the then new genus “Coccolithovirus,” within the family of algal viruses, Phycodnaviridae (Schroeder et al. 2002). The subsequent sequencing of EhV-86 in its entirety (Wilson et al. 2005a) has allowed us to further characterize and elaborate on the evolutionary history of the Coccolithovirus genus. In the present study, we have determined the phylogenetic relationships between conserved domains among the core NCLDV proteins. Genes were originally assigned to groups based on their conservation profile among NCLDV genomes from the four families in the original study (Asfarviridae, Phycodnaviridae, Poxviridae, and Iridoviridae) (Iyer, Aravind, and Koonin 2001). Group I genes are conserved in all NCLDVs, Group II genes are conserved in all four of the original NCLDV families but missing in one or more lineages within families, Group III genes are conserved in three NCLDV families, and Group IV genes are conserved in two families (table 1).
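The grouping rule can be written as a short function. The following is a hedged sketch of the rule as stated in the text, not code from Iyer, Aravind, and Koonin (2001); the function name and data layout are hypothetical. The example uses the RNA polymerase subunit 1 entries of table 1, restricted to the species shown there.

```python
ORIGINAL_FAMILIES = ("Asfarviridae", "Phycodnaviridae", "Poxviridae", "Iridoviridae")

def classify_core_gene(presence):
    """Classify one gene from its presence map: family -> {species: bool}."""
    families_with_gene = [f for f in ORIGINAL_FAMILIES if any(presence[f].values())]
    n = len(families_with_gene)
    if n == 4:
        # Present in every family: Group I only if no lineage has lost it.
        in_all_species = all(all(presence[f].values()) for f in ORIGINAL_FAMILIES)
        return "Group I" if in_all_species else "Group II"
    if n == 3:
        return "Group III"
    if n == 2:
        return "Group IV"
    return "not core"

# RNA polymerase subunit 1 before the EhV-86 genome was available (from table 1).
rpb1_before = {
    "Asfarviridae": {"ASFV": True},
    "Phycodnaviridae": {"PBCV-1": False, "ESV-1": False},
    "Poxviridae": {"VACV": True, "MOCV": True, "AMEV": True},
    "Iridoviridae": {"LCDV": True, "IIV-6": True},
}
# The same gene once EhV-86 (which encodes rpb1 as ehv064) is included.
rpb1_after = {fam: dict(species) for fam, species in rpb1_before.items()}
rpb1_after["Phycodnaviridae"]["EhV-86"] = True

print(classify_core_gene(rpb1_before))  # Group III
print(classify_core_gene(rpb1_after))   # Group II
```

Under this rule, adding a single phycodnavirus that carries the gene is enough to move it from Group III to Group II.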

Table 1

Presence of NCLDV Core Genes (Groups I, II, and III) in Various NCLDV Genomes


 

Species by family: Phycodnaviridae, EhV-86, PBCV-1, and ESV-1; Mimiviridae, Mimivirus; Iridoviridae, LCDV and IIV-6; Asfarviridae, ASFV; Poxviridae, VACV, MOCV, AMEV, and MSEV. The entries for MSEV could not be recovered from the source table and are not shown.

| Group | Core Gene (EhV-86 Gene ID) | EhV-86 | PBCV-1 | ESV-1 | Mimivirus | LCDV | IIV-6 | ASFV | VACV | MOCV | AMEV |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I | Vaccinia Virus D5-Type ATPase (ehv459)a | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| I | DNA Polymerase (ehv030) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| I | Vaccinia Virus A32-Type ATPase (ehv072) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| I | Vaccinia Virus A18-Type Helicase (ehv104) | ✓ | ✓ | ✓ | ✓ |  | ✓ | ✓ | ✓ | ✓ | ✓ |
| I | Capsid Protein (ehv085) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| I | Thiol-oxidoreductase (ehv128) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| I | Vaccinia Virus D6R-Type Helicase (ehv141) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| I | Ser/Thr Protein Kinase (ehv141) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| I | VLTF2-Like Transcription Factor (ehv438) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| II | TFII-Like Transcription Factor (ehv105) | ✓ | ✓ |  | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| II | MutT-Like NTP Pyrophosphohydrolase (ehv398) | ✓ | ✓ |  | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| II | Myristoylated Virion Protein A |  | ✓ |  | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| II | Proliferating Cell Nuclear Antigen (ehv020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |  |
| II | Ribonucleotide Reductase, Large Subunit (ehv428) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |  |  |
| II | Ribonucleotide Reductase, Small Subunit (ehv026) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |  |  |
| II | Thymidylate Kinase (ehv431) | ✓ | ✓ |  |  | ✓ | ✓ | ✓ | ✓ |  |  |
| II | dUTPase (ehv397) | ✓ | ✓ |  |  |  | ✓ | ✓ | ✓ | ✓ | ✓ |
| III | A494R-Like Uncharacterized Protein (ehv403) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |  |  |  |
| III | RuvC-Like Holliday Junction Resolvase |  |  | ✓ |  |  | ✓ |  | ✓ | ✓ | ✓ |
| III | BroA-Like |  |  | ✓ | ✓ |  | ✓ |  |  |  | ✓ |
| III | Capping Enzyme (ehv453) | ✓ | ✓ |  | ✓ |  |  | ✓ | ✓ | ✓ | ✓ |
| III | ATP-Dependent Ligase (ehv158) | ✓ | ✓ |  | ✓ |  |  | ✓ | ✓ |  |  |
| III | RNA Polymerase, Subunit 1 (ehv064)b | ✓ |  |  | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| III | RNA Polymerase, Subunit 2 (ehv434)b | ✓ |  |  | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| III | Thioredoxin/Glutaredoxin (ehv465) | ✓ | ✓ | ✓ | ✓ |  | ✓ |  | ✓ | ✓ | ✓ |
| III | Ser/Thr Phosphatase |  | ✓ |  | ✓ |  | ✓ |  | ✓ |  | ✓ |
| III | BIR Domain (ehv166)b | ✓ |  |  | ✓ |  | ✓ | ✓ |  |  | ✓ |
| III | Virion-Associated Membrane Protein |  |  |  | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| III | Topoisomerase II (ehv166) | ✓ | ✓ |  | ✓ |  | ✓ | ✓ |  |  |  |
| III | SW1/SNF1 Family Helicase |  | ✓ |  | ✓ | ✓ | ✓ |  |  |  |  |
| III | RNA Polymerase, Subunit 10 (ehv167)b | ✓ |  |  |  |  | ✓ | ✓ | ✓ |  |  |

NOTE.—Based on data from this study and Bawden et al. (2000), Iyer, Aravind, and Koonin (2001), and Raoult et al. (2004). ASFV, African swine fever virus; NTP, nucleoside triphosphate; dUTPase, deoxyuridine triphosphatase; ATP, adenosine triphosphate.

a Gene ID for EhV-86.

b Reclassified as Group II core genes.

Identification of Core NCLDV Genes

BlastP searches of EhV-86 coding sequence (CDS) against NCLDV genomes were performed in order to identify NCLDV core genes. Homologues were identified for 9/9 Group I genes, 7/8 Group II genes, and 9/14 Group III genes in the EhV-86 genome (table 1). The pattern of presence/absence of Group I, II, and III core genes for the three sequenced phycodnavirus genomes (EhV-86, PBCV-1, and ESV-1), the mimivirus, and two examples from the iridoviruses (IIV-6 and LCDV), entomopoxviruses (AMEV and MSEV), and chordopoxviruses (VACV and MOCV) is shown in table 1.
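As a quick check of these counts, the EhV-86 row of table 1 can be tallied per group. In the sketch below, each string is our transcription of the table (1 for a homologue identified, 0 for none, in the table's gene order); the variable and function names are illustrative only.

```python
# EhV-86 presence profile transcribed from table 1 (1 = homologue identified).
EHV86_PROFILE = {
    "Group I": "111111111",
    "Group II": "11011111",         # the myristoylated virion protein is absent
    "Group III": "10011111010101",
}

def tally(profile):
    """Return (genes found, genes in group) for each core gene group."""
    return {group: (bits.count("1"), len(bits)) for group, bits in profile.items()}

counts = tally(EHV86_PROFILE)
total_found = sum(found for found, _ in counts.values())

for group, (found, size) in counts.items():
    print(f"{group}: {found}/{size}")
print("Total core genes in EhV-86:", total_found)
```

The tally reproduces the 9/9, 7/8, and 9/14 figures, for 25 core genes in total.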

Homologues of Group I genes were readily identified by a basic BlastP search, except for ehv141, which showed a single significant match (E = 1 × 10^-44) to a Chilo iridescent virus genome helicase, 030L. A further PSI-Blast search revealed a significant hit to mimivirus R350, previously identified as a VV6R-type helicase (Raoult et al. 2004). The only Group II gene missing in EhV-86 encodes a putative myristoylated protein. A BlastP search did reveal a weak hit (E = 1.5) to a putative myristoylated protein in the mimivirus, but further PSI-Blast searches failed to confirm this. Therefore, in common with ESV-1, a significant myristoylated protein homologue has not been identified in the EhV-86 genome (table 1).

In common with the other NCLDV genomes, EhV-86 has a distinctive pattern of presence/absence of Group III NCLDV genes. Of the five Group III genes that are missing from both of the previously sequenced Phycodnaviridae genomes, the EhV-86 genome contains homologues of four. Three of these are core transcription-related genes (RNA polymerase subunits rpb1, rpb2, and rpb10, encoded by ehv064, ehv434, and ehv167, respectively) which have not previously been found in the Phycodnaviridae. The identification of these genes indicates that the transcription of at least some EhV-86 genes can occur in the cytoplasm, an enticing hypothesis given further credence by the confirmed expression of CDSs associated with a putative unique promoter element within the EhV-86 genome (Allen, Schroeder, and Wilson 2005; Wilson et al. 2005a). Indeed, not only are these three Group III core transcription-related genes present, but homologues of the other RNA polymerase subunits rpb3 (ehv399), rpb5 (a Group IV NCLDV gene, ehv108), and rpb6 (ehv458) are also found in the EhV-86 genome. Phylogenetic analysis of the RNA polymerase subunit genes shows that they are derived from the ancestral NCLDV RNA polymerase genes and are unlikely to have been acquired by horizontal gene transfer (fig. 1). The identification of RNA polymerase in EhV-86 also has implications for the present classification system of core genes. The genes for RNA polymerase subunits 1, 2, and 10 (and also the baculovirus inhibitor of apoptosis protein repeat [BIR] domain–containing gene) should now be regarded as Group II and not Group III core NCLDV genes because they are now known to be present in the Phycodnaviridae family (table 1).
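The comparison of Group III genes across the three phycodnaviruses amounts to simple set arithmetic. The sketch below uses gene sets transcribed from table 1; the short gene labels and variable names are ours and are only illustrative.

```python
# Group III core genes and their presence in the three phycodnavirus genomes,
# transcribed from table 1 (short labels are illustrative).
GROUP_III = {
    "A494R-like", "RuvC-like resolvase", "BroA-like", "capping enzyme",
    "ATP-dependent ligase", "rpb1", "rpb2", "thioredoxin/glutaredoxin",
    "Ser/Thr phosphatase", "BIR domain", "virion membrane protein",
    "topoisomerase II", "SW1/SNF1 helicase", "rpb10",
}
PRESENT = {
    "EhV-86": {"A494R-like", "capping enzyme", "ATP-dependent ligase", "rpb1",
               "rpb2", "thioredoxin/glutaredoxin", "BIR domain",
               "topoisomerase II", "rpb10"},
    "PBCV-1": {"A494R-like", "capping enzyme", "ATP-dependent ligase",
               "thioredoxin/glutaredoxin", "Ser/Thr phosphatase",
               "topoisomerase II", "SW1/SNF1 helicase"},
    "ESV-1": {"A494R-like", "RuvC-like resolvase", "BroA-like",
              "thioredoxin/glutaredoxin"},
}

# Genes absent from both previously sequenced phycodnaviruses.
missing_in_both = GROUP_III - (PRESENT["PBCV-1"] | PRESENT["ESV-1"])
# Of those, the genes for which EhV-86 has a homologue.
found_in_ehv86 = missing_in_both & PRESENT["EhV-86"]

print(sorted(missing_in_both))
print(sorted(found_in_ehv86))
```

The first set contains five genes and the second four (rpb1, rpb2, rpb10, and the BIR domain gene); only the virion-associated membrane protein remains unaccounted for in all three genomes.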

FIG. 1.—

Phylogenetic inference tree based on a distance matrix algorithm between the conserved domains from the two largest RNA polymerase subunits from members of the NCLDV group (Neighbor, in PHYLIP version 3.6b). Numbers at nodes indicate bootstrap values retrieved from 100 replicates for both the neighbor-joining and parsimony analyses. The bar depicts 1 base substitution per 10 amino acids.

The distinctive pattern of core gene loss/retention in NCLDV genomes suggests a complicated history of independent gene loss events and makes the reconstruction of the Phycodnaviridae lineage in particular (on the basis of gene loss events) difficult (table 1). In order to shed light on the history of the Phycodnaviridae, we performed phylogenetic characterization using the conserved domains from conserved core genes. This approach has been used successfully in previous studies to determine phylogenetic relationships among the NCLDV genomes (Iyer, Aravind, and Koonin 2001; Raoult et al. 2004). By including sequence from the EhV-86 genome, we aimed to further define these relationships. Previously, the mimivirus has been placed equidistant between members of the Iridoviridae and the Phycodnaviridae (Raoult et al. 2004). The presence of the RNA polymerase in the EhV-86 genome initially suggested to us that EhV-86 may form a missing link between the Mimiviridae and the Phycodnaviridae. However, this does not appear to be the case (fig. 2). The phylogenetic analysis shows the mimivirus diverging independently from the common “NCLDV ancestor” and not clustering with the family Phycodnaviridae. Clustering did occur between the Iridoviridae and Mimiviridae, suggesting that the mimivirus may be closer to the highly diverse Iridoviridae family. However, this relationship remains tentative because the bootstrap values at the node of this cluster are weak (fig. 2). Phylogenetic analysis using only the concatenated sequence from the conserved domains of the Group I core genes produced a tree identical in shape; the addition of the RNA polymerase subunits only increased the distance between the branching of EhV-86 and that of PBCV-1 and ESV-1 (fig. 1). Consequently, the addition of the RNA polymerase subunits did not bias or skew the relationships between the NCLDV families.
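Bootstrap support for a cluster is the percentage of replicate trees in which that cluster appears. The following minimal sketch shows the calculation on made-up replicate data; the counts are invented purely for illustration and are not the bootstrap values of figure 2, and the representation of a tree as a set of clades is our simplification rather than the Consense algorithm.

```python
from collections import Counter

def clade_support(replicate_trees):
    """Percentage of replicate trees containing each clade.

    Each tree is given as a set of clades; each clade is a frozenset of taxa.
    """
    counts = Counter(clade for tree in replicate_trees for clade in tree)
    n = len(replicate_trees)
    return {clade: 100 * c / n for clade, c in counts.items()}

phyco = frozenset({"EhV-86", "PBCV-1", "ESV-1"})
mimi_irido = frozenset({"Mimivirus", "IIV-6", "LCDV"})

# Invented replicate outcomes (assumption): 100 bootstrap trees in total.
trees = [{phyco, mimi_irido}] * 55 + [{phyco}] * 35 + [set()] * 10

support = clade_support(trees)
# Majority-rule consensus keeps clades found in more than half of the replicates.
majority = {clade for clade, pct in support.items() if pct > 50}
```

In this toy data set both clusters pass the majority-rule threshold, but one is recovered in 90% of replicates and the other in only 55%, which is the kind of contrast that separates a firmly placed cluster from a tentative one.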

FIG. 2.—

Phylogenetic inference tree based on a distance matrix algorithm between the concatenated conserved domains from A18-like helicase, D6R-like helicase, A32-like ATPase, D5-like ATPase, DNA polymerase, thiol-oxidoreductase, and the two largest RNA polymerase subunits from members of the NCLDV group (Neighbor, in PHYLIP version 3.6b). Numbers at nodes indicate bootstrap values retrieved from 100 replicates for both the neighbor-joining and parsimony analyses. The boxed insert depicts the relationship between the three members of the family Phycodnaviridae when the RNA polymerase subunits are removed from the alignment. The bar depicts 1 base substitution per 10 amino acids.

It appears that the ancestral Phycodnaviridae lineage diverged, with one branch giving rise to EhV-86 and the second branch giving rise to the PBCV-1 and ESV-1 lineages (fig. 2). We suggest that the trigger for this divergence was the loss of RNA polymerase function (through the loss of one or many RNA polymerase subunits). The change in lifestyle represented by this loss (i.e., from nuclear-independent to nuclear-dependent transcription) could account for the high diversity among present-day Phycodnaviridae genera. Because of the presence of the RNA polymerase subunits, we believe that, of all the phycodnaviruses sequenced to date, EhV-86 has the lifestyle most similar to that of the ancestral phycodnavirus. The sequencing of more Phycodnaviridae genomes, in particular from the Prasinovirus, Raphidovirus, and Prymnesiovirus genera, will shed further light on this topic.

Distribution of NCLDV Homologues

We identified 25 Group I, II, and III genes in the EhV-86 genome. Intriguingly, these genes are physically located within the regions 0–156 kbp and 330–407 kbp, with none found in the region 156–330 kbp (fig. 3). This core gene–sparse region was previously identified as containing noncoding repeat elements, likely to be promoters, directly upstream of the start site of 87 predicted CDSs (Allen, Schroeder, and Wilson 2005; Wilson et al. 2005a) (fig. 3). Because strong expression has been shown for CDSs in this 100-kb region, these CDSs are likely to play crucial roles during infection by EhV-86 (Wilson et al. 2005a). Annotation of genes in this region reveals that the vast majority of CDSs are of unknown function, with little or no homology to anything in the GenBank database. Only one CDS has significant sequence similarity to a CDS in the other NCLDV genomes: ehv230, which has a single hit (E = 2 × 10^-15) to PBCV-1 A50L. This region, however, has similar G + C content and codon usage to the rest of the genome. We postulate that an ancestral EhV-86 genome acquired this region from an as yet unknown source at some point after the Coccolithoviridae lineage diverged from the ancestral phycodnavirus.
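A core gene–free region of this kind can be located by finding the largest interval between neighboring core genes on the circular genome. The sketch below is only an illustration: the genome length (407 kbp) and the region boundaries (156 and 330 kbp) follow the text, but the remaining positions are hypothetical placeholders and not the real EhV-86 gene coordinates.

```python
def largest_gap(positions, genome_length):
    """Largest interval without any listed gene on a circular genome.

    Returns (gap length, start, end) in the same units as the input.
    """
    pts = sorted(positions)
    gaps = [(b - a, a, b) for a, b in zip(pts, pts[1:])]
    # Interval that wraps around the origin of the circular map.
    gaps.append((pts[0] + genome_length - pts[-1], pts[-1], pts[0]))
    return max(gaps)

GENOME_LENGTH_KBP = 407
# Hypothetical core gene positions in kbp (assumption), apart from 156 and 330.
core_gene_positions = [12, 60, 101, 156, 330, 371, 400]

gap, start, end = largest_gap(core_gene_positions, GENOME_LENGTH_KBP)
print(f"Largest core gene-free interval: {start}-{end} kbp ({gap} kbp)")
```

Applied to the real coordinates of the 25 core genes, the same calculation would delimit the region shown in figure 3.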

FIG. 3.—

Circular representation of the 407,339 bp EhV-86 genome. The outside scale is numbered clockwise in kilobase pair. Circles 1 and 2 (from outside in) are CDSs (forward and reverse strands, respectively) color-coded by putative function: light green, no known function; dark green, no known function, but contains transmembrane helices; gray, miscellaneous; sky blue, degradation of large molecules; red, information transfer; yellow, central or intermediary metabolism; pink, virus specific; and light blue, kinases. Circles 3 and 4 are core NCLDV genes (forward and reverse strands, respectively, shown in bright pink). Circle 5 shows the positions of the putative promoter elements known as Family A repeats (green). Circle 6 shows the position of CCNCCNCCN repeats (in blue) known as Family B repeats. Circle 7, G + C content.

Although we cannot as yet discount the possibility that transfer occurred from the host to the virus, it is unlikely because no signal for these CDSs was detected in uninfected E. huxleyi cells during an EhV-86 microarray analysis (Wilson et al. 2005a). For host-to-virus transfer to be consistent with this result, the entire region would have to have been subsequently lost from the E. huxleyi lineage, or the transfer would have to have occurred sufficiently long ago, and the region been placed under sufficiently high evolutionary pressure in EhV-86, for the DNA sequences to have diverged beyond detection on the EhV-86 microarray. Furthermore, the recent sequencing of over 1,500 E. huxleyi expressed sequence tags (ESTs) failed to provide a significant match to any CDS in the EhV-86 genome (Wahlund et al. 2004). However, this question will only be resolved when the complete genomic sequence of E. huxleyi is known.

Conclusions

The presence of six RNA polymerase subunits in the EhV-86 genome indicates a unique lifestyle for this Coccolithovirus. Whereas the previously sequenced Phycodnaviridae viruses appear to have (on the basis of their genomic content) predominantly nuclear lifestyles, EhV-86 appears to have the capacity, at least, to transcribe parts of its genome in the cytoplasm. This points to the presence of distinct subfamilies within the Phycodnaviridae family. We predict that the Coccolithoviridae will eventually be renamed the Coccolithovirinae to identify them as a subfamily within the Phycodnaviridae. Furthermore, we have identified a 100-kbp region of the EhV-86 genome in which the CDSs have little or no homology to anything in the databases. No conserved core genes are found in this region. It is therefore likely that this region was acquired by an ancestral Coccolithovirus genome at some point after the divergence of the Coccolithovirus genus from the other Phycodnaviridae genera. The evolution of the Coccolithovirus genus is evidently complex, and the relevance of this large transfer of genomic information remains to be determined. The sequencing of more Coccolithoviridae genomes will provide further insights into this unique virus genus. As further genomic characterization of the Phycodnaviridae family is performed, we hypothesize that the current genera within the family will be recognized as distinct subfamilies in their own right on the basis of their high diversity.

Charles Delwiche, Associate Editor

The research was supported by the Environmental Genomics community program, funded by the Natural Environment Research Council of the United Kingdom (NERC), through award number NE/A509332/1 to W.H.W. D.C.S. is a Marine Biological Association of the United Kingdom Research Fellow funded by grant in aid from the NERC. W.H.W. is supported through the NERC-funded core strategic research program of the Plymouth Marine Laboratory.

References

Afonso, C. L., E. R. Tulman, Z. Lu, E. Oma, G. F. Kutish, and D. L. Rock. 1999. The genome of Melanoplus sanguinipes entomopoxvirus. J. Virol. 73:533–552.
Afonso, C. L., E. R. Tulman, Z. Lu, L. Zsak, G. F. Kutish, and D. L. Rock. 2000. The genome of fowlpox virus. J. Virol. 74:3815–3831.
Afonso, C. L., E. R. Tulman, Z. Lu, L. Zsak, F. A. Osorio, C. Balinsky, G. F. Kutish, and D. L. Rock. 2002. The genome of swinepox virus. J. Virol. 76:783–790.
Allen, M. J., D. C. Schroeder, and W. H. Wilson. 2005. Identification and preliminary characterisation of three distinct repeat families within the genome of Emiliania huxleyi Virus 86. Arch. Virol. (in press).
Bamford, D. H., R. M. Burnett, and D. I. Stuart. 2002. Evolution of viral structure. Theor. Popul. Biol. 61:461–470.
Bawden, A. L., K. J. Glassberg, J. Diggans, R. Shaw, W. Farmerie, and R. W. Moyer. 2000. Complete genomic sequence of the Amsacta moorei entomopoxvirus: analysis and comparison with other poxviruses. Virology 274:120–139.
Brunetti, C. R., H. Amano, Y. Ueda, J. Qin, T. Miyamura, T. Suzuki, X. Li, J. W. Barrett, and G. McFadden. 2003. Complete genomic sequence and comparative analysis of the tumorigenic poxvirus Yaba monkey tumor virus. J. Virol. 77:13335–13347.
Cameron, C., S. Hota-Mitchell, L. Chen, J. Barrett, J. X. Cao, C. Macaulay, D. Willer, D. Evans, and G. McFadden. 1999. The complete DNA sequence of myxoma virus. Virology 264:298–318.
Delaroque, N., D. G. Muller, G. Bothe, T. Pohl, R. Knippers, and W. Boland. 2001. The complete DNA sequence of the Ectocarpus siliculosus virus EsV-1 genome. Virology 287:112–132.
Delhon, G., E. R. Tulman, C. L. Afonso, Z. Lu, A. de la Concha-Bermejillo, H. D. Lehmkuhl, M. E. Piccone, G. F. Kutish, and D. L. Rock. 2004. Genomes of the parapoxviruses ORF virus and bovine papular stomatitis virus. J. Virol. 78:168–177.
Felsenstein, J. 1989. PHYLIP—phylogeny inference package (version 3.2). Cladistics 5:164–166.
Filee, J., P. Forterre, and J. Laurent. 2003. The role played by viruses in the evolution of their hosts: a view based on informational protein phylogenies. Res. Microbiol. 154:237–243.
Goebel, S. J., G. P. Johnson, M. E. Perkus, S. W. Davis, J. P. Winslow, and E. Paoletti. 1990. The complete DNA sequence of vaccinia virus. Virology 179:247–266, 517–563.
Iyer, L. M., L. Aravind, and E. V. Koonin. 2001. Common origin of four diverse families of large eukaryotic DNA viruses. J. Virol. 75:11720–11734.
Jakob, N. J., K. Muller, U. Bahr, and G. Darai. 2001. Analysis of the first complete DNA sequence of an invertebrate Iridovirus: coding strategy of the genome of Chilo iridescent virus. Virology 286:182–196.
Jancovich, J. K., J. Mao, V. G. Chinchar et al. (11 co-authors). 2003. Genomic sequence of a ranavirus (family Iridoviridae) associated with salamander mortalities in North America. Virology 316:90–103.
Li, Y., Z. Lu, L. Sun, S. Ropp, G. F. Kutish, D. L. Rock, and J. L. Van Etten. 1997. Analysis of 74 kb of DNA located at the right end of the 330-kb chlorella virus PBCV-1 genome. Virology 237:360–377.
Page, R. D. 1996. TreeView: an application to display phylogenetic trees on personal computers. Comput. Appl. Biosci. 12:357–358.
Raoult, D., S. Audic, C. Robert, C. Abergel, P. Renesto, H. Ogata, B. La Scola, M. Suzan, and J. M. Claverie. 2004. The 1.2-megabase genome sequence of Mimivirus. Science 306:1344–1350.
Schroeder, D. C., J. Oke, G. Malin, and W. H. Wilson. 2002. Coccolithovirus (Phycodnaviridae): characterisation of a new large dsDNA algal virus that infects Emiliania huxleyi. Arch. Virol. 147:1685–1698.
Senkevich, T. G., J. J. Bugert, J. R. Sisler, E. V. Koonin, G. Darai, and B. Moss. 1996. Genome sequence of a human tumorigenic poxvirus: prediction of specific host response-evasion genes. Science 273:813–816.
Shackelton, L. A., and E. C. Holmes. 2004. The evolution of large DNA viruses: combining genomic information of viruses and their hosts. Trends Microbiol. 12:458–465.
Tan, W. G., T. J. Barkman, V. Gregory Chinchar, and K. Essani. 2004. Comparative genomic analyses of frog virus 3, type species of the genus Ranavirus (family Iridoviridae). Virology 323:70–84.
Tidona, C. A., and G. Darai. 1997. The complete DNA sequence of lymphocystis disease virus. Virology 230:207–216.
Tulman, E. R., C. L. Afonso, Z. Lu, L. Zsak, J. H. Sur, N. T. Sandybaev, U. Z. Kerembekova, V. L. Zaitsev, G. F. Kutish, and D. L. Rock. 2002. The genomes of sheeppox and goatpox viruses. J. Virol. 76:6054–6061.
Wahlund, T. M., A. R. Hadaegh, R. Clark, B. Nguyen, M. Fanelli, and B. A. Read. 2004. Analysis of expressed sequence tags from calcifying cells of marine coccolithophorid (Emiliania huxleyi). Mar. Biotechnol. (NY) 6:278–290.
Wilson, W. H., D. C. Schroeder, M. J. Allen et al. (17 co-authors). 2005a. Complete genome sequence and lytic phase transcription profile of a Coccolithovirus. Science 309:1090–1092.
Wilson, W. H., J. L. Van Etten, D. S. Schroeder, K. Nagasaki, C. Brussaard, N. Delaroque, G. Bratbak, and C. Suttle. 2005b. Family: Phycodnaviridae. Pp. 163–175 in C. M. Fauquet, M. A. Mayo, J. Maniloff, U. Desselberger, and L. A. Ball, eds. Virus taxonomy, VIIIth ICTV report. Elsevier/Academic Press, London.
Yanez, R. J., J. M. Rodriguez, M. L. Nogal, L. Yuste, C. Enriquez, J. F. Rodriguez, and E. Vinuela. 1995. Analysis of the complete nucleotide sequence of African swine fever virus. Virology 208:249–278.

Author notes

*Plymouth Marine Laboratory, Prospect Place, The Hoe, Plymouth, United Kingdom; †Marine Biological Association, Citadel Hill, Plymouth, United Kingdom; and ‡The Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom