Abstract

We recently reported phylogenetic evidence to support the presence of enzootic transmission foci of yellow fever virus (YFV) in Peru [Bryant et al., Emerg. Infect. Dis. (2003)]. Because the prevailing paradigm of YFV transmission in Brazil is that of ‘wandering epizootics’ rather than discrete enzootic foci, we have now compared the molecular phylogenies of YFV isolates from Peru and Brazil, and re-examined the question of virus mobility by mapping the spatio-temporal distribution of genetic variants from these areas. Sequences were obtained from 50 strains of YFV collected between 1954 and 2000 for two genomic regions: a fragment comprising 223 codons of the structural proteins (premembrane and envelope genes, ‘prM/E’), and a distal region spanning the carboxy terminus of NS5 and part of the 3′ non-coding region (‘EMF’). Peruvian and Brazilian isolates formed two monophyletic clades with no evidence to support recombination between lineages. Variation within both coding and non-coding regions revealed similar substitution rates and overall levels of diversity within each clade. The branching structure of the prM/E and EMF trees of Brazilian sequences showed strong agreement of intra-lineage relationships; in contrast, the EMF sequences of Peruvian isolates failed to fully support the subclade structure of the prM/E phylogeny. These phylogenies suggest that transmission cycles of YFV in Peru and Brazil may sometimes be maintained within specific locales, but have also on occasion become very widely dispersed.

Introduction

Yellow fever (YF) is an important arboviral disease that has re-emerged in South America and Africa in the last two decades [2,3]. Despite a highly effective vaccine, there has been an upsurge of YF activity in Peru and Brazil in the last decade. Over 2000 cases have been reported in South America since 1998, and reports of confirmed cases are believed to vastly under-report the true incidence of disease [4]. Urban YF (i.e. transmission of yellow fever virus (YFV) by the peridomestic mosquito Aedes aegypti) has not been reported in Brazil since 1942 [5]. However, many densely populated coastal cities in South America are infested by A. aegypti. Surveillance and monitoring of YF endemic/epidemic viral activity is thus a critically important public health objective.

YFV is the prototype member of the genus Flavivirus, family Flaviviridae. It has a positive-sense, single-stranded RNA genome of approximately 10 kb encoding a single long open reading frame (ORF) that is cleaved into three structural (C, prM, E) and seven non-structural (NS1, NS2A, NS2B, NS3, NS4A, NS4B, NS5) proteins. In addition, the genome contains 3′ and 5′ non-coding regions that are essential to virus replication, and are suggested to contain important determinants of viral infectivity for various mosquito species [6]. Early studies of YFV strains from different geographic locations established the existence of distinctive variants among New and Old World isolates based on immunochemical properties of the E protein [7,8]. More recently, nucleotide sequencing studies of structural gene regions [9] and NS4A and 3′NCR regions [6] delineated seven genotypes of YFV worldwide: five genotypes in Africa, and two in South America. The Brazilian and Peruvian YFVs represent the two major South American YFV genotypes I and II, respectively.

The principal vector of YFV in South America is Haemagogus janthinomys; however, other species of this genus and also of the genus Sabethes play a role in the maintenance cycle of YFV. Monkeys are believed to be the main hosts and the source of amplification of the virus [10,11]. The most widely accepted paradigm of YFV ecoepidemiology is that of ‘epizootic waves’ in which the virus reservoir is ‘constantly moving’ from place to place rather than being maintained over time in the same location. The periodicity of YF epidemics and fluctuations of transmission intensity of the virus have been attributed to the level of immunity in human and simian populations [5,10,12–15], as well as climatic factors affecting vector populations [16].

Our laboratory has recently studied the geographic and temporal distribution of YF variants in Peru, and shown that rather than circulating (‘wandering’) as one intermixing population, different subpopulations of YFV appear to persist within discrete foci within the foothills of the Andes high forest. The current study was undertaken to corroborate evidence for geographic subpopulations of the Peruvian YFVs by examining a distal region of the genome. In addition, we wished to compare patterns of genetic diversity among YFV isolates from Brazil and Peru. We examined two regions of the YFV genome, a fragment comprising 223 codons of the structural proteins (premembrane and envelope, prM/E), and a distal region spanning the carboxy terminus of NS5 and part of the 3′ non-coding region (EMF). Our aim was to describe the spatio-temporal distribution of variants, and examine evidence that YFV is maintained within circumscribed foci of endemicity. In addition to shedding light on YFV genetics and evolution, these results may be relevant to targeting vaccination strategies and vector control efforts.

Materials and methods

Virus isolates used in this study

Fifty virus isolates from Peru and Brazil were obtained from the World Arbovirus Reference Collection, at the University of Texas Medical Branch (UTMB), Galveston, TX (Table 1). Fig. 1 depicts the geographic places of origin for the viruses used in this study. The Peruvian viruses were originally isolated within the laboratories of the Instituto Nacional de Salud (INS) and NMRCD in Lima, at intervals from 1977 to 1999. The Peruvian isolates represent seven of the 14 hydrographic river basins identified as YF endemic zones in Peru. With the exception of Peru81b (isolate 1914b) which was collected from a sentinel mouse, all the Peruvian strains were obtained from human clinical samples. A detailed study of the prM/E gene sequences of these isolates was recently reported [1].

Table 1

List of 50 South American yellow fever strains used in this study

| Strain ID | Sequence ID | Passage history | Source | Department | Community |
|---|---|---|---|---|---|
| BeH 111 | BRAZIL54 | C6/36, SM10 | Human | Para | Oriboca |
| BeAR 189 | BRAZIL55C | SM1 C6/36#1 | Sabethes sp. | Para | Pirelli Marituba |
| BeAN 23536 | BRAZIL60 | SM1 C6/36#1 | Monkey-macaco | Para | Belem Brasilia Km94 |
| BeAR 46299 | BRAZIL62A | C6/36#1 | Haemagogus sp. | Para | Belem Brasilia Km94 |
| BeAR 44824 | BRAZIL62B | SM1 C6/36#1 | Haemagogus sp. | Para | Belem Brasilia km87 |
| BeAN 142028 | BRAZIL68A | C6/36#1 | Monkey-macaco | Para | Abaetetuba |
| BeAR 142658 | BRAZIL68C | SM2, c6/36#1 | Haemagogus sp. | Para | Barcarena |
| BeAN 142027 | BRAZIL68D | Original, C6/36#1 | Saguinus midas | Para | Abaetetuba |
| BeH 203410 | BRAZIL71 | Original, C6/36#1 | Human | Para | Peixe Boi |
| BeAR 233164 | BRAZIL73A | Mosq 4 | Haemagogus sp. | Goias | Pirenopolis |
| BeAR 232869 | BRAZIL73B | Mosq 1, SM2 | Haemagogus sp. | Goias | Faz. Cangalha Formosa |
| BeAR 233436 | BRAZIL73C | Original, C6/36#1 | Haemagogus sp. | Goias | Bela Vista |
| BeH 350698 | BRAZIL78A | SM2, Mosq 1 | Human | Para | Tome Acu |
| BeH 379501 | BRAZIL80C | SM2 | Human | Maranhao | Imperatriz |
| BeH 425381 | BRAZIL84A | Original, C6/36#1 | Human | Amapa | Tribo Oyampi |
| BeAR 511437 | BRAZIL91A | Original, C6/36#1 | Haemagogus sp. | Para | Bacarena |
| BeH 511843 | BRAZIL91B | SM1, C6/36#2 | Human | Roraima | Tribo Yanomamy |
| BeAR 512943 | BRAZIL92A | C6/36#1 | Hg. janthinomys | Mato Grosso | Sidrolandia |
| BeAR 513008 | BRAZIL92B | SM1 C6/36#1 | Sabethes sp. | Mato Grosso | Sidrolandia |
| BeH 512722 | BRAZIL92C | SM2, C6/36#1 | Human | Mato Grosso | Campo Grande |
| BeAR 513292 | BRAZIL92E | SM1 C6/36#1 | Sabethes sp. | Mato Grosso | Jaraguari |
| BeAR 527785 | BRAZIL94A | SM1 C6/36#1 | Sabethes sp. | Minas Gerais | Arinos |
| BeAR 527198 | BRAZIL94B | SM1 C6/36#1 | Haemagogus sp. | Minas Gerais | Arinos |
| BeAR 544276 | BRAZIL96A | SM1 C6/36#1 | Haemagogus sp. | Rondonia | Cabixi |
| BeAR 628124 | BRAZIL2000A | SM1 C6/36#1 | Hg. janthinomys | Tocantins | Parana |
| 1362/77 | PERU77A | C6/36#2 | Human | Ayacucho | San Francisco |
| 1368 | PERU77B | SM1, Vero1, C6/36#2 | Human | Ayacucho | Tribolina |
| 1371 | PERU77C | SM1, Vero1, C6/36#2 | Human | Ayacucho | Chontacocha |
| 287/78 | PERU78 | SM1, Mosq 2 | Human | Ayacucho | San Francisco |
| R 35740 | PERU79 | SM1, Mosq 2 | Human | Ayacucho | Alto Montaro |
| 1899/81 | PERU81A | SM1 | Human | Cusco | Cusco |
| 1914b | PERU81B | MILLC 1, LLCMK2, Vero 1, C6/36#1 | Sentinel mouse | Cusco | Cusco |
| ARVO544 | PERU95A | SM1, Vero1, C6/36#2 | Human | San Martin | Tocache Huaquisha |
| HEB4224 | PERU95B | SM1, C6/36#1 | Human | San Martin | Tocache N.Progresso |
| HEB4236 (153) | PERU95C | C6/36#1 | Human | Pasco | Oxapampa Villa Rica |
| 149 | PERU95D | SM1, C6/36#1 | Human | Pasco | No data |
| Cepa#2 | PERU95E | SM1, C6/36#1 | Human | Puno | No data |
| Cepa#1 | PERU95F | C6/36#2 | Human | Puno | No data |
| OBS 2240 | PERU95G | C6/36#2 | Human | Huanuco | Hermil |
| OBS 2250 | PERU95H | SM1, C6/36#1 | Human | Huanuco | Hermil |
| HEB 4240 | PERU95I | C6/36#1, SM1 | Human | Junin | Chachamayo |
| HEB 4245 | PERU95J | SM1, C6/36#1 | Human | Junin | Chachamayo |
| HEB 4246 | PERU95K | SM1, C6/36#1 | Human | Junin | Chachamayo |
| OBS 2243 | PERU95L | SM1, C6/36#1 | Human | Huanuco | No data |
| ARV 0548 | PERU95M | SM1, C6/36#1 | Human | San Martin | Tocache |
| OBS 6530 | PERU98A | SM1, C6/36#1 | Human | Cusco | Echarate |
| 03–5350–98 | PERU98B | C6/36#2 | Human | Cusco | Kanaiquinaba |
| OBS 6745 | PERU98C | C6/36#2 | Human | Cusco | Santa Rita-Rio Nanay |
| OBS 7904 | PERU99 | Vero1, c6/36 3 | Human | San Martin | Tarapoto |
Figure 1

Map of Peru and Brazil indicating geographic origins of YFV isolates. Note: enlargements not drawn to scale.

The 25 Brazilian isolates of this study were isolated at the Instituto Evandro Chagas in Belem, Brazil from 1954 to 2000. Seven of the isolates were obtained from human clinical cases; three were from monkeys, and 15 were mosquito isolates. The method of isolation and subsequent passage history for the virus seed stocks are provided in Table 1; the majority of isolates were prepared through one or two passages in suckling mouse brain followed by a single passage in C6/36 cells.

Reverse-transcription polymerase chain reaction (RT-PCR) and sequencing

Following transfer of the isolates from the World Arbovirus Reference Center, viruses were grown for a single additional passage in Vero cells to obtain sufficient quantities for RNA extraction. Methods for viral growth, genomic RNA extraction, and amplification of viral sequences by RT-PCR have been previously described [17]. The first set of studies involved amplification of a 670-bp fragment comprising the 3′ 108 nucleotides of the membrane (M) protein gene, and the 5′ 337 nucleotides of the envelope (E) protein-coding gene. The amplicons were generated using the genomic-sense primer (5′-CTGTCCCAATCTCAGTCC) and genomic-complementary primer (5′-AATGCTTCCTTTCCCAAAT). The second set of studies involved amplification of a 607-bp fragment comprising the 3′ 297 nucleotides of NS5 at the end of the genomic ORF, and the first 309-bp of the 3′ non-coding region. The primers used to amplify this region were genomic-sense degenerate primer ‘EMF’ (5′-TGGATGACSACKGARGAYAT) and genomic-complementary primer ‘VD8’ (5′-GGGTCTCCTCTAACCTCTAG). PCR products were screened by electrophoresis, recovered from gels using the Qiagen gel extraction kit, and sent for sequencing at the UTMB Protein Chemistry core facility. Sequences were obtained from both strands of each RT-PCR product for verification.
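The genomic-sense ‘EMF’ primer quoted above contains IUPAC ambiguity codes (S, K, R, Y), so it denotes a pool of oligonucleotides rather than a single sequence. The following minimal Python sketch enumerates that pool; the function name `expand_degenerate` and the lookup table are illustrative choices and are not part of the published protocol.

```python
from itertools import product

# Subset of the standard IUPAC nucleotide ambiguity codes (two-base codes only).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
}

def expand_degenerate(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Genomic-sense degenerate primer 'EMF' as quoted in the text.
EMF_PRIMER = "TGGATGACSACKGARGAYAT"

variants = expand_degenerate(EMF_PRIMER)
print(len(variants))  # size of the oligonucleotide pool
for seq in variants[:3]:
    print(seq)
```

With four two-fold degenerate positions, the pool holds 2 × 2 × 2 × 2 = 16 distinct 20-mers.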

Phylogenetic and statistical analyses

Initial sequence editing and alignments were performed using Vector NTI (Informax®), and manually edited using the GCG Wisconsin Package Version 10.3 (Accelrys, San Diego, CA, USA) and the DAMBE package (http://web.hku.hk/~xxia/software/software.htm). The PAUP* program [18] was used to infer maximum likelihood (ML) trees and estimate evolutionary rate parameters for each data set. The model of nucleotide substitution used was the general time-reversible (GTR) model with a different substitution rate for each codon position. For the purpose of rate comparisons, the among-site rate heterogeneity for Peruvian and Brazilian prM/E and EMF sequences was also estimated using the discretized gamma distribution; however, this parameter was not included in the ML tree construction. Support for individual clades was determined by non-parametric bootstrapping [19]. The PAML package Version 3.13 [20] was used to estimate rates of synonymous and non-synonymous substitution. ML search methods implemented in the PAML package use a model of codon evolution that accounts for the transition/transversion rate with codon usage bias modeled by the nucleotide frequencies at the three codon positions. The Peruvian and Brazilian data sets were evaluated separately under the assumption of a single ratio of non-synonymous to synonymous substitution rates for all lineages.
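To illustrate the resampling step behind non-parametric bootstrapping, the sketch below draws alignment columns with replacement to build pseudo-replicate alignments. It is a simplified illustration only: the toy sequences and names are hypothetical, and the tree inference that would be run on each replicate (performed in PAUP* in this study) is not shown.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap pseudo-replicate)."""
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    # Apply the same sampled column indices to every sequence.
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

# Hypothetical toy alignment; names and sequences are illustrative only.
toy = {
    "seqA": "ATGCCGTA",
    "seqB": "ATGTCGTA",
    "seqC": "ACGTCGAA",
}

rng = random.Random(1)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(500)]
print(len(replicates), "pseudo-replicate alignments generated")
```

Bootstrap support for a clade is then the fraction of replicate trees in which that clade is recovered.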

Results

Sequence variation between Peruvian and Brazilian YFVs

Fig. 2 shows ML phylogenies for YFV isolates based on the prM/E and EMF sequence alignments. Both prM/E and EMF gene trees revealed a consistent pattern of divergence of the Peruvian and Brazilian clades, and the monophyly of these lineages was strongly supported by bootstrap analysis. Table 2 shows the average genetic distances within and between the two clades based on nucleotide and amino acid pairwise comparisons. Fig. 3 shows the amino acid alignment of prM/E sequences, and Fig. 4 shows the nucleotide alignment of the EMF sequences. The genetic diversity within Peru and Brazil was remarkably similar based on both the prM/E and EMF sequences. The prM/E and EMF sequences contained 189 (28.2%) and 135 (22.2%) variable nucleotides, respectively. There were a total of 44 variable amino acid sites within prM/E (19.7% of 223 codons), as compared to 11 variable sites in the NS5 fragment (12.9% of 85 codons). Divergence between the Peruvian and Brazilian prM/E sequences (average 9.6%) was slightly greater than divergence of the EMF sequences (average 8.6%). Pairwise amino acid differences between Peruvian and Brazilian prM/E sequences averaged 2.8% (range of 0.9–6.6%), which was slightly lower than the corresponding NS5 divergence (average 3.6%, range of 2.3–6.9%).
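The within-clade and between-clade averages reported here are means over pairwise sequence comparisons. The sketch below shows one way such averages could be computed, assuming uncorrected p-distances (the text does not state which distance measure was used); the toy sequences are hypothetical. The helper `percent` simply reproduces the percentages quoted above from the reported counts.

```python
from itertools import combinations

def p_distance(seq1, seq2):
    """Proportion of aligned positions that differ, ignoring gap positions."""
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a != "-" and b != "-"]
    diffs = sum(1 for a, b in pairs if a != b)
    return diffs / len(pairs)

def mean_distance(group1, group2=None):
    """Mean pairwise p-distance within one group, or between two groups."""
    if group2 is None:
        pairs = list(combinations(group1, 2))
    else:
        pairs = [(a, b) for a in group1 for b in group2]
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)

def percent(count, total):
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)

# Hypothetical toy clades (not sequences from this study).
clade_x = ["ATGCCGTAAC", "ATGCCGTAAT"]
clade_y = ["ATGTCGGAAC", "ATGTCGGAAT"]

print(mean_distance(clade_x))           # within-clade mean
print(mean_distance(clade_x, clade_y))  # between-clade mean

# Counts reported in the text: variable amino acid sites per fragment.
print(percent(44, 223), percent(11, 85))
```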

Figure 2

ML trees of YFV isolates from Peru and Brazil based on (panel A) 670 nt of prM/E region; (panel B) 576 nt of EMF region. Trees are rooted with the Asibi reference strain (Ghana27). Horizontal branch lengths represent genetic divergence (numbers of nucleotide substitutions). Numbers above the branch lengths denote bootstrap support (500 replicates). Hu, human isolate; Mk, monkey isolate; Hg, Haemagogus sp. isolate; Hj, Hg. janthinomys isolate; Sa, Sabethes sp. isolate.

Table 2

Average genetic distances within and between Peruvian and Brazilian YFV based on prM/E (670 nt) and EMF (576 nt) regions

| Comparison | prM/E nucleotide, % (st. dev.) | prM/E nucleotide, range | prM/E amino acid, % (st. dev.) | prM/E amino acid, range | EMF nucleotide, % (st. dev.) | EMF nucleotide, range | EMF amino acid, % (st. dev.) | EMF amino acid, range |
|---|---|---|---|---|---|---|---|---|
| Within Brazil | 4.2 (1.7) | 0.1–7.6 | 1.6 (1.2) | 0–6.6 | 3.9 (1.9) | 0–7.4 | 2.0 (1.2) | 0–4.6 |
| Within Peru | 4.0 (1.5) | 0.1–7.3 | 1.7 (1.0) | 0–5.6 | 4.1 (0.3) | 0–10.1 | 0.75 (0.75) | 0–3.4 |
| Between Brazil and Peru | 9.6 (0.7) | 7.6–11.4 | 2.8 (0.9) | 0.9–6.6 | 8.6 (1.3) | 7.3–10.9 | 3.6 (1.1) | 2.3–6.9 |

List of amino acid substitutions within the last 85 codons of NS5
NS5 position NS5pos Asibi Brazil Peru Other 
 822    
 13 832 except brazil62b, brazil91a, which have M 
 20 839 except brazil73d, brazil94a and 94b 
 49 868 except Brazil92a and b, which have F 
 59 878 except for Peru 77bc, 95abfgm, and 98c, which have N 
 62 881    
 64 883 except Brazil78a, which has A 
 77 896 except Peru81A, which has V 

Based on NS5 coding region, 255 nt.

Figure 3

Amino acid sequence alignment for the prM/E region of Peruvian and Brazilian YFVs. Dots indicate identity with the Asibi reference sequence shown at top.

Figure 4

Alignment of partial NS5 and 3′NCR nucleotide sequences of Peruvian and Brazilian YFVs. Dots indicate identity with the Asibi reference sequence shown at top. Dashes indicate gaps in the alignment. RYF, imperfect repeat elements.

Within the prM/E fragment there was one amino acid site that distinguished all the Peruvian from the Brazilian sequences (E67) (Fig. 3). Based on homology to the West African Asibi reference strain, asparagine is most likely the ancestral residue at E67; all the Brazilian sequences with the exception of Brazil91b retained the ancestral residue, whereas the Peruvian sequences revealed an N→H substitution at this site. Within the last 85 codons of NS5, there were two amino acid sites separating the Peruvian and Brazilian clades: T→V at NS5-822 and K→R at NS5-881. The Peruvian sequences retained the ancestral residue at NS5-881 (i.e. identity with Asibi at this site), whereas the Brazilian sequences shared the ancestral residue at NS5-822. Neither of these conservative substitutions would be predicted to alter polymerase function, as they do not occur within conserved polymerase motifs [21].

Genetic diversity within Peruvian YFVs

We have previously reported that the Peruvian YFV genealogy based on prM/E sequences revealed six different subclades that corresponded very closely with the following geographic regions: Puno, Pasco, Junin, Cusco, Ayacucho, and San Martin/Huanuco. Numerous substitutions within the prM/E region delineated these subclades leading to high bootstrap values, and three of the clades shared signature amino acid substitutions (i.e. coding changes in nucleotide sequences shared by all members of the group). In this report we present the EMF sequences for 24 of the 25 Peruvian isolates. We were unable to amplify the EMF sequences of Peru95L (OBS 2243) due to poor growth characteristics of this strain in cell culture. A total of 60 nucleotide positions were variable within the Peruvian EMF sequences; 43 of these were informative sites. The informative sites were almost equally divided between the NS5 coding (24 sites) and the 3′NCR portions (19 sites) of the sequence. There were only three variable amino acid sites among the Peruvian NS5 sequences, and only one of these sites was informative. Three of the geographic subclades (Pasco, Junin, and San Martin/Huanuco) identified by the prM/E tree showed corresponding relationships on the EMF tree with significant bootstrap support. These clades were delineated by silent substitutions; however, in contrast to the prM/E data set, there are no signature amino acid sites within the short NS5 fragment to suggest similar subclades. EMF sequences of the strains from Cusco and Ayacucho were not monophyletic, and the two isolates from Puno and one isolate from Huanuco shifted positions on the EMF tree. Interestingly, there is a common amino acid substitution at NS5-878 (K→N) shared by seven of the isolates that were not previously believed to be closely related (four from San Martin, two from Ayacucho, one from Cusco). Whether the discordance of the prM/E and EMF genealogies is indicative of intra-lineage recombination events is unclear and requires further examination.
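The distinction drawn above between variable and informative sites can be made concrete with a short sketch. It assumes the standard parsimony definition: a site is informative when at least two character states each occur in at least two sequences. The toy alignment is hypothetical and the function name is illustrative.

```python
from collections import Counter

def classify_sites(sequences):
    """Count variable sites, split into parsimony-informative and uninformative."""
    informative = uninformative = 0
    for column in zip(*sequences):
        counts = Counter(base for base in column if base != "-")
        if len(counts) < 2:
            continue  # constant site, not variable
        # States that occur in at least two sequences.
        shared_states = sum(1 for n in counts.values() if n >= 2)
        if shared_states >= 2:
            informative += 1
        else:
            uninformative += 1
    return informative, uninformative

# Hypothetical toy alignment (not data from this study).
toy_alignment = [
    "ATGCA",
    "ATGTA",
    "ACGTC",
    "ACGCA",
]

pis, uis = classify_sites(toy_alignment)
print(pis, uis)  # informative and uninformative variable sites
```

In this toy alignment, columns 2 and 4 are informative, whereas column 5 varies in a single sequence only and is therefore uninformative.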

Genetic diversity within Brazilian YFVs

There were a total of 121 variable nucleotide positions among the Brazilian prM/E sequences, as compared to 131 variable positions within the corresponding EMF sequences. Sixty-eight of the nucleotide positions were parsimony-informative in the case of prM/E, whereas 54 were informative in the EMF region (27 falling within NS5). A total of 19 variable amino acid positions in the prM/E (8.5% of the 223 codons) were scattered throughout the prM, M and E proteins, as compared to eight positions in the partial NS5 fragment (9.4% of 85 codons). Although the overall genetic variability among the Brazil sequences exceeded that of Peru, it is important to note that the Brazilian isolates represented a much larger geographic area as well as a longer time frame (1954–2000). Amino acid pairwise divergence among the Brazil prM/E sequences ranged from 0 to 6.6% (mean of 1.8%) as compared to 0–4.6% (mean of 1.9%) within NS5.

Mosquito and vertebrate-derived sequences from Brazilian YFVs appeared to be distributed randomly in both the prM/E and EMF phylogenetic trees (Fig. 2). Phylogenetic trees of Brazilian prM/E and EMF sequences revealed a cluster of isolates from Para state (dating from 1954 to 1968) that differed significantly from all other Brazilian YFV strains. These strains originated from communities close to the Atlantic coast in the region surrounding Belem. The Para cluster shared two signature amino acids within prM/E (at M44 and E83) and is identical to the ancestral West African Asibi sequence at these sites [9]. Although there are no shared amino acid substitutions within NS5 that delineate the Para cluster, bootstrap support for the clade was equally high in the EMF and prM/E trees. Note that not all the isolates from Para were monophyletic; isolates collected from the same geographic region during later periods (in 1971, 1978, and 1991) fell within the lineage containing all other Brazilian strains. The long branch separating the Para cluster from the other Brazilian YFVs reflects the numerous substitutions in prM/E and EMF that separate this subclade from the other isolates.

With the exception of the Para subclade, there were no additional nodes on the Brazilian portion of the prM/E tree showing strong support. The branching pattern among the Brazilian EMF sequences, however, showed close correspondence to relationships indicated by the prM/E tree, and also revealed significant bootstrap values. Thus, two additional subclades within Brazil may be defined: (1) a group comprising a human isolate collected in 1991 from Roraima, together with five mosquito isolates from Mato Grosso do Sul and Rondonia (1992–1996), and (2) a group comprising isolates collected from 1978 to 1994 in the northern and central states of Maranhao, Minas Gerais, and Para. The phylogenetic position of the remaining isolates from Goias (three from 1973), Amapa (1984), and Tocantins (2000) was not easily resolved on either the prM/E or EMF trees. Note that the isolate from Tocantins collected in 2000 was the most recent isolate included in this study, and the first from this region to be sequenced. Interestingly, it was characterized by a very long branch on the EMF tree, as a result of an unusually large number of substitutions.

Estimation of evolutionary parameters from prM/E and EMF sequences

Table 3 presents summary statistics and evolutionary parameter estimates for the YFV isolates based on prM/E and EMF sequences. With the exception of the transition/transversion ratio (κ) and the among-site variability (Γ distribution), parameter estimates for both Brazil and Peru were remarkably similar. Transition/transversion ratios differed between Peru and Brazil for the prM/E sequences, but not for the EMF sequences. This is consistent with the observation that transition/transversion ratios closely reflect substitution rates at third codon positions and are an indication of selective constraints in coding regions. The shape parameter of the Γ distribution, α, provides a measure of the among-site rate heterogeneity; with the exception of the Peruvian EMF sequences, estimates of α were typical of highly conserved sequences, in which only a few sites exhibit variability. The Peruvian EMF sequences were exceptional insofar as the high α value suggested equal distribution of mutations among sites.
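The interpretation of α can be made explicit with a small calculation. For a gamma distribution of site rates scaled to a mean of one (shape α, scale 1/α), the variance is 1/α, so the coefficient of variation of rates is 1/√α. The sketch below applies this to the α values listed in Table 3; the function name `rate_cv` is illustrative.

```python
import math

def rate_cv(alpha):
    """Coefficient of variation of site rates under a mean-one gamma distribution."""
    return 1 / math.sqrt(alpha)

# Shape parameter estimates as listed in Table 3.
alpha_estimates = {
    "Peru prM/E": 0.93,
    "Brazil prM/E": 1.12,
    "Peru EMF": 342.00,
    "Brazil EMF": 0.65,
}

for name, alpha in alpha_estimates.items():
    print(f"{name}: alpha = {alpha:g}, CV of site rates = {rate_cv(alpha):.2f}")
```

Values of α near or below 1 correspond to a coefficient of variation of about 1 or more (a few fast sites, many nearly invariant ones), whereas α = 342 gives a coefficient of variation of roughly 0.05, i.e. nearly equal rates across sites.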

Table 3

Summary statistics and ML parameter estimates for YFV isolates from Peru and Brazil

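A: Comparisons based on the full-length prM/E and EMF fragments

| Data set | −ln L | κ | %GC | Variable sites, PIS | Variable sites, UIS | No. site patterns | Γ, α |
|---|---|---|---|---|---|---|---|
| prM-E (670 nt), Peru | 1654 | 9.98 | 0.48 | 59 | 33 | 91 | 0.93 |
| prM-E (670 nt), Brazil | 1887 | 14.05 | 0.49 | 68 | 53 | 116 | 1.12 |
| EMF (576 nt), Peru | 1166 | 5.43 | 0.49 | 43 | 61 | 54 | 342.00 |
| EMF (576 nt), Brazil | 1375 | 4.13 | 0.50 | 57 | 29 | 86 | 0.65 |

B: Comparisons between prM/E and the coding region of the EMF fragment (NS5)

| Data set | n | Relative site rate, 1st cp | Relative site rate, 2nd cp | Relative site rate, 3rd cp | ω | S |
|---|---|---|---|---|---|---|
| prM-E (223 codons), Peru | 25 | 0.65 | 0.29 | 2.06 | 0.16 | 0.58 |
| prM-E (223 codons), Brazil | 25 | 0.44 | 0.24 | 2.32 | 0.11 | 0.80 |
| NS5 (85 codons), Peru | 24 | 0.37 | 0.07 | 2.55 | 0.03 | 0.49 |
| NS5 (85 codons), Brazil | 25 | 0.62 | 0.28 | 2.08 | 0.10 | 0.54 |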

κ, transition/transversion rate ratio; PIS, parsimony informative sites; UIS, parsimony uninformative sites; Γ, α, shape parameter of the gamma distribution; n, number of sequences; relative site rates, relative substitution rates for each codon position (cp); ω (dn/ds), non-synonymous/synonymous rate ratio, averaged over sites; S, tree length, number of nucleotide substitutions along the tree per codon; [κ for Peru, based on NS5=38]; [κ for Brazil, based on NS5=26].

It is worth noting that all of the prM/E and EMF sequences exhibited a predominance of C–U substitutions. A similar richness in C–U transitions has been reported for the 3′NCR of the related flavivirus, West Nile virus [22,23]. Interestingly, the nucleotide base composition of the sequenced mitochondrial segments of many insect species has a high A+T content (as high as 75–78% for A. aegypti), and the most frequent transversions in these species are of the A–T type [24]. Differences in the observed transversion frequencies between Aedes and Culex mosquitoes, for instance, were used to infer relative genetic distances between the mosquito genera. Among the YFV sequences in this study, base frequencies did not appear to be A–U rich, and there was no noticeable trend among transversion frequencies.
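A tally of substitution types of the kind discussed here can be sketched as follows. This is a simple pairwise count on hypothetical toy RNA sequences, intended only to show how C–U differences and the transition/transversion split are identified; it is not the ML estimate of κ reported in Table 3, and the function names are illustrative.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}

def substitution_counts(seq1, seq2):
    """Tally unordered base differences (e.g. 'C-U') between two aligned RNA sequences."""
    counts = Counter()
    for a, b in zip(seq1, seq2):
        if a != b and a != "-" and b != "-":
            counts["-".join(sorted((a, b)))] += 1
    return counts

def ts_tv(counts):
    """Split a tally into transitions (purine-purine or pyrimidine-pyrimidine) and transversions."""
    transitions = transversions = 0
    for pair, n in counts.items():
        a, b = pair.split("-")
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += n
        else:
            transversions += n
    return transitions, transversions

# Hypothetical toy sequences (not data from this study).
toy_a = "AUGCCGUACU"
toy_b = "AUGUCGCACA"

counts = substitution_counts(toy_a, toy_b)
print(dict(counts))
print(ts_tv(counts))
```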

Discussion

Our previous analysis of YF in Peru provided both epidemiological and phylogenetic evidence to suggest that YFVs in the high forests of the Andes circulate within discrete enzootic foci [1]. Our ability to discern separate subclades of YFV within Peru appeared to indicate low levels of population intermixing between viruses from adjacent river basins. Given the extremely complex topography of the Peruvian Andes, as well as the numerous centers of species endemism that have been observed for taxa of flora and fauna in the region [25], we hypothesized that genetic isolation by distance could explain the observed molecular diversity of the YFVs, and that biogeographic barriers had helped to shape the evolution of the virus. The current study was undertaken to confirm evidence for geographic subtyping of the Peruvian strains by examining a distal region of the genome (i.e. the EMF fragment). In addition, we wished to address whether YFV circulation in Brazil exhibited a similar pattern of population substructure. We reasoned that the molecular signature of virus populations circulating in enzootic foci would differ markedly from that of viruses transmitted via ‘wandering epizootics’, and thus might be discernible through a comparative study of virus phylogenies.

Our data revealed that Brazilian and Peruvian YFVs are significantly divergent virus lineages that can be differentiated on the basis of molecular markers in both the structural proteins and the 3′ non-coding regions. Analysis of EMF sequences from the Peruvian isolates failed to fully support the geographic subclades previously delineated by the prM/E tree. In particular, relationships among strains from Ayacucho, Cusco, and Puno were not confirmed by the EMF tree, and further studies are necessary to understand the significance of these results.

Comparison of the Peruvian and Brazilian sequences revealed interesting differences in the spatial distribution of variants within the two regions. The Brazilian prM/E sequences revealed a branching pattern that suggested the possibility of widely dispersed epizootics; isolates collected over very large distances within Brazil appeared in some instances within the same subclade (e.g. Brazil91B and Brazil96A; Brazil78A and Brazil94A). It was also the case, however, that some closely related variants appeared to have persisted for as long as 20 years within the same locale (e.g. Brazil71 and Brazil91A). In contrast to the Peruvian gene trees, the Brazilian prM/E and EMF trees showed very close agreement with no discordance in the placement of individual isolates. Although bootstrap support for subclades varied between the trees, the genetic relationships among the strains were upheld by analysis of both genomic regions.
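Agreement between two gene trees of the kind discussed here can be quantified in a simple way, sketched below under stated assumptions. This is not the comparison performed in this study, which relied on inspection of tree topologies and bootstrap support. The sketch represents rooted trees as nested tuples, collects the set of leaves under each internal node, and reports the fraction of such clades that two trees share. The taxon labels A to E and both topologies are hypothetical.

```python
def clades(tree):
    """Return the leaf sets defined by the internal nodes of a nested-tuple tree."""
    found = set()

    def leaves(node):
        # A string is a leaf; a tuple is an internal node with child subtrees.
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def shared_fraction(tree_a, tree_b):
    """Fraction of clades (root included) that are present in both trees."""
    a, b = clades(tree_a), clades(tree_b)
    return len(a & b) / len(a | b)


# Hypothetical topologies for two genomic regions of the same five isolates.
tree_region_1 = ((("A", "B"), "C"), ("D", "E"))
tree_region_2 = ((("A", "C"), "B"), ("D", "E"))

concordant = shared_fraction(tree_region_1, tree_region_1)
discordant = shared_fraction(tree_region_1, tree_region_2)
```

A value of 1.0 corresponds to fully concordant topologies, as described for the Brazilian trees, and lower values indicate isolates placed differently in the two trees. The measure ignores branch lengths and bootstrap support, so weakly supported clades count as much as strongly supported ones.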

With the existing set of YFV sequences from Peru and Brazil, it is difficult to establish a clear and consistent explanation for the observed molecular diversity. The broad distribution of genetic variants across widely different ecological zones in Brazil, together with the lack of clear temporal clustering of strains, suggests a very complex pattern of virus transmission that could be considered consistent with the hypothesis of wandering epizootics.

Discrepancies in geographic clustering, and in the evidence for enzootic foci in Peru, may reflect the absence of sufficient phylogenetic signal in the EMF sequences, or ascertainment bias in the choice of genomic region for sequencing. Alternatively, these discrepancies could result from differences in the relative population sizes (transmission intensities) of different subclades of virus. It is important to note that none of the 50 South American YFV isolates studied to date provided evidence to suggest recombination between the Peruvian and Brazilian lineages. However, rare recombination events or even a very small amount of migration between adjacent watersheds could easily obscure the phylogenetic signal of population subdivision. Sequencing complete genomes of representative strains may be required to resolve phylogenetic relationships among these strains.

In summary, we have described the considerable genetic variability among circulating YFVs in Peru and Brazil, and found that some virus variants appear able to persist within circumscribed foci whereas other variants appear to have dispersed over thousands of kilometers. Given the potential threat to public health from re-urbanization of the disease, it will be crucial to improve understanding of the processes controlling YFV evolution. Biological and phenotypic characterization of the YFV genetic variants would also represent an important step forward towards elucidating the ecological implications of the underlying genetic variation.

Acknowledgments

We thank Drs. Robert Tesh and Pedro F.C. Vasconcelos, who provided virus isolates from collections maintained at the World Arbovirus Reference Center at the University of Texas Medical Branch in Galveston, Texas, and the Instituto Evandro Chagas in Belém, Brazil, respectively. This work was supported by NIH grant AI 10986, by a Zelda Zinn Scholarship award to JEB, and by the CDC training grant T01/CCT622892–01.

References

[1] Bryant, J.E., Wang, H., Cabezas, C., Ramirez, G., Watts, D., Russell, K., et al. (2003) Enzootic transmission of yellow fever virus in Peru. Emerg. Infect. Dis. 9, 926–933.
[2] Robertson, S.E., Hull, B.P., Tomori, O., Bele, O., LeDuc, J.W., Esteves, K. (1996) Yellow fever: a decade of reemergence. JAMA 276, 1157–1162.
[3] Monath, T.P. (1999) Facing up to re-emergence of urban yellow fever. Lancet 353, 1541.
[4] PAHO (2001) Peru Country Profile. Pan American Health Organization, Washington, DC.
[5] Monath, T.P. (1997) Epidemiology of yellow fever: current status and speculations on future trends. In: Factors in the Emergence of Arbovirus Diseases (Saluzzo, J.F., Ed.), pp. 143–165. Elsevier, Amsterdam.
[6] Wang, E., Weaver, S.C., Shope, R.E., Tesh, R.B., Watts, D.M., Barrett, A.D. (1996) Genetic variation in yellow fever virus: duplication in the 3′ noncoding region of strains from Africa. Virology 225, 274–281.
[7] Lepiniec, L., Dalgarno, L., Huong, V.T., Monath, T.P., Digoutte, J.P., Deubel, V. (1994) Geographic distribution and evolution of yellow fever viruses based on direct sequencing of genomic cDNA fragments. J. Gen. Virol. 75, 417–423.
[8] Chang, G.J., Cropp, B.C., Kinney, R.M., Trent, D.W., Gubler, D.J. (1995) Nucleotide sequence variation of the envelope protein gene identifies two distinct genotypes of yellow fever virus. J. Virol. 69, 5773–5780.
[9] Mutebi, J.P., Wang, H., Li, L., Bryant, J.E., Barrett, A.D. (2001) Phylogenetic and evolutionary relationships among yellow fever virus isolates in Africa. J. Virol. 75, 6999–7008.
[10] Degallier, N., Travassos da Rosa, A.P., Herve, J.P., Travassos da Rosa, E.S., Vasconcelos, P.F., Mangabiera de Silva, et al. (1992) A comparative study of yellow fever in Africa and South America. J. Braz. Assoc. Adv. Sci. 44, 143–151.
[11] Mondet, B., Da Rosa, A.P., Vasconcelos, P.F. (1996) The risk of urban yellow fever outbreaks in Brazil by dengue vectors Aedes aegypti and Aedes albopictus. Bull. Soc. Pathol. Exot. 89, 107–113.
[12] Digoutte, J.P. (1995) Yellow fever. In: Exotic Viral Infections (Porterfield, J.S., Ed.), pp. 67–102. Chapman and Hall, London.
[13] Chippaux, A., Deubel, V., Moreau, J.P., Reynes, J.M. (1993) Current situation of yellow fever in Latin America. Bull. Soc. Pathol. Exot. 86, 460–464.
[14] Prata, A. (2000) Yellow fever. Mem. Inst. Oswaldo Cruz 95, 183–187.
[15] Halstead, S.B. (1998) Emergence mechanisms in yellow fever and dengue. In: Emerging Infections 2 (Scheld, W.M., Craig, A.S., Hughes, J.M., Eds.), pp. 65–80. ASM Press, Washington, DC.
[16] Vasconcelos, P.F., Costa, Z.G., Travassos Da Rosa, E.S., Luna, E., Rodrigues, S.G., Barros, V.L., Dias, J.P., Monteiro, H.A., Oliva, O.F., Vasconcelos, H.B., Oliveira, R.C., Sousa, M.R., Barbosa Da Silva (2000) Epidemic of jungle yellow fever in Brazil, 2000: Implications of climatic alterations in disease spread. J. Med. Virol. 65, 598–604.
[17] Wang, E. (1995) Studies on the genomes of wild-type and vaccine strains of yellow fever virus. University of Surrey.
[18] Swofford, D.L. (1998) PAUP*. Sinauer Associates, Sunderland, MA.
[19] Felsenstein, J. (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.
[20] Yang, Z., Nielsen, R., Goldman, N., Pedersen, A.M. (2000) Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155, 431–449.
[21] Koonin, E.V. (1991) The phylogeny of RNA-dependent RNA polymerases of positive-strand RNA viruses. J. Gen. Virol. 72, 2197–2206.
[22] Ebel, G.D. (2000) Partial genetic characterization of West Nile virus strains, New York State. Emerg. Infect. Dis. 7, 650–653.
[23] Anderson, J.F., Vossbrinck, C.R., Andreadis, T.G., Iton, A., Beckwith, W.H., Mayo, D.R. (2003) A phylogenetic approach to following West Nile virus in Connecticut. Proc. Natl. Acad. Sci. 98, 12885–12889.
[24] Shouche, Y., Patole, M.S. (2000) Sequence analysis of mitochondrial 16S ribosomal RNA gene fragment from seven mosquito species. J. Biosci. 25, 361–366.
[25] Young, K., Leon, B. (1999) Peru's humid eastern montane forests: An overview of their physical settings, biological diversity, human use and settlement, and conservation needs. Technical Report No. 5, Centre for Research on the Cultural and Biological Diversity of Andean Rainforests (DIVA).