Acquisition and clearance of cervical human papillomavirus (HPV) infection were analyzed among 1425 low-income women attending a maternal and child health program in São Paulo, Brazil. Specimens collected every 4 months were tested by a polymerase chain reaction protocol (MY09/11). In all, 357 subjects were positive at least once. There were 1.3% new infections per month, with 38% cumulative positivity after 18 months. Of 177 positive subjects at enrollment, only 35% remained infected after 12 months. The monthly clearance rate was higher for nononcogenic types (12.2%; 95% confidence interval [CI], 9.6–15.4) than for oncogenic HPV infections (9.5%; 95% CI, 7.5–11.9). Median retention times were 8.1 months (95% CI, 7.8–8.3) for oncogenic types and 4.8 months (95% CI, 3.9–5.6) for nononcogenic HPV infections. The mean infection durations were 8.2 and 13.5 months for nononcogenic and oncogenic types, respectively. Although a woman's age did not affect mean duration for oncogenic types (13–14 months), nononcogenic-type infections lasted longer (10.2 months) among younger (<35 years old) than in older women (5.6 months).
The epithelial lining of the anogenital tract is the target for infection by a group of mucosotropic viruses, the human papillomaviruses (HPV). Acquisition of cervical HPV infection is the main biologic precursor of a series of events that leads to cervical cancer, as has been extensively documented by epidemiologic and experimental studies during the past 15 years [1, 2].
In cross-sectional surveys, cervical HPV infection is detected by amplified and nonamplified DNA detection techniques in 5%–40% of asymptomatic women of reproductive age [3, 4]. A few cohort studies indicate that HPV infection is mostly a transient or intermittent phenomenon; only a small proportion of those positive for a given HPV type tend to harbor the same type in subsequent specimens [5–8]. In addition, prospective epidemiologic studies show that the risk of subsequent cervical intraepithelial neoplasia seems to be proportional to the number of specimens testing positive for HPV [9, 10], which suggests that only persistent infections may trigger carcinogenic development. Given that there is now considerable interest in the possibility of using HPV testing as a potential cervical cancer screening tool [11–13], it is imperative that issues related to viral persistence be addressed by epidemiologic studies. Research on the epidemiology of viral persistence and of its determinants will help in the future formulation of algorithms and policies for inclusion of HPV testing in cervical cancer prevention.
In 1993, we began the Ludwig-McGill Cohort Study, a large longitudinal investigation of the natural history of HPV infection and cervical neoplasia in a population of low-income women in São Paulo, one of the highest risk areas worldwide for cervical cancer. It focuses on persistent infection with oncogenic HPV types as the precursor event leading to cervical neoplasia and attempts to understand attributes of the natural history of viral infection that may be helpful for designing primary and secondary strategies for preventing cervical cancer. Here we present the descriptive epidemiologic results concerning the dynamics of acquisition and loss of HPV infection during the first four visits for the first 1425 women enrolled in the cohort.
Subjects and Methods
Study design and population
The Ludwig-McGill Cohort Study is an ongoing longitudinal investigation involving repeated measurements of lifestyle, nutritional, and behavioral risk factors as determinants of the acquisition of HPV infection and precursor lesions of cervical cancer. The design and methods of the investigation have been described elsewhere. Study participants attend a comprehensive maternal and child health program (Maternidade Escola Vila Nova Cachoeirinha) that serves low-income families in the city of São Paulo. Women were eligible to participate if they met the following criteria: (1) age between 18 and 60 years; (2) permanent residents of São Paulo (city); (3) not pregnant and not intending to become pregnant in the following 12 months; (4) having an intact uterus and no current referral for hysterectomy; (5) no use of vaginal medication in the previous 2 days; and (6) no treatment for or referral for treatment of cervical disease in the previous 6 months. In addition, women were considered ineligible if they were not interested in complying with all scheduled returns for at least the subsequent 2 years.
The cohort investigation began in November 1993 and accrued subjects until March 1997. Subjects will be followed for up to 5 years: every 4 months in the first year and twice yearly thereafter. To encourage compliance with follow-up returns, women were given meal tickets that increased by $5 (US) in cash value for each return visit subsequent to enrollment up to a maximum of $20 at the fourth visit. At each visit, subjects complete an interviewer-administered structured questionnaire and have cervical specimens taken for HPV DNA testing and Pap cytology. The present analysis includes 1425 women whose cervical specimens from the enrollment visit and for up to three follow-up returns have been tested for HPV positivity.
An Accelon biosampler (Medscand, Hollywood, FL) was used to collect a sample of ectocervical and endocervical cells in a tube containing Tris-EDTA buffer, pH 7.4. DNA samples were purified by spin column chromatography. Cervical specimens were tested for the presence of HPV DNA by the MY09/11 polymerase chain reaction protocol [6, 15]. Typing of the amplified products was performed by hybridization with individual oligonucleotide probes specific for 27 genital HPV types. Amplified products that hybridized with the generic probe but with none of the type-specific probes were further tested by restriction fragment length polymorphism (RFLP) analysis, which extended the range of identifiable HPV types to >40 genital types. To verify the specificity of the hybridizations, we included more than 30 type-specific positive controls in all membranes. To check the integrity of the host DNA material extracted from the specimens, assays also included an additional set of primers (GH20 and PC04) to amplify a 268-bp region of the β-globin gene. All HPV assays were done blindly on coded specimens with no identification to link different specimens from the same woman. Appropriate precautions were taken to reduce the possibility of specimen contamination.
In some analyses, HPV types were grouped by oncogenicity. We used an expanded classification based on Bauer et al., which grouped HPV types 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, and 68 as oncogenic (high risk) and considered all other HPV types as nononcogenic (low risk). Unknown types were included in the latter group.
We used Pearson's χ2 test to compare frequencies of selected characteristics between women with and without HPV test results during follow-up. Presumed incidence rates of infection with individual HPV types and their associated 95% confidence intervals (CIs) were calculated on the basis of the numbers of cases in which a given type was detected among women who were free of that type at the enrollment visit. Follow-up time was considered to be until the first visit in which the stated type was detected or until the last visit if the subject remained negative for that type throughout the observation period. We used the Kaplan-Meier method to calculate both the cumulative probability of HPV detection among women who were HPV negative at enrollment and the proportion of women remaining HPV positive at different times after enrollment. Missing values during follow-up because of lack of a specimen or a β-globin-negative result were treated as censored observations unless there was a subsequent informative specimen at a later visit, in which case the result for the latter was used. The statistical assessment of linear dose-response relations between ordinal variables and outcome was based on the χ2 test for trend.
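The Kaplan-Meier calculation described above can be illustrated with a minimal Python sketch. It is not the software used in the study; the function name `kaplan_meier` and the toy follow-up data are hypothetical, and tied event and censoring times are handled with the usual convention that women censored at a given time are still counted as at risk at that time.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, probability of remaining event-free) at each event time.

    times  -- follow-up in months for each woman
    events -- True if the outcome (e.g., first HPV detection) was observed
              at that time, False if the observation was censored
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = times[order[i]]
        n_events = 0
        n_leaving = 0
        # Group all women whose follow-up ends at the same time t.
        while i < len(order) and times[order[i]] == t:
            n_events += int(events[order[i]])
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical data: 8 women negative at enrollment, visits at 4, 8, 12 months.
toy_times = [4, 4, 8, 8, 12, 12, 12, 12]
toy_events = [True, False, True, False, False, False, True, False]
toy_curve = kaplan_meier(toy_times, toy_events)
# Cumulative probability of detection at each event time is 1 - survival.
toy_cumulative = [(t, 1.0 - s) for t, s in toy_curve]
```

The same function applies to loss of positivity among women who were positive at enrollment, with "event" redefined as the first visit at which the original type or type grouping is no longer detected.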
We assessed the tendency for infection with certain HPV type combinations to occur more frequently than others by computing expected frequencies of joint cumulative positivity for all pairwise combinations under the assumption of no association between types. Expected values were compared against observed frequencies to screen for types occurring in association. We used Fisher's exact test to gauge statistically significant departures from the expected values among all type pairs. This was done solely to flag possible patterns in type associations and not as a means of formally assessing the significance of the individual associations. CIs for observed to expected (O/E) ratios were based on the Poisson distribution.
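As a rough illustration of the O/E computation, the following Python sketch derives the expected joint frequency as the product of the two marginal cumulative counts divided by the cohort size, and attaches an exact Poisson interval to the observed count. Both choices are assumptions about how "no association" and "based on the Poisson distribution" were implemented; the function names are hypothetical. The marginal counts (60 for HPV-16, 43 for HPV-53, 1425 women) come from the text, whereas the observed joint count of 4 is an assumed value used only for illustration.

```python
import math


def poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k) for X ~ Poisson(mu)."""
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def _mu_for_cdf(k: int, target: float) -> float:
    """Find mu such that P(X <= k | mu) equals target, by bisection.

    The Poisson CDF at fixed k decreases as mu increases.
    """
    lo, hi = 0.0, k + 10.0 * math.sqrt(k + 1.0) + 20.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if poisson_cdf(k, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_poisson_ci(k: int, alpha: float = 0.05):
    """Exact (1 - alpha) confidence limits for a Poisson count k."""
    lower = 0.0 if k == 0 else _mu_for_cdf(k - 1, 1.0 - alpha / 2.0)
    upper = _mu_for_cdf(k, alpha / 2.0)
    return lower, upper


def oe_ratio(observed: int, n_a: int, n_b: int, n_total: int, alpha: float = 0.05):
    """Return (O/E, lower limit, upper limit) for a pair of HPV types."""
    expected = n_a * n_b / n_total  # assumes independence of the two types
    lower, upper = exact_poisson_ci(observed, alpha)
    return observed / expected, lower / expected, upper / expected


# Assumed observed joint count of 4; marginals taken from the text.
ratio, ci_low, ci_high = oe_ratio(4, 60, 43, 1425)
```

With these inputs the sketch gives an O/E ratio of about 2.2 with limits of about 0.6 and 5.7.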
Characteristics of the cohort
The 1425 participants in this analysis represented mostly the subcohort of women admitted early in the study (the cohort enrolled 2528 women with a 70% response rate). In total, we tested 4873 cervical specimens from enrollment and follow-up visits (mean, 3.4 specimens/subject). The follow-up experience accumulated for this subcohort was 14,341 woman-months (mean, 10.1 months/participant; range, 3.6–36.7). Specimens from 1067 women (75% of the original subcohort) were available at the fourth visit. The median age was 33 years (mean, 33.3; first and third quartiles, 26 and 39 years, respectively). Only 12% of the subjects were ⩾45 years old. Most participants (66%) were white, and 83% had at most an elementary school education. Slightly more than 50% of the women reported ever smoking cigarettes; 35% were current smokers. Only 1% of the women had never been pregnant; 85% reported at least two pregnancies. Only 36% of the women reported having had >2 lifetime sex partners.
There were 172 subjects for whom HPV results were not available at the second visit (12% of the original subcohort). These women were mostly comparable to the rest of the subcohort with respect to race (P = .536), age (P = .121), education (P = .204), age at first intercourse (P = .180), parity (P = .740), and duration of oral contraceptive pill use (P = .154). On the other hand, those who were not followed up tended to have slightly fewer lifetime sex partners (mean, 2.9) than those who had HPV results during follow-up (mean, 3.6); the difference was of marginal statistical significance (P = .075).
Cumulative HPV positivity
Table 1 shows the prevalence of HPV infection at enrollment and the cumulative HPV positivity for each HPV type and for unknown HPVs. Of the genital HPV types that can be specifically ascertained by typing, only HPV-64 was not detected. The overall prevalence of infection with any HPV type at enrollment was 13.8% (196 positive cases). The point prevalence of HPV infection at each subsequent follow-up visit varied little: 14.9%, 14.7%, and 12.3% for visits 2–4, respectively. The prevalence of multiple-type infections (2–4 types) was 2.2% at enrollment and ranged between 1.2% and 1.7% in subsequent visits (data not shown). A total of 357 subjects were positive at least once for HPV, either at the enrollment visit or subsequently, which yielded a crude (nonactuarial) cumulative rate of 25.1%. HPV-16 was the most common type cumulatively (n = 60, 4.2%), followed by HPV-53 (n = 43, 3.0%), unknown types (n = 38, 2.7%), HPV-58 (n = 32, 2.2%), and HPV-31 (n = 28, 2.0%; table 1).
Table 1 also classifies HPV types on the basis of the tendency for being persistently detected during follow-up visits. The right-hand column in table 1 shows the ratio between the frequency of positivity in two or more visits and that in a single visit as indicative of the tendency for a given type to persist. The ratio for most HPV types was <1.0. The highest ratios were those for HPV-33, -40, and -61, although they were based on small individual frequencies. The ratio for HPV-16 was 0.9, indicating that about as many subjects were positive in two or more visits as were positive only once.
Presumed incidence of HPV infection
Acquisition of infection by any HPV type was relatively common: 1.3%/month (95% CI, 1.1–1.6). Table 2 shows the presumed incidence rate of infection for the most common HPV types found in the study. The number of women at risk for each HPV type represents those who were negative for that type at enrollment. For instance, a woman who was positive for HPV-16 at entry would not be counted in the analysis of HPV-16 incidence but would contribute woman-months of follow-up for all other HPVs.
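The at-risk rule described above can be sketched in Python as follows. The data layout (a list of visits per woman, each visit a pair of months since enrollment and the set of types detected), the function name `presumed_incidence`, and the three-woman example are all hypothetical and serve only to show how exclusions and woman-months are counted for one type at a time.

```python
from typing import Dict, List, Set, Tuple

Visit = Tuple[float, Set[str]]  # (months since enrollment, HPV types detected)


def presumed_incidence(cohort: Dict[str, List[Visit]], hpv_type: str) -> Tuple[int, float, float]:
    """Return (new detections, woman-months at risk, rate per 100 woman-months)."""
    n_events = 0
    woman_months = 0.0
    for visits in cohort.values():
        visits = sorted(visits, key=lambda v: v[0])
        if hpv_type in visits[0][1]:
            continue  # positive for this type at enrollment: not at risk
        # Follow-up ends at the first visit where the type is detected,
        # otherwise at the last visit.
        end = visits[-1][0]
        for month, types in visits[1:]:
            if hpv_type in types:
                n_events += 1
                end = month
                break
        woman_months += end - visits[0][0]
    rate = 100.0 * n_events / woman_months if woman_months else float("nan")
    return n_events, woman_months, rate


# Hypothetical cohort of three women.
example_cohort: Dict[str, List[Visit]] = {
    "A": [(0, {"16"}), (4, {"16"}), (8, set())],   # HPV-16 positive at entry
    "B": [(0, set()), (4, {"53"}), (9, {"53"})],   # acquires HPV-53 at 4 months
    "C": [(0, set()), (5, set()), (12, {"16"})],   # acquires HPV-16 at 12 months
}

hpv16 = presumed_incidence(example_cohort, "16")
hpv53 = presumed_incidence(example_cohort, "53")
```

In this toy cohort, woman A is excluded from the HPV-16 analysis but still contributes 8 woman-months to the HPV-53 analysis.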
When all 40 HPV types identified thus far in the cohort (plus unknown HPVs) were grouped by oncogenicity, the combined prevalence at entry for nononcogenic types was somewhat lower than for oncogenic types (7.0% vs. 8.4%, data not shown). However, the combined incidence for the latter (0.68%/month) was significantly lower than for nononcogenic types (0.91%/month), which indicates the tendency for oncogenic types to be more persistent, leading to longer duration of infection and consequently to a higher point prevalence. HPV-53 was the most frequent incident type (0.16%/month), followed by HPV-16 (0.14%/month). Unknown HPVs of ⩾1 types accounted for the highest number of new infections (0.19%/month).
Age-specific prevalence and incidence
Table 3 separately presents the initial prevalence and presumed incidence rates of infection during follow-up for the most common HPV types for women aged <35 and for those aged ⩾35 years. The prevalence and the incidence rates were generally higher for younger women for most of the common types, except for HPV-52 (prevalence) and HPV-70 (incidence). A positive relation with age was seen for both prevalence and incidence of unknown types. By considering grouped types, the negative relation with age was maintained for prevalent and incident oncogenic HPVs, whereas the incidence rates of nononcogenic types were comparable for the two age groups despite a higher prevalence for younger women.
Cumulative HPV positivity
Figure 1 shows the cumulative probability of HPV detection over time for infection of any type, for nononcogenic and oncogenic types, and for HPV-16 in isolation. Among women who were negative at enrollment, the cumulative rates of infection with any types were 7.3% (95% CI, 5.7–8.9), 13.6% (95% CI, 11.5–15.8), and 23.6% (95% CI, 17.2–30.0) at 6, 12, and 18 months, respectively. The cumulative probability of acquiring infection with a nononcogenic HPV type at 18 months (22.4%; 95% CI, 14.7–30.0) was twice that for an oncogenic type (11.2%; 95% CI, 6.8–15.7). The cumulative proportion testing positive for HPV-16 among women who were HPV-16 negative at enrollment was 1.5% at 12 months (95% CI, 0.7–2.2) and 4.0% (95% CI, 0.1–7.8) at 18 months.
Frequency of testing and presumed incidence
We tested the hypothesis that the frequency of return visits was influencing the incidence of HPV infection in the cohort. Although return visits had been prescheduled at 4, 8, and 12 months, a substantial proportion of women delayed their appointments. Once a woman returned, our nurses scheduled the next appointment for 4 months later, which in the end produced sufficient variation in time between visits to allow an analysis of HPV positivity as a function of the number of visits but with adjustment for the person-time experience of follow-up. Table 4 shows that the frequency of return visits did not influence either the presumed overall HPV incidence rate or the cumulative rate at 12 months (nonsignificant linear trend in both cases). This finding suggests that the detection rates for presumably newly acquired infections are not a function of the intensity of viral surveillance during follow-up insofar as the conditions of our study design are concerned.
Loss of initial positivity
HPV infection was mostly a transient phenomenon in the cohort. Of the 177 subjects who were positive for any type at enrollment and who had one or more follow-up visits with evaluable specimens, 61% (95% CI, 54–69) retained their original positivity at 6 months, but only 35% (95% CI, 27–42) did so after 12 months. Figure 2 shows the proportions of women remaining positive for specified type combinations and for HPV-16 as a function of time since enrollment. Loss of positivity by group was considered to have occurred at the first follow-up visit at which the subject was no longer positive for the type grouping detected at enrollment, even if the cervical specimen was positive for a viral type of a different grouping. The rate of loss of positivity was higher for nononcogenic types (12.2%/month; 95% CI, 9.6–15.4) than for oncogenic HPV infections (9.5%/month; 95% CI, 7.5–11.9). This resulted in median positivity retention times since enrollment (i.e., the time interval it took for 50% of the cases to have lost the original positivity) that were substantially greater for oncogenic types (8.1 months; 95% CI, 7.8–8.3) than for nononcogenic types (4.8 months; 95% CI, 3.9–5.6). The mean duration of infections (measured actuarially) detected at enrollment was 7.0 months (95% CI, 6.2–7.8) for nononcogenic types and 8.9 months (95% CI, 7.6–10.2) for oncogenic types. Persistence for HPV-16 infections was comparable to that of other oncogenic types: 8.9%/month rate of loss (95% CI, 5.8–13.1), with a median positivity retention of 8.4 months (95% CI, 6.8–10.0).
Table 5 shows all pairwise frequency combinations of HPV types as detected cumulatively since enrollment. Each observed frequency of joint cumulative positivity is contrasted with the expected value under the assumption that no association existed between the 2 types being compared. Only the 13 most common types (those with ⩾1% cumulative positivity) were included in the analysis. Although unknown HPVs were a frequent finding, they were not included because they are detected by exclusion, which invalidates the analysis of joint frequencies with other types. In all, with 78 possible pairwise combinations, one would expect that for a few type pairs, the observed frequencies would be deemed significantly different from the expected ones by chance alone. Assuming a completely random distribution of the 13 types shown in table 5, one would expect to see about 4 associations exceeding the 5% significance level and only 1 exceeding the 1% level. However, there were 13 pairs at the 5% level, 4 of which had differences above the 1% level. In all instances, the observed value was greater than the expected. Five of these joint excess occurrences involved HPV-53, and, in another 4, HPV-16 was involved; these are the 2 most common types. These 2 types did not seem to be strongly associated (O/E ratio, 2.2; 95% CI, 0.6–5.7). Of interest, the highest frequency of joint positivity involving HPV-16 was with MM8 (O/E ratio, 6.3; 95% CI, 2.0–14.6). HPV-6/11 was only remarkably frequent in association with HPV-53 (O/E ratio, 5.0; 95% CI, 1.4–12.8).
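The chance expectation quoted above follows from simple arithmetic, sketched here in Python. The binomial tail at the end is an added illustration that treats the 78 pairwise tests as independent, which is an assumption (the pairs share types and so are not strictly independent); the variable and function names are hypothetical.

```python
import math


def binomial_upper_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_types = 13
n_pairs = math.comb(n_types, 2)      # 78 pairwise combinations
expected_5pct = n_pairs * 0.05       # about 3.9 pairs expected at the 5% level
expected_1pct = n_pairs * 0.01       # about 0.8 pairs expected at the 1% level

# Probability of 13 or more nominally significant pairs if all tests were
# independent and every null hypothesis were true (assumption, see above).
tail_13_or_more = binomial_upper_tail(n_pairs, 13, 0.05)
```

Under these simplifying assumptions the tail probability is small, which is consistent with the observation that joint occurrences exceeded chance expectation overall.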
Before we address the implications of this study's findings, it is important that we consider its limitations. First, cohort participants were not selected from a probabilistic sampling base from the general population of women at risk for cervical cancer. For logistic and practical reasons, we chose a clinic-based design that selected women who were already clients of a maternal and child health delivery system maintained by the municipality of São Paulo for low-income residents in the northern part of the city. Cohort members were, therefore, not representative of the area's general female population in the eligible age range for the study (18–60 years), who presumably may not have had comparable access to a public health maintenance program. One indication that this may have been so is that only 5% of our subjects reported never having had a Pap smear. On the other hand, our clinic-based sample was more appropriate for a study of the natural history of HPV infection and cervical cancer precursors because of the likely higher compliance with the requirement for multiple follow-up returns. The study was conducted in a country with a high cervical cancer incidence by international standards.
In a study of the natural history of cervical HPV infection, one would expect that estimates of incidence of infections by type and accrued positivity over time would be a function of the intensity of surveillance built into the study design. Subjects were told to return every 4 months during the first year and then every 6 months thereafter for cervical specimen collection. We expected that this relatively intensive regimen would yield artificially high rates of detection and cumulative positivity. Of interest, we found that the number of return visits during follow-up did not influence either the rate of detection of new infections or the cumulative positivity. This suggests that the dynamics of acquisition of infection and its duration are not prone to a surveillance bias related to determinants of compliance to scheduled visits, at least within the constraints of our cohort study design.
Another important caveat in a study of the natural history of HPV infection of women past their onset of sexual activity is that one cannot measure the true incidence of new infections among those negative at enrollment: it is impossible to distinguish a new infection from a recurrence or reactivation of a previously latent infection within the limitations of the molecular sensitivity of the HPV detection method. For this reason, we can only claim to have measured the “presumed” incidence rate of a new type for those negative at enrollment. Had we made our definition more stringent, by including only women with 2 negative HPV results at the enrollment and first follow-up visits in the computation of presumed incidence rates, the results might have been different.
Our cohort included relatively older women compared with some previous studies of the dynamics of HPV infection [7, 8, 19, 20]. Proportionally fewer women in our study had had multiple sex partners compared with other cohorts or cross-sectional surveys [15, 17, 21, 22]. Nevertheless, HPV infection was relatively frequent in the cohort: 38% of the women had ⩾1 HPV-positive specimen after 18 months. HPV-16 was the most common type (19% of all positive cases at enrollment), not unlike most previous studies. HPV-53 was the second most commonly detected type: 3% cumulative positivity (10% of all positive cases at enrollment), exceeding HPV-6/11, which is typically the second most common in North American studies and which was the second most common in a previous study we conducted in northeast Brazil. Infection with HPV-58 was the third most frequent occurrence in the cohort (2.2% cumulative positivity). This type was highly prevalent in a comparable low-income, mixed ethnicity female population in Washington, DC. Unknown HPVs were a relatively common finding, but their occurrence is underestimated: multiple infections with known and unknown types are impossible to distinguish in practice because unknown types are detected by exclusion (i.e., when the generic probe is positive in the absence of hybridization with any of the specific probes for individual types or of any discernible RFLP pattern). If both the generic and a specific probe were positive (the majority of the HPV-positive specimens), then the result was counted toward that specific type.
The relation between incidence and prevalence is mediated by the average duration of HPV infection, an indicator of persistence, which is a key prognostic variable for risk of high-grade cervical lesions [10, 19]. Using actuarial methods to analyze persistence of infections present at enrollment, we found that positivity for oncogenic types was longer lasting than for nononcogenic types. It took almost twice as long for 50% of the oncogenic-type infections to clear (8.1 months) than for the nononcogenic types (4.8 months).
Calculating mean infection retention times actuarially does not provide a true measure of the average duration of infections because it is impossible to know how long a woman has been infected by the time she is found positive at enrollment. The resulting mean retention time is likely to underestimate the true mean duration of infection, with the extent of the bias being a function of how persistent a given type may be. One way to assess this discrepancy is by use of an alternative and unbiased measure of mean duration of infection that is based on the relation between prevalence and incidence estimates.
Assuming constant incidence and using the simple formula Duration = Prevalence/[Incidence × (1 − Prevalence)], we obtain 8.2 and 13.5 months as the mean durations for infections with nononcogenic and oncogenic types, respectively. Although both of these estimates are greater than the equivalent actuarial ones, the magnitude of the difference is more appreciable for oncogenic types (13.5 vs. 8.9 months) than for nononcogenic types (8.2 vs. 7.0 months), which underscores the extent of the bias as a function of persistence.
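The formula can be checked numerically with a short Python sketch. The inputs are the rounded prevalence and incidence figures reported earlier (7.0% and 0.91%/month for nononcogenic types; 8.4% and 0.68%/month for oncogenic types); the function name `mean_duration` is hypothetical. Because the inputs are rounded, the outputs can differ from the published values in the first decimal place.

```python
def mean_duration(prevalence: float, incidence_per_month: float) -> float:
    """Mean duration in months from point prevalence and monthly incidence.

    Implements Duration = Prevalence / [Incidence x (1 - Prevalence)],
    which assumes a steady state with constant incidence.
    """
    return prevalence / (incidence_per_month * (1.0 - prevalence))


# Rounded inputs taken from the text (proportions, not percentages).
nononcogenic_months = mean_duration(0.070, 0.0091)  # roughly 8.3 months
oncogenic_months = mean_duration(0.084, 0.0068)     # roughly 13.5 months
```

The oncogenic estimate agrees with the 13.5 months reported, and the nononcogenic estimate is within about 0.1 month of the reported 8.2 months, a gap presumably due to rounding of the inputs.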
Age seemed to mediate the relation between prevalence and incidence when considering grouped types (i.e., oncogenic and nononcogenic HPVs). An inverse relation with age was seen with both prevalent and incident oncogenic HPVs, whereas the incidence of nononcogenic types did not vary by age, despite the higher prevalence among younger women. This finding suggests that age may not be a factor in determining persistence of oncogenic types, but younger women (<35 years) may be more likely to have longer-lasting infections with nononcogenic types, which would result in a higher prevalence than for older women despite a comparable incidence. In fact, by using the above formula separately for the 2 age groups (<35 and ⩾35 years), we observe that oncogenic-type infections have comparable mean durations by age (13.4 months for women aged <35 years and 14.2 months for those aged ⩾35 years), whereas younger women have nononcogenic-type infections that last about twice as long (10.2 months) as those in older women (5.6 months). This interaction effect between HPV type and age on persistence needs to be corroborated by further studies before any firm conclusions can be drawn.
We investigated whether the risk of acquiring multiple infections by specific HPV types was random in the cohort. We anticipated that joint infections would be more common than is expected by chance because of the shared risk determinants among types (e.g., sexual activity, age). Our goal was to identify distribution clusters that could reveal a tendency for certain types to occur more frequently as joint infections, thus providing clues as to the existence of separate transmission patterns. Although our search may have been affected by low statistical power because of the relative rarity of individual-type infections, we found 13 pairs of types with significantly higher than expected joint occurrence. HPV-16 and HPV-53 were frequently found in association with other types but were not remarkably common with each other. Some types were more frequently found in isolation (e.g., HPV-61 and HPV-70) with a tendency for lower than expected joint occurrences. We caution against drawing any inferences concerning possible commonality of transmission factors, however, because these distribution inequalities could be due to chance or to low statistical power. On the other hand, it is clear that HPV types tend to occur as joint infections far more frequently than one would expect by chance. As our cohort increases in size and follow-up is extended, such patterns could reveal important clues about determinants of transmission for individual HPV types.
Understanding the epidemiology of cervical HPV infection is an important first step toward the development of strategies for preventing genital HPV disease and, ultimately, cervical cancer. Longitudinal studies, such as the present one, are ongoing in different geographic areas and may provide critical information to assist in the implementation of improved screening programs and in the delivery of future immunization programs aiming at preventing HPV infection and cervical neoplasia.
We are indebted to Maria L. Baggio and Lenice Galan for patient management and specimen collection, to Antonio L. Ruiz and Marcella P. Ribeiro for laboratory assistance, and to Silvaneide Ferreira for computer data entry.