Predicting global variation in infectious disease severity

A hundred years ago, infectious diseases disappeared over a few decades as major causes of mortality in several Northern European countries. Heart attack, stroke and cancer then became the main causes of death. Why and how this occurred remains unknown, but it may have been a natural consequence of demographic change.


the virulence of human-to-human transmitted diseases.

INTRODUCTION

Infectious diseases remain one of the most important causes of morbidity and mortality in many parts of the world [1,2]. In addition to disease-specific and age-adjusted mortality and morbidity rates, epidemiological measures of disease 'severity' include the case-fatality ratio (CFR = mortality rate/incidence rate) and time-to-death (also equal to survival rate, SR = mortality rate/prevalence rate) [3]. It is the variation in these measures that allows international comparisons of the impact of infectious diseases.
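The two severity measures defined above are simple ratios of rates. A minimal sketch in Python follows; the function names and the example rates are hypothetical and serve only to illustrate the definitions.

```python
def ratio_of_rates(mortality_rate: float, denominator_rate: float) -> float:
    """Return mortality_rate / denominator_rate (both per the same population unit)."""
    if denominator_rate <= 0:
        raise ValueError("denominator rate must be positive")
    return mortality_rate / denominator_rate


def case_fatality_ratio(mortality_rate: float, incidence_rate: float) -> float:
    """CFR = mortality rate / incidence rate, as defined in the text."""
    return ratio_of_rates(mortality_rate, incidence_rate)


def survival_rate_measure(mortality_rate: float, prevalence_rate: float) -> float:
    """SR = mortality rate / prevalence rate, as defined in the text."""
    return ratio_of_rates(mortality_rate, prevalence_rate)


# Hypothetical rates per 100 000 inhabitants per year (illustration only).
cfr = case_fatality_ratio(mortality_rate=2.0, incidence_rate=400.0)
sr = survival_rate_measure(mortality_rate=2.0, prevalence_rate=50.0)
print(f"CFR = {cfr:.4f}, SR = {sr:.4f}")
```

Because both measures share the mortality rate as numerator, any error in the denominator (incidence or prevalence) carries over directly into the ratio.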

Scholars studying epidemics at the beginning of the 20th century proposed that CFR and SR followed specific patterns, where CFRs were typically markedly higher in the early phase of epidemics for a number of diseases [4], and that the epidemics lost momentum as CFR declined. Based on this observation they hypothesized that serial passage was the primary cause of epidemics coming to an end. This view was contested by others, who argued that the observation reflected the interaction between susceptible, infected and immune segments of the population [5]. More recently, evidence has emerged that leads us to reconsider the possible importance of the serial passage hypothesis. This includes CFR variation for measles over the 20th century [6] and the vaccine-driven eradication of smallpox [7], which both suggest that CFR declined prior to the 'disappearance' of a disease. We therefore restate the century-old hypothesis, i.e. that a substantial proportion of the variation in CFR could be explained by serial passage and its effect on virulence [5].

Here, we hypothesize first that serial passage through children selects for pathogen virulence. Second, the probability of achieving continued serial passage through children is likely to decline during demographic transition because of declining birth rates [8]. A reduction in death from infectious diseases would be a natural consequence [9][10][11][12][13].

The aim of the present study was to examine whether country-specific CFRs for a number of infectious human diseases were predictable from vital statistics. This was examined under the assumption that patterns of CFR are mainly driven by bottom-up processes, i.e. that birth rates can, for some infectious diseases, be used to predict disease-specific CFR variation.


METHODS

We explored this hypothesis in a two-stage process. First, we used contemporary data to explore the influence of population attributes such as age, nutritional status, birth rates and vaccination status. We then tested the predictions from this analysis against historical data from Denmark for mumps, malaria and tuberculosis.


Contemporary data records on CFR and SR

The first step required that records of CFR could be gathered over a wide range of birth rates. The records collected and managed by the WHO, which cover most nations in the world and therefore also populations at widely different stages of demographic development, appeared to be the logical choice as a source of information. We explored the data on infectious diseases in the Global Health Observatory of the World Health Organization [14] and the Data Presentation System connected to the WHO Mortality Database [15], and also performed our own extracts from the WHO Mortality Database [16] when the data were unavailable from WHO [15,16] and GHO [14]. Codes (ICD-10) for disease identification and database extractions were retrieved from WHO [17]. Finally, information on vaccination coverage was retrieved from WHO [18].

Choosing these WHO databases as a source of information carries certain drawbacks. For example, the extracted data cannot be assumed to be very accurate; rather, an assessment of mortality statistics by Mathers et al. [19] showed that there were only 23 countries with data that were more than 90% complete, i.e. where poorly defined causes of death accounted for <10% of total causes of death. Adding to this, misclassification of causes of death appears frequent in countries such as Sri Lanka and Mexico [20,21], and can also be quite considerable in developed countries [22]. It should also be accepted that records of morbidity are even more uncertain, because poor access to medical services negatively affects recording of relatively benign infections. Hence the calculated CFRs are associated with considerable uncertainty.

We decided to include only diseases that provided a minimum of 30 records of observations and screened the available databases until we found a set of diseases that included at least two diseases that we believed could be modeled (i.e. their epidemiology changed markedly during the early 20th century epidemiological transition) and two diseases that we did not believe could be modeled (they have a less clear or no particular association with the epidemiological transition). As a result of this procedure we here evaluate CFR for:

Mumps (caused by the mumps virus), which like measles is caused by a paramyxovirus that is only directly transmitted through contact between humans [23] due to its very limited survival in the environment (<1 h, [24]). It would thus belong to the viruses that before the epidemiological transition were important for human mortality and which later were considered a minor threat to human health. While being preventable by vaccination, mumps outbreaks remain frequent in developed countries [23].

Malaria (caused by protozoan Plasmodium sp.) is a vector-borne disease transmitted between humans by mosquitoes belonging to the genus Anopheles [25]. While currently considered a tropical disease, it was also widespread in northern Europe in the 18th and 19th centuries [26][27][28][29], where it disappeared during the early 20th century. No vaccination is available, and mosquito control remains an important tool to limit the number of malaria cases in endemic regions [25].

Tuberculosis (caused by the bacterium Mycobacterium tuberculosis) caused high mortality in Europe during the 17th to 19th centuries [30]. It differs fundamentally from mumps and malaria, however, both by remaining an important disease in most countries and by being carried by a great majority of healthy carriers [31], of whom only ~10% will develop active disease as young adults [32]. More than any other disease, tuberculosis has been associated with poverty and a poor standard of living [33,34]. Also, M. tuberculosis can survive for more than 100 days in the environment [24], which means that its survival is not directly tied to the survival of the host. It is noteworthy that a number of socioeconomic factors are important for tuberculosis risk, e.g. the country of birth will remain a significant risk factor for more than 20 years after immigration to a low-risk country [35], which suggests that early life conditions play a role in the epidemiology of tuberculosis.

Leptospirosis (caused by pathogenic bacteria belonging to the genus Leptospira). The bacteria have their reservoir in a wide range of mammals and are not transmitted among humans. The infection is mostly mild, but certain types (serovars), e.g. those carried by rats, are more likely to cause severe disease [36].

The biology, incompleteness of the records and use of multiple recording systems raised a number of issues, which had to be addressed prior to statistical analysis. These included: (i) Infectious diseases often occur in epidemics, which in given years may occur in one country but not in others. This means that information covering multiple years is required to establish comparable mean incidence rates. (ii) Many countries will in some years not report any cases, while they do so in other years. When calculating the incidence rate, it must be decided whether a missing report is treated as zero or as absent. The choice will affect the calculation when a mean incidence is drawn over several years. (iii) The databases provide case numbers for whole countries, but it is not always possible to get information on the size of the population that is exposed. If we simply assume that the entire population is exposed when it is not, then a biased measure of incidence rate is provided. This obstacle is relevant both when a parasite is only found in parts of the country and when parts of the population are vaccinated. (iv) When using data across recording systems (e.g. both morbidity and mortality) within nations, it has to be assumed that they are based on comparable diagnostic criteria, providing equally precise estimates of abundance. If this is indeed true, then the pairing of data is likely to be more reliable because 'often we can greatly increase the precision by making comparisons within matched pairs of experimental material' [37]. As a result of these considerations, average values were, when possible, drawn over longer periods for all variables to establish mean morbidity and mortality (Table 1). In calculating mortality rates from WHO [16], absent values were accepted as zero values, because it is typically the countries with few cases that in some years are not reported. The mean annual incidence and mortality were calculated as the total number of recorded cases divided by the range of years included. Population numbers for calculation of incidence and mortality rates originated from GHO [14], because WHO [16] had low coverage for recent years.
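The averaging rule described above, and the effect of the choice in point (ii), can be sketched as follows. This is a minimal illustration and not the extraction code used for the study: the function name, the `missing_as_zero` switch and the example counts are hypothetical, and the "range of years" is assumed to mean the number of calendar years in the inclusive period.

```python
from typing import Mapping, Optional


def mean_annual_rate(cases_by_year: Mapping[int, Optional[int]],
                     first_year: int, last_year: int,
                     population: float, per: float = 100_000,
                     missing_as_zero: bool = True) -> float:
    """Mean annual rate per `per` inhabitants over an inclusive year range.

    missing_as_zero=True  : years without a report count as zero cases.
    missing_as_zero=False : years without a report are left out of the mean.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    reported = [cases_by_year.get(y) for y in range(first_year, last_year + 1)]
    observed = [c for c in reported if c is not None]
    n_years = len(reported) if missing_as_zero else len(observed)
    if n_years == 0:
        raise ValueError("no years to average over")
    return sum(observed) / n_years / population * per


# Hypothetical country: two reporting years, two silent years.
reports = {2001: 120, 2002: None, 2004: 60}
as_zero = mean_annual_rate(reports, 2001, 2004, population=5_000_000)
as_absent = mean_annual_rate(reports, 2001, 2004, population=5_000_000,
                             missing_as_zero=False)
print(as_zero, as_absent)
```

In this made-up example the two conventions differ by a factor of two, which is why the choice has to be made explicitly and applied consistently across countries.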


Independent variables

A number of population attributes were retrieved from the GHO [14]: crude birth rate (CBR, births/population size), death rate (DR), median age of the population (MA), mean body mass index (BMI) and proportion living in urban areas (PROPU); tuberculosis vaccine coverage (VAC) was retrieved from WHO [18] (Table 2). Each of the chosen variables refers to specific attributes of human populations that have previously been hypothesized to be associated with changes in CFR or pathogen virulence. DRs affect virulence because shorter host longevity favors those parasites that leave the host earlier, i.e. are transmitted more quickly [5,11,12]. Median age of the population represents the possible effect of risk of mortality varying with age [38][39][40], and BMI the possible effect of risk of mortality varying with nutritional status [41]. Finally, the proportion of people living in urban areas (PROPU) should allow the observation of density dependence, while the proportion of vaccinated individuals (VAC) should capture the effect of protective immunity and herd effects [5]. Certainly other hypotheses and variables are relevant, but since several of these are not easily considered, e.g. the possible interaction between infectious diseases [42], we accepted that these five alternative variables represented an acceptable statistical challenge to the serial passage hypothesis.

CBR was log_e-transformed to allow for diminishing returns. DR was calculated as (1/life expectancy at birth) × 100, which is less influenced by population age structure than the crude death rate (CDR, deaths/population size). The calculated DR scaled in the same numerical interval as log_e(CBR), i.e. from 1 to 4, and thus model estimates should be directly comparable. PROPU was here accepted as a measure of population density that would be comparable across nations without any transformation, and the remaining variables were likewise included untransformed.
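The two transformations can be written out as a short sketch. The function names and example values are hypothetical. Expressing CBR as births per 1000 inhabitants is an assumption made here because it is the unit for which log_e(CBR) falls in the stated interval of roughly 1 to 4.

```python
import math


def death_rate_from_life_expectancy(life_expectancy_years: float) -> float:
    """DR = (1 / life expectancy at birth) * 100."""
    if life_expectancy_years <= 0:
        raise ValueError("life expectancy must be positive")
    return 100.0 / life_expectancy_years


def log_birth_rate(cbr_per_1000: float) -> float:
    """Natural log of the crude birth rate (assumed unit: births per 1000)."""
    if cbr_per_1000 <= 0:
        raise ValueError("birth rate must be positive")
    return math.log(cbr_per_1000)


# Hypothetical populations: (life expectancy in years, CBR per 1000).
for life_expectancy, cbr in [(80.0, 10.0), (50.0, 40.0)]:
    dr = death_rate_from_life_expectancy(life_expectancy)
    print(f"LE={life_expectancy:.0f}: DR={dr:.2f}, ln(CBR)={log_birth_rate(cbr):.2f}")
```

Under these assumptions, a life expectancy between 25 and 100 years maps to a DR between 4 and 1, the same interval as the log-transformed birth rate.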
Cross-correlation analysis for examination of potential confounding was performed with PROC CORR (SAS 9.3, SAS Institute) on the independent variables, and correlations between CFR, morbidity and mortality rates were computed on log_e-transformed values, also with PROC CORR. Finally, CFRs were analysed with the PROC GENMOD procedure (distribution = binomial, link = logit) using the DSCALE option to retrieve conservative estimates for P values. Here, mean annual morbidity and mortality were rounded to natural numbers to comply with the underlying assumption of the statistical procedure (PROC GENMOD, SAS 9.3, SAS Institute), and P values <0.01 were accepted as relevant for predictive purposes. We did not include interactions between the six independent variables, because these would have produced outputs with no clear biological meaning, e.g. the interaction between urban proportion and vaccination. Neither did we reduce the statistical models, because the cross-correlation analysis indicated that the independent variables were highly correlated (all 15 tests had P < 0.001, Table 3). Hence, we only identified significant effects when these were observed in models adjusted for the effects of other relevant variables. Multiple evaluations with various combinations of the independent variables were executed, leading to many different outcomes. While considerable differences were noted among models, it was also noted that differences between diseases remained much the same and the associated conclusions similar. The following presentation includes the model with the greatest number of variables, which was deemed more realistic than the models with fewer independent variables. Given that the analysis of global records was less than comprehensive, we checked the validity of the main predictors that were identified. This was done by examining the correlation between disease occurrence [43][44][45] and vital statistics in Denmark [46].

Table note: Values represent means over the specified period for the given number of countries (N).
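To make the modelling step concrete, the sketch below fits a binomial model with a logit link to deaths out of cases using a single predictor, then inflates the standard error by the square root of deviance/degrees of freedom, which is the kind of adjustment a deviance-based scale option applies. It is a simplified stand-in written in plain Python, not the SAS analysis reported here: the study used six predictors, and the function names and the country data below are hypothetical.

```python
import math


def _score_and_information(b0, b1, x, deaths, cases):
    """Score vector and 2x2 Fisher information for logit(p) = b0 + b1*x."""
    u0 = u1 = i00 = i01 = i11 = 0.0
    for xi, yi, ni in zip(x, deaths, cases):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        resid = yi - ni * p           # observed minus expected deaths
        w = ni * p * (1.0 - p)        # binomial variance of the death count
        u0 += resid
        u1 += resid * xi
        i00 += w
        i01 += w * xi
        i11 += w * xi * xi
    return (u0, u1), (i00, i01, i11)


def _xlogy(a, b):
    """a * ln(a / b), with the convention 0 * ln(0) = 0."""
    return a * math.log(a / b) if a > 0 else 0.0


def fit_cfr_logit(x, deaths, cases, max_iter=100, tol=1e-10):
    """Newton-Raphson fit; returns estimates and deviance-scaled SE of the slope."""
    if len(x) < 3:
        raise ValueError("need at least three observations")
    b0 = b1 = 0.0
    for _ in range(max_iter):
        (u0, u1), (i00, i01, i11) = _score_and_information(b0, b1, x, deaths, cases)
        det = i00 * i11 - i01 * i01
        step0 = (i11 * u0 - i01 * u1) / det   # first row of I^-1 times score
        step1 = (i00 * u1 - i01 * u0) / det   # second row of I^-1 times score
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    _, (i00, i01, i11) = _score_and_information(b0, b1, x, deaths, cases)
    se_model = math.sqrt(i00 / (i00 * i11 - i01 * i01))
    deviance = 0.0
    for xi, yi, ni in zip(x, deaths, cases):
        mu = ni / (1.0 + math.exp(-(b0 + b1 * xi)))
        deviance += 2.0 * (_xlogy(yi, mu) + _xlogy(ni - yi, ni - mu))
    dispersion = max(deviance, 0.0) / (len(x) - 2)   # deviance / residual df
    se_scaled = se_model * math.sqrt(dispersion)
    # Two-sided Wald P value from the normal approximation.
    p_value = (math.erfc(abs(b1 / se_scaled) / math.sqrt(2.0))
               if se_scaled > 0 else float("nan"))
    return {"intercept": b0, "slope": b1, "se_model": se_model,
            "dispersion": dispersion, "se_scaled": se_scaled, "p_value": p_value}


# Hypothetical countries: ln(CBR), rounded mean annual deaths and cases.
ln_cbr = [2.3, 2.7, 3.0, 3.4, 3.7]
deaths = [1, 2, 9, 14, 20]
cases = [5000, 3000, 4000, 2000, 1000]
fit = fit_cfr_logit(ln_cbr, deaths, cases)
print(fit)
```

Rounding deaths and cases to whole numbers, as described above, keeps the response a valid binomial count. With real, overdispersed records the dispersion estimate typically exceeds 1, so the scaled standard errors are wider and the P values more conservative than the unscaled ones.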


RESULTS

Mumps was observed to have a highly variable CFR, ranging from nearly 10⁻⁶ to nearly 1 (Fig. 1a), and the CFR was significantly correlated with both incidence and mortality rate [Pearson's correlation coefficient (PCC) for n = 34: PCC = −0.85, P = 0.001 and PCC = 0.69, P = 0.001, respectively], suggesting that both measures were contributing to the variability in mumps CFR. Malaria CFR ranged from ~10⁻³ to nearly 1 (Fig. 1b), and the CFR was only correlated with mortality rates (PCC for n = 68: PCC = 0.01, P = 0.94 and PCC = 0.66, P = 0.001, for morbidity and mortality rates, respectively). The range of tuberculosis CFR was quite narrow (Fig. 1c) but had significant correlation with mortality rates (PCC for n = 78: PCC = 0.15, P = 0.20 and PCC = 0.64, P = 0.001, for morbidity and mortality rates, respectively). Lastly, leptospirosis CFR ranged from ~10⁻⁵ to 10⁻², which correlated with both morbidity and mortality rates (PCC for n = 48: PCC = −0.32, P = 0.03 and PCC = 0.59, P = 0.001, for morbidity and mortality rates, respectively). Overall, the four diseases appeared to have very different epidemiological patterns.
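The correlation coefficients above were computed on log_e-transformed values. A minimal sketch of that calculation is given below; the function name and the example numbers are hypothetical and are not taken from the study data.

```python
import math


def pearson_on_logs(a, b):
    """Pearson correlation between ln(a) and ln(b); all values must be > 0."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    la = [math.log(v) for v in a]
    lb = [math.log(v) for v in b]
    mean_a = sum(la) / len(la)
    mean_b = sum(lb) / len(lb)
    s_ab = sum((p - mean_a) * (q - mean_b) for p, q in zip(la, lb))
    s_aa = sum((p - mean_a) ** 2 for p in la)
    s_bb = sum((q - mean_b) ** 2 for q in lb)
    return s_ab / math.sqrt(s_aa * s_bb)


# Hypothetical incidence rates (per 100 000) and CFRs for five countries.
incidence = [2.0, 15.0, 90.0, 400.0, 2500.0]
cfr = [0.2, 0.05, 0.004, 0.001, 0.0001]
print(round(pearson_on_logs(incidence, cfr), 3))
```

Countries with a zero rate cannot be log-transformed, so they have to be excluded or handled separately before this step.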

Birth rate (CBR) and DR were significantly correlated with CFR for several diseases (Table 4), and mumps and malaria clearly differed from tuberculosis and leptospirosis by having positive effects of CBR and sizable negative intercepts.

DISCUSSION

The observed correlations might be uncertain because the underlying reporting systems are biased and provide imperfect or flawed indications of the specific epidemiological pattern for the given diseases, e.g. underreporting of mild cases. Still, it seems unlikely that such shortcomings could account for the major differences in CFR ranges, because this requires not only that a few observations are flawed, but that they all are. Similar arguments are valid for the generalized linear model for CFR, which due to both confounding factors and an absence of relevant variables could be misleading as to the importance of the independent variables. While suspicions of poor accuracy and biases are easily imagined, they are also difficult to prove, and the main method of assessing the validity of the results is therefore to assess the predictive value of the models. Since the main predictor for changes in CFR was clearly identified in all diseases (Table 4) and correlations with incidence rates were characterized, the simplest method of evaluation was to assess these results for consistency with historical records of the disease and its predictor.

For mumps it would appear that CBR predicts variation in CFR (Table 1) and that CFR corresponds to changes in the incidence rate (Fig. 1a). It therefore follows that incidence rates would be negatively correlated with birth rates. This appears to be consistent with Danish crude incidence rates for mumps from 1901 to 1960, which in general increased over the period (Fig. 1d). Mumps thus appears to be a highly predictable disease, which is tightly associated with variation in birth rates. It should be noted that CFR variation as observed in mumps is often accepted as a measure of virulence variation, but in the present case we cannot argue that the variation in CFR depends only on parasite characteristics, since age distributions and associated differences in average immune competence across populations [38][39][40] are confounded with differences in CBR. However, even if differences in age structure across populations were a substantial contributor to variation in CFR, they would not necessarily explain all the observed variation (Fig. 1a). The comprehensive review of measles CFR variation given by Wolfson et al. [47] indicates no clear variation in CFR with age, while a few studies that evaluated children under the age of 5 years showed a decreasing trend with age. The data for measles indicate a 2- to 3-fold difference among age groups which, if paralleled in mumps, would mean that perhaps half the observed variation is attributable to differences in age structure, while the remainder is associated with differences in mumps virus virulence. We would therefore tentatively accept that 'virulence' also changed according to birth rate variation [11] and other factors, such as those indicated in Table 4.

The correlation between CFR and incidence rates for malaria (Fig. 1b) was less clear than for mumps (Fig. 1a), presumably because malaria is caused by several species and is transmitted by a number of different vectors across the world [25]. The difficulties in documenting a correlation between CFR and incidence rates also arise from the fact that not all people in the given countries are exposed, and hence that the incidence measure is severely biased in some countries. Even so, a cautious interpretation would lead to the conclusion that mumps and malaria differ fundamentally in their relationship between CFR and incidence rate, and comparing the temporal development of mumps and malaria in Denmark from 1901 to 1960 (Fig. 1d and e) would also suggest that the underlying principles are dissimilar, in spite of the similar results in the analysis of CFR (Table 4). The more diverse biology of malaria and the inaccurate incidence rates are, however, not the only feasible explanation for the lack of clear correlation between CFR and incidence rate (Fig. 1b), since it may be the unavoidable by-product of a limited range in CFR in malaria. Neither do the diverging patterns of mumps and malaria over the first half of the 20th century necessarily mean that the underlying drivers of CFRs differ (Fig. 1d and e). It should be noted that the lower end of the malaria CFR range is relatively high (Fig. 1b), and that this could be indicative of a lower virulence threshold for disease maintenance, i.e. mumps can be transmitted through simple contact as it does not require any particular dose or level of virulence, while malaria at a low inoculation dose in humans [48], or due to other constraints, would fail to infect humans. This would be consistent with the variation in global malaria incidence rates, which are lower than 250 cases per 100 000 inhabitants where national birth rates are less than 25 births per 1000 inhabitants, and with malaria typically being absent where birth rates are <20. A lower threshold would account for the disappearance of malaria in Denmark in 1911-41 and from 1945 to the present. Finally, transmission constraints and a threshold in malaria transmission would explain why the two human-to-human transmitted diseases have the same dependency on birth rates in CFR (Table 4), but different resulting occurrence: mumps has a negative correlation, while we suspect that malaria within its limited distribution has a positive correlation (Fig. 1b and d, and ignoring the eight highest CFR values in 1d). A number of authors have also argued that demographic change caused the natural eradication of malaria in European countries under temperatures that could sustain malaria transmission, and thus this suggestion has also been reached by more conventional epidemiological studies [26][27][28][29].

Several of these authors emphasized that demographic change, and not variation in mosquito abundance, was the better predictor of the natural eradication of malaria in northern Europe, because it also allowed for the historical occurrence of malaria north of the Arctic Circle, which is not encompassed by conventional ideas on malaria epidemiology.

The epidemiological pattern in tuberculosis differed from mumps and malaria in all aspects that we would expect from the differences in their biology. Interpreting these results requires more caution than for mumps and malaria, because for tuberculosis there is a less strict temporal correlation between infection and death. Still, a correlation with CDR is not surprising, since historically tuberculosis has been associated with periods of poor living conditions and appeared to recede gradually in Europe in the early part of the 20th century as living conditions improved [33,34]. Here, we cautiously note that indices of tuberculosis CFR in Denmark (calculated as number of deaths/number of treated) track the mortality rate (with a 20-year delay, Fig. 1f), as would be inferred from the global analysis and age-related risk [32]. Displacing the data by 20 years for this disease suggests that not only the place of birth [35] but also the time of birth is a significant contributor to the risk of tuberculosis. Importantly, vaccination was not a major contributor to this early development, since systematic vaccination in Denmark began in the 1940s [49], which would affect the number of recorded cases a few decades later [32].

Because leptospirosis was mainly characterized by a relatively high positive association with population density (PROPU), probably due to differences in Leptospira reservoirs in urban and rural settings [36], it appeared that the main predictors for all four diseases fell within particular hypotheses. This conclusion may be difficult to accept, since we strongly suspect that the underlying data are unreliable, but we argue that we can place some trust in the results because (i) the analysis incorporates paired-record analysis, which minimizes disease-specific bias and 'passively' removes national records that are not systematically monitoring a given disease, i.e. providing records on both morbidity and mortality, and (ii) the effect size spans such a scale (e.g. 10⁻⁶ to 1) that even a 5-fold bias in the underlying records is of little consequence.
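Point (ii) can be checked with a line of arithmetic on a logarithmic scale. The sketch below is only an illustration of that argument; the function name is hypothetical, and the comparison assumes that bias and CFR range are both best judged in orders of magnitude.

```python
import math


def bias_share_of_log_range(bias_fold: float, low: float, high: float) -> float:
    """Fraction of the log10 range [low, high] covered by a bias of `bias_fold`."""
    if bias_fold <= 0 or low <= 0 or high <= low:
        raise ValueError("require bias_fold > 0 and 0 < low < high")
    return math.log10(bias_fold) / (math.log10(high) - math.log10(low))


# A 5-fold bias against a CFR range of 1e-6 to 1 (six orders of magnitude).
share = bias_share_of_log_range(5.0, 1e-6, 1.0)
print(f"{share:.1%} of the range")
```

A 5-fold error corresponds to about 0.7 orders of magnitude, i.e. roughly 12% of a six-order range, whereas the same error would span a much larger share of a narrow CFR range such as that seen for tuberculosis.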


CONCLUSIONS AND IMPLICATIONS

We propose that the idea of serial passage and CFR variation should be reconsidered, because demographic changes appeared to accurately predict CFR variation in mumps and malaria in a manner consistent with the hypothesis. Since infectious disease diversity is also strongly correlated with human fertility, it could be argued that this applies to many diseases [50], which further qualifies the hypothesis. This hypothesis would also in very simple terms explain why Europeans in just a f