Hagai Levine, Niels Jørgensen, Anderson Martino-Andrade, Jaime Mendiola, Dan Weksler-Derri, Irina Mindlis, Rachel Pinotti, Shanna H Swan, Temporal trends in sperm count: a systematic review and meta-regression analysis, Human Reproduction Update, Volume 23, Issue 6, November-December 2017, Pages 646–659, https://doi.org/10.1093/humupd/dmx022
Abstract
Reported declines in sperm counts remain controversial today and recent trends are unknown. A definitive meta-analysis is critical given the predictive value of sperm count for fertility, morbidity and mortality.
To provide a systematic review and meta-regression analysis of recent trends in sperm counts as measured by sperm concentration (SC) and total sperm count (TSC), and their modification by fertility and geographic group.
PubMed/MEDLINE and EMBASE were searched for English language studies of human SC published in 1981–2013. Following a predefined protocol, 7518 abstracts were screened and 2510 full articles reporting primary data on SC were reviewed. A total of 244 estimates of SC and TSC from 185 studies of 42 935 men who provided semen samples in 1973–2011 were extracted for meta-regression analysis, as well as information on years of sample collection and covariates [fertility group (‘Unselected by fertility’ versus ‘Fertile’), geographic group (‘Western’, including North America, Europe, Australia and New Zealand versus ‘Other’, including South America, Asia and Africa), age, ejaculation abstinence time, semen collection method, method of measuring SC and semen volume, exclusion criteria and indicators of completeness of covariate data]. The slopes of SC and TSC were estimated as functions of sample collection year using both simple linear regression and weighted meta-regression models, and the latter were adjusted for pre-determined covariates and modification by fertility and geographic group. Assumptions were examined using multiple sensitivity analyses and nonlinear models.
SC declined significantly between 1973 and 2011 (slope in unadjusted simple regression models −0.70 million/ml/year; 95% CI: −0.72 to −0.69; P < 0.001; slope in adjusted meta-regression models = −0.64; −1.06 to −0.22; P = 0.003). The slopes in the meta-regression model were modified by fertility (P for interaction = 0.064) and geographic group (P for interaction = 0.027). There was a significant decline in SC between 1973 and 2011 among Unselected Western (−1.38; −2.02 to −0.74; P < 0.001) and among Fertile Western (−0.68; −1.31 to −0.05; P = 0.033), while no significant trends were seen among Unselected Other and Fertile Other. Among Unselected Western studies, the mean SC declined, on average, 1.4% per year with an overall decline of 52.4% between 1973 and 2011. Trends for TSC and SC were similar, with a steep decline among Unselected Western (−5.33 million/year, −7.56 to −3.11; P < 0.001), corresponding to an average decline in mean TSC of 1.6% per year and overall decline of 59.3%. Results changed minimally in multiple sensitivity analyses, and there was no statistical support for the use of a nonlinear model. In a model restricted to data post-1995, the slope both for SC and TSC among Unselected Western was similar to that for the entire period (−2.06 million/ml, −3.38 to −0.74; P = 0.004 and −8.12 million, −13.73 to −2.51, P = 0.006, respectively).
This comprehensive meta-regression analysis reports a significant decline in sperm counts (as measured by SC and TSC) between 1973 and 2011, driven by a 50–60% decline among men unselected by fertility from North America, Europe, Australia and New Zealand. Because of the significant public health implications of these results, research on the causes of this continuing decline is urgently needed.
Introduction
Have sperm counts declined? This question remains as controversial today as in 1992 when Carlsen et al. (1992) wrote that: ‘There has been a genuine decline in semen quality over the past 50 years’. This controversy has continued unabated both because of the importance of the question and limitations in studies that have attempted to address it (Swan et al., 2000; Safe, 2013; Te Velde and Bonde, 2013).
Sperm count is of considerable public health importance for several reasons. First, sperm count is closely linked to male fecundity and is a crucial component of semen analysis, the first step to identify male factor infertility (World Health Organization, 2010; Wang and Swerdloff, 2014). The economic and societal burden of male infertility is high and increasing (Winters and Walsh, 2014; Hauser et al., 2015; Skakkebaek et al., 2016). Second, reduced sperm count predicts increased all-cause mortality and morbidity (Jensen et al., 2009; Eisenberg et al., 2014b, 2016). Third, reduced sperm count is associated with cryptorchidism, hypospadias and testicular cancer, suggesting a shared prenatal etiology (Skakkebaek et al., 2016). Fourth, sperm count and other semen parameters have been plausibly associated with multiple environmental influences, including endocrine disrupting chemicals (Bloom et al., 2015; Gore et al., 2015), pesticides (Chiu et al., 2016), heat (Zhang et al., 2015) and lifestyle factors, including diet (Afeiche et al., 2013; Jensen et al., 2013), stress (Gollenberg et al., 2010; Nordkap et al., 2016), smoking (Sharma et al., 2016) and BMI (Sermondade et al., 2013; Eisenberg et al., 2014a). Therefore, sperm count may sensitively reflect the impacts of the modern environment on male health throughout the life course (Nordkap et al., 2012).
Given this background, we conducted a rigorous and complete systematic review and meta-regression analysis of recent trends in sperm count as measured by sperm concentration (SC) and total sperm count (TSC), and their modification by fertility and geographic group.
Methods
This systematic review and meta-regression analysis was conducted and the results reported in accordance with MOOSE (Meta-analysis in Observational Studies in Epidemiology) (Stroup et al., 2000) and PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analysis) guidelines (Liberati et al., 2009; Moher et al., 2009) (checklists are available upon request from the corresponding author). Our research team included epidemiologists, andrologists and a qualified medical librarian, and consulted an expert in meta-analysis. Our predefined protocol, detailed in Supplementary Information, was developed following best practices (Borenstein et al., 2009; Higgins and Green, 2011; Program NT, 2015), and informed by two pilot studies, the first using all publications from 1996 and the second all publications from 1981 and 2013.
Systematic review
The goal of the search was to identify all articles that reported primary data on human sperm count. We searched MEDLINE on November 21, 2014 and Embase (Excerpta Medica database) on December 10, 2014 for peer-reviewed, English-language publications. Following the recommendation of the Cochrane Handbook for Systematic Reviews, we searched in title and abstract for both index (MeSH) terms and keywords and filtered out animal-only studies. We used the MeSH term ‘sperm count’, which includes seven additional terms, and to increase sensitivity we added 13 related keywords (e.g. ‘sperm density’ and ‘sperm concentration’). We included all publications between January 1, 1981 (the first full year after the term ‘Sperm Count’ was added to MEDLINE as a MeSH term) and December 31, 2013 (the last full year at the time we began our MEDLINE search).
All studies that reported primary data on human SC were considered eligible for abstract screening. We evaluated the eligibility of all subgroups within a study. For example, in a case-control study, the control group might have been eligible for inclusion even though, based on our exclusion criteria, the case group was not.
We divided eligible studies into two fertility-defined groups: men unselected by fertility status, hereafter ‘Unselected’ (e.g. young men unlikely to be aware of their fertility such as young men screened for military service or college students); and fertile men, hereafter ‘Fertile’ (e.g. men who were known to have conceived a pregnancy, such as fathers or partners of pregnant women regardless of pregnancy outcome).
A study was excluded if study participants were selected based on: infertility or sub-fertility; range of semen parameters (e.g. studies selecting normospermic men); genital abnormalities, other diseases or medication. We also excluded studies limited to men with exposures that may affect fertility such as occupational exposure, post-intervention or smoking. Studies of candidates for vasectomy or semen donation were included only if semen quality was not a criterion for men's study participation. Studies with fewer than 10 men and those that used non-standard methods to collect or count sperm (e.g. methods other than masturbation for collection, or methods other than hemocytometer for counting) were also excluded.
First, based on the title and abstract the publication was either excluded or advanced to full text screening. Any publication without an abstract was automatically referred for full text screening. Second, we reviewed the full text and assigned it to exclusion within a specific category, or data extraction. We then confirmed study eligibility and identified multiple publications from the same study to ensure that estimates from the same population were not used more than once.
Data extraction
We extracted summary statistics on SC and TSC (mean, SD, SE, minimum, maximum, median, geometric mean and percentiles), mean or additional data on semen volume, sample size (for SC and for TSC), sample collection years and covariates: fertility group, country, age, ejaculation abstinence time, methods of semen collection, methods of assessing SC and semen volume, selection of population and study exclusion criteria as well as number of samples per man. The range of permissible values, both for categorical and numerical variables, and information on data completeness were recorded. Data were extracted on all eligible subgroups separately as well as for the total population, if relevant. We attempted to extract data on additional potential confounders such as BMI, smoking and other lifestyle factors (e.g. alcohol and stress). However, except for smoking (which was examined in sensitivity analysis), data were available for such variables in only a minority of studies, so these were not included in meta-regression analyses.
Quality control
The study was conducted following a predefined protocol (Supplementary Information). Screening for this extensive systematic review was conducted by a team of eight reviewers (H.L., N.J., A.M.A., J.M., D.W.D., I.M., J.D.M., S.H.S.). The screening protocol was piloted by screening of 50 abstracts by all reviewers followed by a comparison of results, resolution of any inconsistencies and clarification of the protocol as needed. The same quality control process was followed for full text screening (35 studies reviewed by all reviewers) and data extraction (data extracted from three studies by all reviewers). All data were entered into digital spreadsheets with explicit permissible values (no open-ended entries) to increase consistency. After data extraction, an additional round of data editing and quality control of all studies was conducted by H.L. The process ensured that each study was evaluated by at least two different reviewers.
Statistical analysis
We used point estimates of mean SC or mean TSC from individual studies to model time trends during the study period, as measured by slope of SC or TSC per calendar year. The midpoint of the sample collection period was the independent variable in all analyses. Units were million/ml for SC and million for TSC (defined as SC × sample volume) and all slopes denote unit change per calendar year.
We first used simple linear regression models to estimate SC and TSC as functions of year of sample collection, with each study weighted by sample size. We then used random-effects meta-regression to model both SC and TSC as linear functions of time, weighting studies by the SE. In all meta-regression analyses, we included indicator variables to denote studies with more than one SC estimate. We controlled for a pre-determined set of potential confounders: fertility group, geographic group, age, abstinence time, whether semen collection and counting methods were reported, number of samples per man and indicators for exclusion criteria (Supplementary Table S1).
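As an illustration of the first step only, the sample-size-weighted simple linear regression can be sketched in Python with NumPy. The function name `weighted_slope`, its `ref_year` argument and the four data points are hypothetical: the data are synthetic values placed exactly on a line, not estimates from the included studies, and the sketch does not reproduce the random-effects meta-regression that was fitted in Stata.

```python
import numpy as np

def weighted_slope(years, means, weights, ref_year=1973):
    """Weighted least-squares line through study means.

    Returns (fitted value at ref_year, slope per calendar year).
    """
    x = np.asarray(years, dtype=float) - ref_year
    y = np.asarray(means, dtype=float)
    w = np.asarray(weights, dtype=float)
    # Scaling each row by sqrt(w) turns ordinary least squares into
    # weighted least squares with weights w on the squared residuals.
    root_w = np.sqrt(w)
    design = np.column_stack([np.ones_like(x), x]) * root_w[:, None]
    coef, *_ = np.linalg.lstsq(design, y * root_w, rcond=None)
    return float(coef[0]), float(coef[1])

# Synthetic example: midpoint years, mean SC (million/ml), sample sizes.
years = [1973, 1985, 1995, 2011]
means = [92.8 - 0.70 * (year - 1973) for year in years]
sizes = [50, 200, 120, 300]
intercept, slope = weighted_slope(years, means, sizes)
print(round(intercept, 1), round(slope, 2))
```

Because the synthetic points lie on a single line, the weights do not change the fitted slope here; with real study means, larger studies would pull the line towards their values.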
For several key variables missing values were estimated and a variable was included in meta-regression analyses to denote that the value had been estimated. For example, for studies that reported median (not mean) SC or TSC, we estimated the mean by adding the average difference between the mean and median in studies for which both were reported. For studies that did not report the range or midpoint year of sample collection, the midpoint was estimated by subtracting the average difference between year of publication and midpoint year of sample collection in studies for which both were reported from publication year. When SD but not SE of SC or TSC were reported, the SE was calculated by dividing the SD by the square root of sample size for each estimate. For studies that did not report SD or SE, we estimated SE by dividing the mean SD of studies that reported SD by the square root of sample size for this estimate. If mean TSC was not reported it was calculated by multiplying mean SC by mean semen volume (Supplementary Information).
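The estimation rules in this paragraph are simple arithmetic and can be written out as a minimal Python sketch. The function names and the example values (a study of 100 men with an SD of 60 million/ml, a mean SC of 80 million/ml and a mean volume of 3.0 ml) are hypothetical and are not taken from any included study.

```python
import math

def se_from_sd(sd, n):
    """SE of a mean from its SD and sample size.

    For studies reporting neither SD nor SE, the text uses the mean SD
    of the studies that did report SD in place of `sd`.
    """
    return sd / math.sqrt(n)

def mean_from_median(median, avg_mean_minus_median):
    """Estimate a mean by adding the average (mean - median) difference
    observed in studies that reported both."""
    return median + avg_mean_minus_median

def midpoint_from_publication_year(publication_year, avg_lag_years):
    """Estimate the collection midpoint by subtracting the average lag
    between collection midpoint and publication year."""
    return publication_year - avg_lag_years

def tsc_from_sc(mean_sc, mean_volume_ml):
    """Mean TSC (million) from mean SC (million/ml) and mean volume (ml)."""
    return mean_sc * mean_volume_ml

# Hypothetical study values, for illustration only.
print(se_from_sd(60.0, 100))   # 6.0
print(tsc_from_sc(80.0, 3.0))  # 240.0
```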
Our final analyses included two groups of countries. One group (referred to here as ‘Western’) includes studies from North America, Europe, Australia and New Zealand. The second (‘Other’/‘Non-Western’) includes studies from all other countries (from South America, Asia and Africa). We initially examined studies from North America separately from Europe/Australia but combined these because trends were similar and only 16% of estimates were from North America. We assessed modification of slope by fertility group (Unselected versus Fertile) and geographic group (Western versus Other). Because of significant modification by fertility and geography, results of models with interaction terms are presented for four categories: Unselected Western; Fertile Western; Unselected Other; and Fertile Other. Overall percentage declines were calculated by estimating the sperm count (SC or TSC) in the first and last year of data collection, and dividing the difference by the estimate in the first year. The percentage decline per year was calculated by dividing the overall percentage declines by the number of years.
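The percentage calculations described above can be checked with a short Python sketch. The function name `percent_decline` is hypothetical; the inputs are the rounded first-year and last-year model estimates for the Unselected Western group reported in the Results, so the outputs agree with the published percentages only to the reported precision.

```python
def percent_decline(first_value, last_value, first_year, last_year):
    """Overall percentage decline and average percentage decline per year."""
    overall = (first_value - last_value) / first_value * 100
    per_year = overall / (last_year - first_year)
    return overall, per_year

# Unselected Western, SC (million/ml): 99.0 in 1973 to 47.1 in 2011.
sc_overall, sc_per_year = percent_decline(99.0, 47.1, 1973, 2011)
# Unselected Western, TSC (million): 337.5 in 1973 to 137.5 in 2011.
tsc_overall, tsc_per_year = percent_decline(337.5, 137.5, 1973, 2011)

print(round(sc_overall, 1), round(sc_per_year, 1))    # 52.4 1.4
print(round(tsc_overall, 1), round(tsc_per_year, 1))  # 59.3 1.6
```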
We ran all analyses for TSC weighting by SE of TSC and adjusted for method used to assess semen volume: weighing, read from pipette, read from tube or other.
We conducted several sensitivity analyses: adding cubic and quadratic terms for year of sample collection in meta-regression analyses to assess non-linearity; excluding a specific group for each covariate, such as a group with incomplete information; removing covariates one at a time from the model; removing studies with SEs > 20 million/ml; replacing age group by mean age, excluding studies that did not report mean age; adding a covariate for high smoking prevalence (>30%); excluding countries that contributed the greatest number of estimates in order to examine the influence of these countries; and restricting analyses to studies with data collected after 1985 and after 1995 to examine recent trends.
All analyses were conducted using STATA version 14.1 (StataCorp, TX, USA). A value of P < 0.05 was considered significant for main effects and P < 0.10 for interactions.
Results
Systematic review and summary statistics
Using PubMed and Embase searches we identified 7518 publications meeting our criteria for abstract screening (Fig. 1). Of these, 14 duplicate records were removed and 4994 were excluded based on title or abstract screening. Full texts of the remaining 2510 articles were reviewed for eligibility and 2179 studies were excluded. Of the remaining 331 articles, 146 were excluded during data extraction and the second round of full text screening (mainly due to multiple publications). The meta-regression analysis is based on the remaining 185 studies, which included 244 unique mean SC estimates based on samples collected between 1973 and 2011 from 42 935 men. Data were available from 6 continents and 50 countries. The mean SC was 81 million/ml, the mean TSC was 260 million and the mean year of sample collection was 1995. Of the 244 estimates, 110 (45%) were Unselected Western, 65 (27%) Fertile Western, 30 (12%) Unselected Other and 39 (16%) Fertile Other. Data from the 185 publications included in the meta-analysis are available upon request from the corresponding author (Abyholm, 1981; Fariss et al., 1981; Leto and Frensilli, 1981; Wyrobek et al., 1981a,b; Aitken et al., 1982; Nieschlag et al., 1982; Obwaka et al., 1982; Albertsen et al., 1983; Fowler and Mariano, 1983; Sultan Sheriff, 1983; Wickings et al., 1983; Asch et al., 1984; de Castro and Mastrorocco, 1984; Fredricsson and Sennerstam, 1984; Freischem et al., 1984; Ward et al., 1984; Ayers et al., 1985; Heussner et al., 1985; Rosenberg et al., 1985; Aribarg et al., 1986; Comhaire et al., 1987; Kirei, 1987; Giblin et al., 1988; Kjaergaard et al., 1988; Mieusset et al., 1988, 1995; Jockenhovel et al., 1989; Sobowale and Akiwumi, 1989; Svanborg et al., 1989; Zhong et al., 1990; Culasso et al., 1991; Dunphy et al., 1991; Gottlieb et al., 1991; Nnatu et al., 1991; Pangkahila, 1991; Weidner et al., 1991; Levine et al., 1992; Sheriff and Legnain, 1992; Ali et al., 1993; Arce et al., 1993; Bartoov et al., 1993; Fedder et al., 1993; Noack-Fuller et al., 1993; World Health Organization, 1993; Hill et al., 1994; Rehan, 1994; Rendon et al., 1994; Taneja et al., 1994; Vanhoorne et al., 1994; Auger et al., 1995; Cottell and Harrison, 1995; Figa-Talamanca et al., 1996; Fisch et al., 1996; Irvine et al., 1996; Van Waeleghem et al., 1996; Vierula et al., 1996; Vine et al., 1996; Auger and Jouannet, 1997; Jensen et al., 1997; Lemcke et al., 1997; Handelsman, 1997a,b; Chia et al., 1998; Muller et al., 1998; Naz et al., 1998; Gyllenborg et al., 1999; Kolstad et al., 1999; Kuroki et al., 1999; Larsen et al., 1999; Purakayastha et al., 1999; Reddy and Bordekar, 1999; De Celis et al., 2000; Glazier et al., 2000; Mak et al., 2000; Selevan et al., 2000; Wiltshire et al., 2000; Zhang et al., 2000; Foppiani et al., 2001; Guzick et al., 2001; Hammadeh et al., 2001; Jorgensen et al., 2001, 2002, 2011, 2012; Kelleher et al., 2001; Lee and Coughlin, 2001; Patankar et al., 2001; Tambe et al., 2001; Xiao et al., 2001; Costello et al., 2002; Junqing et al., 2002; Kukuvitis et al., 2002; Luetjens et al., 2002; Punab et al., 2002; Richthoff et al., 2002; Danadevi et al., 2003; de Gouveia Brazao et al., 2003; Firman et al., 2003; Liu et al., 2003; Lundwall et al., 2003; Roste et al., 2003; Serra-Majem et al., 2003; Uhler et al., 2003; Xu et al., 2003; Ebesunun et al., 2004; Rintala et al., 2004; Toft et al., 2004, 2005; Bang et al., 2005; Mahmoud et al., 2005; Muthusami and Chinnaswamy, 2005; O'Donovan, 2005; Tsarev et al., 2005, 2009; Durazzo et al., 2006; Fetic et al., 2006; Giagulli and Carbone, 2006; Haugen et al., 2006; Iwamoto et al., 2006, 2013a,b; Pal et al., 2006; Yucra et al., 2006; Aneck-Hahn et al., 2007; Garcia et al., 2007; Multigner et al., 2007; Plastira et al., 2007; Rignell-Hydbom et al., 2007; Wu et al., 2007; Akutsu et al., 2008; Bhattacharya, 2008; Gallegos et al., 2008; Goulis et al., 2008; Jedrzejczak et al., 2008; Kobayashi et al., 2008; Korrovits et al., 2008; Li and Gu, 2008; Lopez-Teijon et al., 2008; Paasch et al., 2008; Peters et al., 2008; Recabarren et al., 2008; Recio-Vega et al., 2008; Saxena et al., 2008; Shine et al., 2008; Andrade-Rocha, 2009; Kumar et al., 2009, 2011; Rylander et al., 2009; Stewart et al., 2009; Vani et al., 2009, 2012; Verit et al., 2009; Engelbertz et al., 2010; Hossain et al., 2010; Ortiz et al., 2010; Rubes et al., 2010; Tirumala Vani et al., 2010; Al Momani et al., 2011; Auger and Eustache, 2011; Axelsson et al., 2011; Brahem et al., 2011; Jacobsen et al., 2011; Khan et al., 2011; Linschooten et al., 2011; Venkatesh et al., 2011; Vested et al., 2011; Absalan et al., 2012; Al-Janabi et al., 2012; Katukam et al., 2012; Mostafa et al., 2012; Nikoobakht et al., 2012; Rabelo-Junior et al., 2012; Splingart et al., 2012; Bujan et al., 2013; Girela et al., 2013; Halling et al., 2013; Ji et al., 2013; Mendiola et al., 2013; Redmon et al., 2013; Thilagavathi et al., 2013; Valsa et al., 2013; Zalata et al., 2013; Zareba et al., 2013; Huang et al., 2014).
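The study-flow counts and group percentages reported in this paragraph are internally consistent, as the following Python sketch shows. The variable names are hypothetical; every number is taken from the text above.

```python
# Study flow, using the counts reported in the text.
identified = 7518
duplicates_removed = 14
excluded_on_abstract = 4994
full_text_reviewed = identified - duplicates_removed - excluded_on_abstract

excluded_on_full_text = 2179
eligible_articles = full_text_reviewed - excluded_on_full_text

excluded_at_extraction = 146
included_studies = eligible_articles - excluded_at_extraction

# Estimates per fertility and geographic group.
estimates = {
    "Unselected Western": 110,
    "Fertile Western": 65,
    "Unselected Other": 30,
    "Fertile Other": 39,
}
total_estimates = sum(estimates.values())
shares = {name: round(100 * n / total_estimates) for name, n in estimates.items()}

print(full_text_reviewed, eligible_articles, included_studies)  # 2510 331 185
print(total_estimates, shares)
```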

Figure 1. PRISMA flow chart showing the selection of studies eligible for meta-regression analysis.
Simple linear models
Combining results from all four groups of men, SC declined significantly (slope per year = −0.70 million/ml; 95% CI: −0.72 to −0.69; P < 0.001) over the study period when using simple linear models (unadjusted, weighted by sample size) (Fig. 2a). SC declined by 0.75% per year (95% CI: 0.73–0.77%) and overall by 28.5% between 1973 and 2011. A similar trend was seen for TSC (slope per year = −2.23 million; 95% CI: −2.31 to −2.16; P < 0.001) (Fig. 2b), corresponding to a decline in TSC of 0.75% per year (95% CI: 0.72–0.78%), and 28.5% overall. Semen volume (156 estimates) did not change significantly over the study period (slope per year = 0.0003 ml; 95% CI: −0.0003 to 0.0008; P = 0.382).

Figure 2. (a) Mean sperm concentration by year of sample collection in 244 estimates collected in 1973–2011 and simple linear regression. (b) Mean total sperm count by year of sample collection in 244 estimates collected in 1973–2011 and simple linear regression.

Figure 3. (a) Meta-regression model for mean sperm concentration by fertility and geographic groups, adjusted for potential confounders. (b) Meta-regression model for mean total sperm count by fertility and geographic groups, adjusted for potential confounders. Meta-regression model weighted by sperm concentration (SC) SE, adjusted for fertility group, time × fertility group interaction, geographic group, time × geographic group interaction, age, abstinence time, semen collection method reported, counting method reported, having more than one sample per man, indicators for study selection of population and exclusion criteria (some vasectomy candidates, some semen donor candidates, exclusion of men with chronic diseases, exclusion by other reasons not related to fertility, selection by occupation not related to fertility), whether year of collection was estimated, whether arithmetic mean of SC was estimated, whether SE of SC was estimated and indicator variable to denote studies with more than one estimate. Total sperm count (TSC) meta-regression models weighted by TSC SE, adjusted for similar covariates and method used to assess semen volume.
Meta-regression models
We ran meta-regression models, unadjusted and adjusted, with and without interaction terms for fertility and geographic groups (Supplementary Table S2). In the simple meta-regression model for SC, in which estimates were weighted by their SE but without covariate adjustment, slopes were similar to those for simple regression, but with wider CIs (SC slope = −0.68; −0.99 to −0.37; P < 0.001). Covariate adjustment did not appreciably alter the slope but widened the CI further (−0.64; −1.06 to −0.22; P = 0.003).
Slopes were significantly modified by the interaction of time with both fertility and geographic group. The three-way interaction term (time × fertility group × geographic group) was not significant (P = 0.57) and was not included in final models. In the final adjusted models for SC, which included two interaction terms [time × fertility group (P = 0.064) and time × geographic group (P = 0.027)], significant declines were seen among both Unselected Western (−1.38 million/ml/year, −2.02 to −0.74; P < 0.001) and Fertile Western (−0.68, −1.31 to −0.05; P = 0.033) (Table I, Fig. 3a), with a steeper slope for Unselected Western. Using estimates from the fully adjusted model of 99.0 million/ml in 1973 to 47.1 million/ml in 2011, SC in the Unselected Western group declined 1.4% per year and overall by 52.4% between 1973 and 2011.
Table I. Sperm concentration and total sperm count in first and last years of meta-regression analysis with percentage change and slope per year, for all men and by fertility and geographic groups (a).

Category | N (estimates) | First year | First year SC (million/ml) | Last year | Last year SC (million/ml) | Percentage change/year | Slope (95% CI), million/ml/year |
---|---|---|---|---|---|---|---|
All men | 244 | 1973 | 92.8 | 2011 | 66.4 | −0.75 | −0.70 (−0.72 to −0.69) |
Unselected Western | 110 | 1973 | 99.0 | 2011 | 47.1 | −1.40 | −1.38 (−2.02 to −0.74) |
Fertile Western | 65 | 1977 | 83.8 | 2009 | 62.0 | −0.81 | −0.68 (−1.31 to −0.05) |
Unselected Other | 30 | 1986 | 72.7 | 2010 | 62.6 | −0.58 | −0.42 (−1.24 to 0.40) |
Fertile Other | 39 | 1978 | 66.4 | 2011 | 75.7 | 0.42 | 0.28 (−0.44 to 1.00) |

Category | N (estimates) | First year | First year TSC (million) | Last year | Last year TSC (million) | Percentage change/year | Slope (95% CI), million/year |
---|---|---|---|---|---|---|---|
All men | 244 | 1973 | 295.7 | 2011 | 212.0 | −0.75 | −2.23 (−2.31 to −2.16) |
Unselected Western | 110 | 1973 | 337.5 | 2011 | 137.5 | −1.58 | −5.33 (−7.56 to −3.11) |
Fertile Western | 65 | 1977 | 277.4 | 2009 | 209.5 | −0.76 | −2.12 (−4.31 to 0.07) |
Unselected Other | 30 | 1986 | 212.4 | 2010 | 167.3 | −0.88 | −1.88 (−4.77 to 1.01) |
Fertile Other | 39 | 1978 | 189.2 | 2011 | 233.2 | 0.70 | 1.33 (−1.20 to 3.86) |

(a) For all men: simple linear regression weighted by sample size. For all other categories: meta-regression model weighted by sperm concentration (SC) SE, adjusted for fertility group, time × fertility group interaction, geographic group, time × geographic group interaction, age, abstinence time, semen collection method reported, counting method reported, having more than one sample per man, indicators for study selection of population and exclusion criteria (some vasectomy candidates, some semen donor candidates, exclusion of men with chronic diseases, exclusion by other reasons not related to fertility, selection by occupation not related to fertility), whether year of collection was estimated, whether arithmetic mean of SC was estimated, whether SE of SC was estimated and indicator variable to denote studies with more than one estimate. Total sperm count (TSC) meta-regression models weighted by TSC SE, adjusted for similar covariates and method used to assess semen volume.
In the final adjusted models for TSC (Table I), which included time × fertility group (P = 0.014) and time × geographic group (P = 0.021), in Western studies a steeper slope in TSC was seen among Unselected (−5.33 million/year, −7.56 to −3.11; P < 0.001) versus Fertile (−2.12, −4.31 to 0.07; P = 0.057) (Table I, Fig. 3b). Using estimates from the fully adjusted model of 337.5 million in 1973 to 137.5 million in 2011, TSC in the Unselected Western group declined 1.6% per year and overall by 59.3% between 1973 and 2011.
No significant trends in SC or TSC were seen in Other countries overall, or for Unselected or Fertile men separately.
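The percentage figures quoted for TSC can be checked against the tabulated values. The short sketch below copies the numbers from Table I and assumes, as an inference from the table rather than a formula stated by the authors, that the annual percentage change is the slope divided by the first-year estimate and that the overall change is the difference between last-year and first-year estimates relative to the first-year estimate. The function names are illustrative.

```python
# Values copied from Table I (TSC, million). The two formulas below are an
# assumption about how the summary percentages relate to the tabulated numbers.
table_i = {
    # category: (first_year, first_tsc, last_year, last_tsc, slope_per_year)
    "All men":            (1973, 295.7, 2011, 212.0, -2.23),
    "Unselected Western": (1973, 337.5, 2011, 137.5, -5.33),
    "Fertile Western":    (1977, 277.4, 2009, 209.5, -2.12),
    "Unselected Other":   (1986, 212.4, 2010, 167.3, -1.88),
    "Fertile Other":      (1978, 189.2, 2011, 233.2, 1.33),
}


def pct_change_per_year(first_tsc, slope):
    """Annual change expressed as a percentage of the first-year estimate."""
    return 100.0 * slope / first_tsc


def overall_pct_change(first_tsc, last_tsc):
    """Total change from first to last year, as a percentage of the first."""
    return 100.0 * (last_tsc - first_tsc) / first_tsc


for name, (y0, tsc0, y1, tsc1, slope) in table_i.items():
    print(f"{name:20s} {pct_change_per_year(tsc0, slope):6.2f} %/year "
          f"{overall_pct_change(tsc0, tsc1):7.1f} % over {y1 - y0} years")
```

Under these assumptions the Unselected Western row gives an overall change of about −59.3%, matching the figure in the text, and the per-year values agree with the table's percentage column to within about 0.01, which is consistent with rounding of the published slopes.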
Sensitivity analyses
We performed multiple analyses to examine the sensitivity of results to assumptions about our model, influence of covariates, estimation of missing data, trends in SEs and study period. For the sake of brevity, results from sensitivity analyses are presented here for the slope of SC in the Unselected Western group. In all sensitivity analyses there was a significant (P < 0.01) and strong (>1.0 million/ml/year) decline for the Unselected Western group.
– Adding a quadratic or cubic function of year to meta-regression models did not substantially change the shape of the trend or improve model fit (as adjusted R-square declined), overall or within any of the geographic or fertility groups (coefficient for the quadratic term: 0.0009; 95% CI: −0.04 to 0.05, P = 0.969; for the cubic term −0.0003; 95% CI: −0.0007 to 0.0007, P = 0.942).
– Results of sensitivity analyses excluding a specific group for each covariate, or removing each covariate at a time from the model are in Supplementary Table S3.
– After excluding nine estimates with a SE of SC > 20 million/ml, the slope for Unselected Western was −1.31 million/ml/year (−1.96 to −0.66; P = 0.001).
– Excluding 85 studies with no data on mean age and adjusting for mean age instead of age group yielded a slope of −1.68 million/ml/year (−2.35 to −1.01; P < 0.001).
– The proportion of smokers was reported in only 25% of studies. To examine this variable, a sensitivity analysis including a covariate for high proportion of smokers (>30%) was performed, and slopes changed only slightly (−1.39 million/ml/year, −2.03 to −0.75; P < 0.001).
– The slope for Unselected Western did not change appreciably after excluding each country/region with more than 10 estimates at a time. Excluding 28 estimates from Australia and New Zealand, the slope for studies of unselected men from North America/Europe was −1.13 million/ml/year (−1.79 to −0.47; P = 0.001). Excluding estimates from the USA (n = 39) or Denmark (n = 19), the slopes were −1.46 million/ml/year (−2.25 to −0.67; P < 0.001) and −1.57 million/ml/year (−2.26 to −0.87; P < 0.001), respectively.
– Restricting the analysis to data from recent years (196 estimates collected post-1985), the slope (−1.57 million/ml/year, −2.51 to −0.62; P = 0.001) was similar to that for the full model. Restricting the analysis to data post-1995 (model restricted to 53 estimates of Unselected Western due to insufficient observations for interaction terms), the slope (−2.06 million/ml/year, −3.38 to −0.74; P = 0.004) was somewhat steeper.
Results for TSC slope were also robust in all sensitivity analyses. Restricting the analysis to data post-1995, the slope (−8.12 million/year, −13.73 to −2.51; P = 0.006) was somewhat steeper.
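The exclusion-based checks listed above follow a common pattern: re-estimate the slope after dropping one group of estimates at a time and compare it with the full-data slope. The following is a minimal sketch of that pattern, not the authors' model. It assumes inverse-variance weights (1/SE²), omits all covariates and interaction terms, and runs on synthetic data generated inside the block; the group labels, the slope of −1.4 and the function names are hypothetical.

```python
import numpy as np


def weighted_slope(year, value, se):
    """Slope of value on year from least squares weighted by 1/SE^2."""
    year = np.asarray(year, dtype=float)
    value = np.asarray(value, dtype=float)
    w = 1.0 / np.asarray(se, dtype=float) ** 2
    # Centre the year for numerical stability; the slope is unaffected.
    X = np.column_stack([np.ones_like(year), year - year.mean()])
    XtW = X.T * w  # weight each observation
    beta = np.linalg.solve(XtW @ X, XtW @ value)  # weighted normal equations
    return float(beta[1])


def leave_one_group_out(year, value, se, group):
    """Re-estimate the slope after dropping each group in turn."""
    year, value, se = (np.asarray(a, dtype=float) for a in (year, value, se))
    group = np.asarray(group)
    slopes = {}
    for g in np.unique(group):
        keep = group != g
        slopes[str(g)] = weighted_slope(year[keep], value[keep], se[keep])
    return slopes


# Synthetic estimates (NOT data from the study): a linear decline plus noise
# whose standard deviation equals each estimate's SE.
rng = np.random.default_rng(0)
n = 120
year = rng.integers(1973, 2012, size=n).astype(float)
se = rng.uniform(2.0, 10.0, size=n)
group = rng.choice(["A", "B", "C", "D"], size=n)
value = 100.0 - 1.4 * (year - 1973) + rng.normal(0.0, se)

print("all estimates:", round(weighted_slope(year, value, se), 2))
for g, slope in leave_one_group_out(year, value, se, group).items():
    print(f"without {g}:", round(slope, 2))
```

If no single group drives the trend, the leave-one-out slopes stay close to the full-data slope, which is the behaviour reported for the country exclusions.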
Discussion
Key findings
In this first systematic review and meta-regression analysis of temporal trends in sperm counts we report a significant overall decline in both SC and TSC in samples collected between 1973 and 2011. Declines were significant only in studies from North America, Europe, Australia (and New Zealand), where they were most pronounced among men unselected by fertility. In this latter group, SC declined 52.4% (−1.4% per year) and TSC 59.3% (−1.6% per year) over the study period. These slopes remained substantially unchanged after controlling for multiple preselected covariates (age, abstinence time, method of semen collection, method of counting sperm, selection of population and study exclusion criteria, number of samples per man and completeness of data) and in multiple sensitivity analyses. Thus, these data provide robust indication for a decline in SC and TSC in North America, Europe, Australia and New Zealand over the last 4 decades. There was no sign of ‘leveling off’ of the decline, when analyses were restricted to studies with sample collection in 1996–2011.
Comparison to previous studies
The overall decline in SC reported here (−0.70 million/ml/year) was consistent with, but not as steep as, the declines previously reported for an earlier period (−0.93 and −0.94 million/ml/year; Carlsen et al., 1992; Swan et al., 1997, 2000) (Table II). The annual percentage change in SC reported here was −0.75%, comparable to −0.83% reported by Carlsen et al. (1992). As in prior analyses (Swan et al., 1997, 2000), we saw no significant declines for studies from South America, Asia and Africa, which may, in part, be accounted for by limited statistical power and an absence of studies in unselected men from these countries prior to 1985. However, we note that the modification of the slope by geographic group was significant. Thus, based on the results presented here, while it is not possible to rule out a trend in non-Western countries, these data do not support a decline as steep as that observed in Western countries. In the current analysis, declines in North America and Europe/Australia were similar, unlike prior analyses which included a higher proportion of studies from North America (Swan et al., 1997, 2000).
Table II Characteristics and results of fitting a simple linear regression model (without adjustment, weighted by sample size) for trends of sperm concentration in the current study, in Carlsen et al. (1992), and in Swan et al. (2000).
Study | Levine et al. (2017, current study) | Carlsen et al. (1992) | Swan et al. (2000) |
---|---|---|---|
Publication years | 1981–2013 | 1938–1990 | 1934–1996 |
Number of studies | 185 (244 estimates) | 61 | 101 |
Number of countries | 50 | 20 | 28 |
Fertility group: N (%) | |||
Fertile | 104 (43%) | 39 (64%) | 51 (50%)a |
Unselected | 140 (57%) | 22 (36%) | 50 (50%) |
Geographic group: N (%) | |||
Westernb | 175 (72%) | 45 (74%) | 78 (77%) |
Other | 69 (28%) | 16 (26%) | 23 (23%) |
Slope | −0.70 | −0.93 | −0.94 |
P-value | <0.001 | <0.001 | <0.001 |
aWife pregnant or post-partum or at least 90% of men with proven fertility.
bWestern includes studies from North America, Europe and Australia (and New Zealand). Other includes studies from all other countries.
Owing to the completeness of our search, our considerable sample size across the entire study period and use of meta-regression methods, this analysis avoids many of the limitations of previous studies. The study of Carlsen et al. (1992), which weighted studies by sample size, was criticized for having one study that included 30% of all subjects and for the paucity of data in the first 30 years of the analysis (Olsen et al., 1995). The largest study in the current meta-regression analysis included only 5% of all subjects, sensitivity analyses demonstrated that no one country drove the overall trend, and studies were well distributed over the 39 years of the study period and among 50 different countries. Furthermore, the meta-regression methods utilized in the current study addressed the issue of heterogeneity in the reliability of study estimates by weighting of estimates by their SE. This conservative method inflates the CI and is appropriate when the number of studies is sufficiently large, as it was in our analysis (Baker and Jackson, 2010). In addition, we adjusted for a pre-determined set of covariates, as well as variables indicating data completeness and study exclusion criteria, thus avoiding the main pitfall in reaching reliable conclusions from meta-regression analyses (Thompson and Higgins, 2002).
Our statistical power enabled us to assess modification by fertility and geographic group. Modification by fertility group is especially important since fertile men represent a selected population, while unselected men are more likely to be representative of the general population.
Some researchers have criticized the use of sperm count estimates from the past, arguing that greater measurement error would be expected in historical studies. This is an unlikely explanation for the trend we report here for several reasons. First, unlike earlier analyses that included studies in which samples were collected as far back as 1931, our analysis includes studies with samples collected only since 1973. Even if measurements were less reliable in the past, this greater imprecision would produce greater uncertainty in earlier studies but not a change in slope. Further, since we weighted estimates by their SE, we avoided this hypothetical limitation. In addition, results were robust in sensitivity analyses that excluded studies in which the SE was estimated or very large.
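The point that greater imprecision in earlier measurements widens uncertainty without shifting the slope can be illustrated with a small simulation. This is a sketch under stated assumptions rather than an analysis of the study data: the true slope, the intercept and the noise levels are hypothetical, the error is purely random with zero mean, and an unweighted least-squares fit is used for simplicity.

```python
import numpy as np

rng = np.random.default_rng(42)

true_slope = -1.4                       # hypothetical, million/ml/year
years = np.arange(1973, 2012, dtype=float)
t = years - years[0]
# Hypothetical measurement noise: SD shrinks from 30 in the first year to 5
# in the last, i.e. older measurements are much less precise.
noise_sd = np.linspace(30.0, 5.0, t.size)

slopes = []
for _ in range(2000):
    y = 100.0 + true_slope * t + rng.normal(0.0, noise_sd)
    slope, _intercept = np.polyfit(t, y, 1)  # degree-1 fit: [slope, intercept]
    slopes.append(slope)
slopes = np.array(slopes)

print("true slope:            ", true_slope)
print("mean estimated slope:  ", slopes.mean())
print("SD of estimated slopes:", slopes.std())
```

Across repetitions the estimated slopes scatter around the true value: zero-mean noise, however unevenly distributed over time, affects the precision of the slope but not its expected value. A systematic shift in measurement over time would be a different matter and is not what this sketch models.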
Chance is an unlikely explanation for our results, which were significant even in the more conservative meta-regression models. We used written protocols and extensive quality control procedures to minimize potential information and selection bias in all steps of the study.
Limitations
There are several possible limitations to this systematic review and meta-regression analysis. It is possible that failure to include non-English publications may have limited our analyses of non-Western countries. It has been claimed that men who are willing to provide a semen sample may differ from the rest of the population, leading to potential selection bias, but current evidence does not support this claim (Cooper et al., 2010).
We analyzed sperm counts (both by SC and TSC) but not sperm motility and morphology because information regarding motility and morphology was seldom available in older studies. Moreover, the recommended methods and criteria for motility and morphology assessments have changed significantly over time, making across-time comparisons difficult. In contrast, the assessment of SC by hemocytometer, first described in 1902 (Benedict, 1902), has been the method recommended by the World Health Organization since 1980 (World Health Organization, 2010), and there is no evidence that this method has varied systematically over time. For these reasons SC is considered to be the most reliable endpoint for epidemiological analysis (Le Moal et al., 2016). Because of this stability and the variability of other counting methods over time, we only included studies in which counting was done (or likely done) by hemocytometer and excluded studies that used alternative counting chambers (e.g. Makler, Coulter and Microcell) or non-manual methods (i.e. computer-assisted sperm analysis or flow cytometry). Even though we followed a detailed protocol, this study was not preregistered in PROSPERO.
Analysing trends by birth cohorts instead of year of sample collection may aid in assessing the causes of the decline (prenatal or in adult life) but was not feasible owing to lack of information.
Wider implications
This rigorous and comprehensive analysis finds that SC declined 52.4% between 1973 and 2011 among unselected men from Western countries, with no evidence of a ‘leveling off’ in recent years. Declining mean SC implies that an increasing proportion of men have sperm counts below any given threshold for sub-fertility or infertility. The high proportion of men from Western countries with concentration below 40 million/ml is particularly concerning given the evidence that SC below this threshold is associated with a decreased monthly probability of conception (Bonde et al., 1998).
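The link between a falling mean and a growing share of men below a fixed threshold can be made concrete with a toy calculation. The sketch below is illustrative only: it assumes that individual SC follows a log-normal distribution whose spread on the log scale stays constant, and it uses a hypothetical starting mean of 100 million/ml and a hypothetical log-scale SD of 0.8. Only the 52.4% decline and the 40 million/ml threshold are taken from the text.

```python
import math


def share_below(threshold, mean, sigma_log):
    """P(X < threshold) for a log-normal X with the given arithmetic mean.

    sigma_log is the standard deviation of ln(X); the log-scale location is
    derived from the arithmetic mean as mu = ln(mean) - sigma_log**2 / 2.
    """
    mu = math.log(mean) - 0.5 * sigma_log ** 2
    z = (math.log(threshold) - mu) / sigma_log
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF


threshold = 40.0                        # million/ml, threshold cited in the text
start_mean = 100.0                      # hypothetical starting mean, million/ml
end_mean = start_mean * (1.0 - 0.524)   # after the reported 52.4% decline
sigma_log = 0.8                         # hypothetical spread on the log scale

for label, mean in [("start", start_mean), ("end", end_mean)]:
    share = share_below(threshold, mean, sigma_log)
    print(f"{label}: mean {mean:.1f} million/ml, "
          f"share below {threshold:.0f}: {share:.1%}")
```

Whatever values are chosen for the hypothetical parameters, lowering the mean while holding the spread fixed increases the share below the threshold; the size of the increase depends on those assumptions.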
Declines in sperm count have implications beyond fertility and reproduction. The decline we report here is consistent with reported trends in other male reproductive health indicators, such as testicular germ cell tumors, cryptorchidism, onset of male puberty and total testosterone levels (Skakkebaek et al., 2016). The public health implications are even wider. Recent studies have shown that poor sperm count is associated with overall morbidity and mortality (Jensen et al., 2009; Eisenberg et al., 2014b, 2016; Latif et al., 2017). While the current study is not designed to provide direct information on the causes of the observed declines, sperm count has been plausibly associated with multiple environmental and lifestyle influences, both prenatally and in adult life. In particular, endocrine disruption from chemical exposures or maternal smoking during critical windows of male reproductive development may play a role in prenatal life, while lifestyle changes and exposure to pesticides may play a role in adult life. Thus, a decline in sperm count might be considered as a ‘canary in the coal mine’ for male health across the lifespan. Our report of a continuing and robust decline should, therefore, trigger research into its causes, aiming for prevention.
Conclusion
In this comprehensive meta-analysis, sperm counts whether measured by SC or TSC declined significantly among men from North America, Europe and Australia during 1973–2011, with a 50–60% decline among men unselected by fertility, with no evidence of a ‘leveling off’ in recent years. These findings strongly suggest a significant decline in male reproductive health, which has serious implications beyond fertility concerns. Research on causes and implications of this decline is urgently needed.
Supplementary data
Supplementary data are available at Human Reproduction Update online.
Acknowledgments
We thank Charles Poole, ScD, University of North Carolina School of Public Health, Chapel Hill, USA, for his contribution to protocol development and meta-regression analysis planning. We thank John D. Meyer, MD, MPH, Icahn School of Medicine at Mount Sinai, USA, for his contribution to abstract screening. We thank Haim Ricas, MSc, Tashtit Scientific Consultants, Israel, for his assistance with preparation of figures.
Authors' roles
H.L. had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: H.L. and S.H.S. Search strategy design and execution: H.L., R.P. Acquisition, analysis, or interpretation of data: H.L., N.J., A.M.A., J.M., D.W.D., I.M., R.P. and S.H.S. Drafting of the article: H.L. and S.H.S. Critical revision of the article for important intellectual content: H.L., N.J., A.M.A., J.M., D.W.D., I.M., R.P. and S.H.S. Statistical analysis: H.L. Administrative, technical or material support: H.L., R.P. and S.H.S. Study supervision: H.L. and S.H.S.
Funding
Environment and Health Fund (EHF), Jerusalem, Israel, and supplementary support from American Healthcare Professionals and Friends for Medicine in Israel (APF) and Israel Medical Association (IMA) to H.L. Research Fund of Rigshospitalet (Grant no. R42-A1326) to N.J. The Brazilian National Council for Scientific and Technological Development (CNPq 249184/2013-3) to A.M.A. The Mount Sinai Transdisciplinary Center on Early Environmental Exposures (NIH P30ES023515) to S.H.S. The EHF, APF, IMA, Research Fund of Rigshospitalet, CNPq and The Mount Sinai Transdisciplinary Center had no role in the design and conduct of the study; collection, management, analysis and interpretation of the data; preparation, review or approval of the article; and decision to submit the article for publication.
Conflict of interest
All authors declare no conflict of interest.