Excess mortality during the COVID-19 outbreak in Italy: a two-stage interrupted time-series analysis

Abstract Background Italy was the first country outside China to experience the impact of the COVID-19 pandemic, which resulted in a significant health burden. This study presents an analysis of the excess mortality across the 107 Italian provinces, stratified by sex, age group and period of the outbreak. Methods The analysis was performed using a two-stage interrupted time-series design using daily mortality data for the period January 2015–May 2020. In the first stage, we performed province-level quasi-Poisson regression models, with smooth functions to define a baseline risk while accounting for trends and weather conditions and to flexibly estimate the variation in excess risk during the outbreak. Estimates were pooled in the second stage using a mixed-effects multivariate meta-analysis. Results In the period 15 February–15 May 2020, we estimated an excess of 47 490 [95% empirical confidence intervals (eCIs): 43 984 to 50 362] deaths in Italy, corresponding to an increase of 29.5% (95% eCI: 26.8 to 31.9%) from the expected mortality. The analysis indicates a strong geographical pattern, with the majority of excess deaths occurring in northern regions, where few provinces experienced increases up to 800% during the peak in late March. There were differences by sex, age and area both in the overall impact and in its temporal distribution. Conclusion This study offers a detailed picture of excess mortality during the first months of the COVID-19 pandemic in Italy. The strong geographical and temporal patterns can be related to the implementation of lockdown policies and multiple direct and indirect pathways in mortality risk.


Introduction
The outbreak of the SARS-CoV-2 coronavirus originated in Wuhan, China, in December 2019 and quickly affected most countries worldwide in the first months of 2020. The infection causes COVID-19 disease-a severe acute respiratory syndrome that can lead to hospitalization and eventually death. 1,2 As of 6 July 2020, almost 11.5 million cases of COVID-19 have been reported worldwide, resulting in 535 027 official deaths. 3 Italy has been one of the worst-affected countries in the first months of the pandemic. The first case of SARS-CoV-2 infection not originating in China was reported on 20 February in the province of Lodi and the disease quickly spread across the north of the country. 4 The number of reported cases with a validated diagnostic test rose to 1128 by the end of the month and peaked on 21 March with 6557 cases per day. 4 A series of containment policies, both at national and local levels, have been implemented since the start of the outbreak. The Italian government declared the quarantine of 11 municipalities in Northern Italy on 21 February, which was then extended to the entire region of Lombardy on 8 March and finally to the whole country the next day. Strict lockdown measures were implemented thereafter, including social distancing, travel restrictions, closure of all non-essential businesses and industries and stay-at-home orders. 5 Most of the strictest measures were not released until early May.
The pandemic put enormous pressure on the health system and it was particularly severe in Lombardy and a few other selected provinces in other northern regions, where hospitals were overwhelmed. Emergency and intensivecare units focused almost exclusively on COVID-19 patients, with admissions for other causes substantially reduced, whereas general-practice and outpatients activities were in many cases halted to prevent the transmission of the virus. 6,7 The first fatality officially attributed to COVID-19 in Italy occurred in Veneto on 21 February. The number of deaths then started rising and, by 15 May, 31 610 deaths related to COVID-19 were reported. 4 Previous assessment on the health impact focused on excess in all-cause mortality 8-10 -an indicator that offers a quantification of the overall burden of the disease, accounting for both direct and indirect effects. 11 However, these analyses were limited to urban areas or used data aggregated by region, and they adopted relatively simple designs based on pre-post comparisons that do not account for temporal changes in risk. These can be due to seasonality or long-term trends in mortality and to yearly variation in risk factors such as weather. In particular, the mild winter of 2020 could have resulted in a lower mortality burden attributed to cold, 12 therefore biasing downwards the quantification of excess mortality.
In this contribution, we performed an analysis of the excess mortality during the first months of the SARS-CoV-2 coronavirus outbreak across the 107 Italian provinces, to an increase of 29.5% (95% eCI: 26.8 to 31.9%) from the expected mortality.
• There was a strong geographical pattern, with 71.0% of the estimated excess deaths occurring in just three northern regions (Lombardy, Veneto and Emilia-Romagna) and few provinces showing increases in mortality up to 800% during the peak of the pandemic.
• The impact was slightly higher in men compared with women, with 24 655 and 23 125 excess deaths, respectively.
The risk varied by age, with the highest excess mortality in the group of people aged 70-79 years old and with a lower but measurable risk evident in those aged <60.
• The analysis by week suggests differential trends, with more delayed impacts in women and the elderly, and with the risk limited to a shorter period (March to mid-April) in Central and Southern Italy.
• These differences are likely related to the implementation of lockdown policies and to contributions from differential risk pathways, including deaths directly related to COVID-19 and indirect mortality due to other factors such as disruptions in the healthcare system. stratified by sex, age group and period of the outbreak. The assessment is based on an official mortality data set released by the Italian Institute of Statistics (ISTAT) and it benefits from the application of a novel two-stage interrupted time-series design and flexible statistical methods to control for trends and variations risk factors. This study can therefore offer a comprehensive overview of the mortality impact of the COVID-19 pandemic in Italy.

Data
The analysis is based on the database released by ISTAT on 18 June 2020, 13 which provides the number of daily deaths stratified by sex and age groups for 7904 municipalities in 107 provinces and 20 regions of Italy within the period 1 January 2015-15 May 2020 ( Table 1). The database includes complete daily counts for 2015-2019 and early-release data for the 4.5 months of 2020 for 7270 municipalities (92% of the total) for which linkage with death registration was up to date. The mortality counts were aggregated by province in sex-and age-specific daily time series, and linked with daily mean-temperature data at the population-weighted centroid of the province from the ERA-5 reanalysis data set on the Copernicus climate data store, 14 and weekly incidence of influenza cases at the regional level determined by laboratory tests. 15

Two-stage analysis
The statistical analysis was performed by applying a twostage interrupted time-series model to quantify the timevarying excess risk for mortality during the COVID-19 outbreak compared with the pre-outbreak period, accounting for temporal trends and variation in other risk factors. In the first stage, at the province level, we applied a quasi-Poisson time-series regression model with smooth spline functions for time variables and observed predictors. 16 Specifically, we included a linear term for time to model long-term trends, a cyclic cubic B-spline with three equally spaced knots for the day of the year to model seasonality, and indicators for day of the week to take into account weekly variations in mortality. The excess risk in mortality during the COVID-19 outbreak was defined through a constrained quadratic B-spline with four equally spaced knots for the days from 1 February 2020 to 15 May 2020 (outbreak period). This function constrains the excess risk to start from null at the beginning of February and then allows it to vary flexibly until the end of the study period. Potential differences in the underlying mortality risk due to non-optimal weather between the pre-outbreak and The coverage represents the percentage of municipalities by region with data available for the year 2020. outbreak periods were controlled by including a term for the mean daily temperature. This was defined through a distributed lag non-linear model over 0-21 lag days, 17 using a cross-basis parameterization defined in previous studies. 12 It was decided not to control for influenza in the main model, as laboratory testing could be affected by the pandemic, thus resulting in an artificial reduction in flu incidence. This choice was tested in a sensitivity analysis, in which we controlled for the effect of influenza using a 2-week moving average of the reconstructed daily incidence. The province-specific coefficients of the function defining the excess risk were then pooled in the second stage using a mixed-effects multivariate meta-analysis 18 and the best linear unbiased predictions for each of the 107 provinces were extracted. This approach allows borrowing strengths in the estimates across provinces, while at the same time modelling flexible associations of multi-parameter functions. 19 Quantification of excess mortality The model was performed on the province-specific aggregated series composed by the 7270 municipalities with full data and the estimates were used to compute the relative risk (RR) representing the excess in each province for each day of the outbreak period. We then reconstructed the complete mortality series for all the municipalities by using the data in the period 2015-2019 to compute, in each province, the proportion of deaths occurring in municipalities with full data and then applying the same proportion to the 4 months of 2020. The daily number of excess deaths was computed as (RR -1)/RR*n, where n is the daily number of deaths, and aggregated by week during the outbreak period (from 1 February) and then in total for the period with reported COVID-19 mortality (from 15 February). The excess number and fraction of deaths were then computed for each province, region, area (North, Central, South and Islands) and for the whole of Italy. We computed empirical confidence intervals (eCIs) at 95% by Monte Carlo simulation of the coefficients of the quadratic B-spline modelling the excess risk, assuming a multivariate normal distribution using the point estimates and (co)variance matrix. Sex-and age-specific models were run separately to compute the excess across subgroups of the population.
Information on repositories, software and code for accessing the data and replicating the analysis are provided in the 'Data Availability' section below.

Results
In the 3 months from 15 February to 15 May 2020, 208 320 deaths were registered in Italy, with an estimated increase of 47 490 (95% eCI: 43 984 to 50 362) from the expected baseline number when accounting for temporal trends and differences in temperature distribution (Table 2).  This figure corresponds to a percentage excess of 29.5% (95% eCI: 26.8 to 31.9%). The excess shows a strong geographical pattern, with an extremely high rise in the number of deaths in the northern regions. Lombardy alone experienced a staggering 25 782 excess deaths and, together with two other northern regions (Veneto and Emilia-Romagna), accounts for 71.0% of all the excess mortality estimated in Italy. The maps in Figure 1 provide a better representation of the geographical distribution, showing a percentage increase in deaths that reaches almost 400% in a few selected northern provinces. The excess mortality is comparatively milder in Central Italy and considerably smaller or absent in the South and the Islands, although with a patchy distribution and isolated spots such as in the provinces of Pesaro-Urbino (East coast) and Crotone (the 'bottom of the foot' in the South). The increase in mortality is slightly higher in men compared with women, with a total of 24 655 and 23 125, corresponding to percentage increases of 32.1% (95% eCI: 28.6 to 34.8%) and 27.7% (95% eCI: 24.5 to 30.0%), respectively ( Table 2). Figure 2 reports the results stratified by age and sex, indicating a lower excess mortality in women in the age groups between 60 and 89 years old but almost identical to males in younger and older people. Excess mortality was substantially lower in people <60 years old, although this group still shows a percentage excess of 10.2% (95% eCI: -3.5 to 23.2%) in the outbreak period. The maps in Figure 1 suggest a similar geographical distribution of the excess deaths across sex and age groups. Figure 3 shows the trend in RR of mortality during the SARS-CoV-2 outbreak period (1 February-15 May) by sex, age group and area. The graphs suggest that the risk started increasing above the baseline at the beginning of March and that the peak was reached at the end of the same month, almost 20 days after the implementation of lockdown measures. Given the uneven geographical distribution of the risk, with the densely populated provinces of the North mostly affected, this increased risks translates into a larger total excess in mortality, which reached 78.7% (95% eCI: 70.6 to 85.5%) above the baseline in the week of 18-24 March (figures reported in the Supplementary Data, available as Supplementary data at IJE online). The risk then decreased during April, with evidence of an excess mortality until at least the first week of May. The temporal distribution of risk is similar across sex and age groups (top and middle panels), with an indication of a slight delayed peak for women and very elderly (90 years old and older). The RR for the <60 age group reaches a lower peak just above 1.2 and it returns to the baseline level by mid-April. Consistently with what has been discussed above, the analysis by geographical area (bottom panel) suggests a very high mortality risk in the North that across the region is more than twice at the peak (RR above 2.0), whereas other areas show milder increases and earlier peaks. Some provinces showed staggering increases, with the percentage excess in Bergamo reaching 858.7% (95% eCI: 771.9 to 969.5%) in the week of 18-24 March (see Supplementary Data, available as Supplementary data at IJE online). There is some indication of an RR <1 in the late stage of the epidemics in Central and Southern Italy, which can suggest a 'harvesting effect', whereby some of the deaths are only anticipated by a few weeks. 20 This phenomenon can be explored when data for a longer time period become available.
Controlling for the additional effect of influenza has little impact, with an estimated excess of 48 569 compared with 47 490 in the main model.
The full set of results, including the number and fraction of excess deaths (with 95% eCI) by geographical aggregation (provinces, region and full country), sex, age groups and period (15 February-15 May 2020, and then by week starting from 1 February), is provided in the Supplementary Data, available as Supplementary data at IJE online.

Conclusion
This study offers a detailed assessment of the mortality for all causes in Italy during the COVID-19 pandemic wave in the first months of 2020. The estimate of an excess of 47 490 deaths is far higher than the number of 31 610 deaths officially recorded as related to COVID-19 in the same months. The increased mortality risk shows a strong geographical pattern, with a few northern provinces carrying the majority of excess deaths. Stratified analyses indicate differences in the impact by sex and age, whereas the analysis of the trend shows that excess mortality peaked at the end of March. The most striking result of the analysis is the strong north-to-south geographical gradient in the impact of COVID-19 in Italy, consistent with previous studies from Italy. [8][9][10] The vast majority of excess deaths occurred in selected northern provinces. However, differently from what was previously reported, 10 this study is able to identify significant increases in many provinces and regions of the South as well. The reasons for such North-South differences are still debated. A role was likely being played by the timing and characteristics of the SARS-CoV-2 outbreak, which started in the provinces of the North with established trade connections with China and was amplified by the higher population density and internal travel. 21 The implementation of strict lockdown policies, including the restriction in interregional movements, has likely contributed to reducing the spread of the infection and the consequent excess mortality in other regions, particularly in the South. This aspect can also explain the patchy map of risk across the country, where hotspots of risk can be the result of the influx of infections from high-prevalence provinces of the North before strict lockdown policies where implemented. Other suggested reasons for the geographical distribution of excess mortality are the occurrence of massgathering events that boosted local numbers of infections 22 and regional differences in health-service organization, socio-economic determinants and urbanization. [23][24][25] The assessment of results of the stratified analysis offers interesting clues on sex and age differentials. Whereas this study confirms the higher risk in men, the difference is comparatively small, with a percentage increase of 32.1% vs 27.7% in women. The analysis by age in Figure 2 reveals that the highest excess occurred among the 70-79 year-olds, but was lower in the younger and older groups. The risk is even lower but still present in the rest of the population, with evidence of excess deaths even in people <60 years old. The graph in Figure 3 shows instead some differences in the timing of the excess mortality across sex, age groups and area. The more delayed peak in women and the very elderly can be indicative of differential risk pathways, for instance related to the reduction in access to emergency services during the lockdown or the delayed surge of infections in nursing care homes. The early decrease and then below-average mortality in Central and Southern Italy can instead imply that some of the excess is due to anticipation of death in frail individuals. These suggestive results can form the basis of future analyses.
The figures reported in this study can be compared with those previously published in peer-reviewed articles, although the comparison is made difficult by the different study population, study period, data-collection procedures and analytical methods. Michelozzi and colleagues analysed data collected within a surveillance system from 19 Italian cities, reporting a total excess mortality of 45% from the beginning of March until 18 April 2020. 9 The increase was higher in northern cities, in men and in the elderly, although with estimates that are somewhat different from the present analysis, in part probably due to the focus on urban areas and the different period. Magnani and colleagues used a previous release of the data by ISTAT on 4433 municipalities covering the whole Italian territory. They extrapolated the excess to the total Italian population, reporting an expected excess of 44 352.5 deaths in the group of people >60 years old and 680.4 deaths in younger people, between 1 March and 15 April. 10 They, however, reported negative estimates in several regions, in particular for the younger group. These differences can be due to the simple pre-post comparison adopted in these studies, which does not account for temporal trends and for the milder weather conditions in 2020.
This study benefits from the application of an advanced two-stage design and statistical models that allow a flexible estimation of the excess mortality risk and a definition of the baseline risk that considers temporal trends and variations in known risk factors. The analysis is performed at the province level and stratified by sex, age and week periods, thus offering a fine characterization of the impact of the COVID-19 pandemic in Italy. Data and code are available, making the analysis entirely reproducible, and the full set of results is shown in the appendix and through a web application. Some limitations must, however, be acknowledged. First, the low number of cases prevents the full application of the two-stage modelling in age groups <50 years old and therefore the ability to address the question about to what extent COVID-19 affects young people. Furthermore, the interrupted time-series design defines a simple pre-post comparison that cannot provide information about the various mechanisms leading to increased risks. The question about controlling or not for the incidence of seasonal influenza is emblematic with respect to this problem. Similarly, whereas the analysis of all-cause mortality can provide a good estimate of the overall burden, it offers no information about the various mechanisms, patho-physiological or of different nature, leading to increased deaths. Finally, the analysis is based on an early-release data set from ISTAT that has not been consolidated and this may lead to an underestimation of the number of deaths that occurred in 2020 and therefore of the excess during the pandemic. These issues can be addressed when additional data and information on the COVID-19 pandemic are released.
In conclusion, this study offers a detailed picture of excess mortality during the first months of the COVID-19 pandemic in Italy. The analysis indicates that the excess mortality during the outbreak has been far higher than the number of deaths officially reported as related to COVID-19, with strong geographical differences and patterns of risk that vary by sex, age and period. The figures can be used in future research to explain differential risks across regions and subgroups, and the role of national or local policies implemented during the period.

Supplementary data
Supplementary data are available at IJE online.

Data availability
The data, software and code for replicating the analysis and full set of results are made available. The analysis was performed in the R software environment. The scripts for downloading the data and for fully replicating the analysis and results are available in a GitHub repository (https://github.com/gasparrini/ItalyCOVIDdeath). The data (released on 18 June 2020 with updated mortality up to 15 May) were extracted from the ISTAT website (https://www.istat.it/it/archi vio/240401) and stored in the repository. A web application has been made available at https://mscortichini.shinyapps.io/app20200703/ to visualize maps of the excess risk and extract the full set of results, which are also available in the Supplementary Data, available as Supplementary data at IJE online, of this article.