Association between work sick-leave absenteeism and SARS-CoV-2 notifications in the Netherlands during the COVID-19 epidemic

Abstract

Background: Alternative data sources for surveillance have gained importance in maintaining coronavirus disease 2019 (COVID-19) situational awareness as nationwide testing has drastically decreased. Therefore, we explored whether rates of sick-leave from work are associated with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) notification trends and at which lag, to indicate the usefulness of sick-leave data for COVID-19 surveillance.

Methods: We explored trends during the COVID-19 epidemic of weekly sick-leave rates and SARS-CoV-2 notification rates from 1 June 2020 to 10 April 2022. Separate time series were inspected visually. Then, Spearman correlation coefficients were calculated at different lag and lead times of zero to four weeks between sick-leave and SARS-CoV-2 notification rates. We distinguished between four SARS-CoV-2 variant periods, two labour sectors and overall, and all-cause sick-leave versus COVID-19-specific sick-leave.

Results: The correlation coefficients between weekly all-cause sick-leave and SARS-CoV-2 notification rate at optimal lags were between 0.58 and 0.93, varying by the variant period and sector (overall: 0.83, lag −1; 95% CI [0.76, 0.88]). COVID-19-specific sick-leave correlations were higher than all-cause sick-leave correlations. Correlations were slightly lower in healthcare and education than overall. The highest correlations were mostly at lag −2 and −1 for all-cause sick-leave, meaning that sick-leave preceded SARS-CoV-2 notifications. Correlations were highest mostly at lag zero for COVID-19-specific sick-leave (coinciding with SARS-CoV-2 notifications).

Conclusion: All-cause sick-leave might offer an earlier indication and evolution of trends in SARS-CoV-2 rates, especially when testing is less available. Sick-leave data may complement COVID-19 and other infectious disease surveillance systems as a syndromic data source.


Extensive monitoring of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) activity was in place in the Netherlands from the moment the first infection was detected on 27 February 2020. This monitoring system included traditional epidemiological counts such as SARS-CoV-2 positive laboratory diagnoses, hospitalizations and ICU admissions, and less traditional markers such as virus particle concentration in sewage water and community self-reported symptoms by web application ('Infectieradar'). 1,2 Large-scale testing was mostly done free of charge at Public Health Services (PHS) coronavirus disease 2019 (COVID-19) test facilities, in persons with symptoms (as of 1 June 2020), those who had been in contact with a SARS-CoV-2 positive person irrespective of symptoms (as of 1 December 2020), and after a positive self-test (as of 3 February 2021). Over time the willingness to test at an official testing location decreased, 3 which reduces the value of the data for surveillance and likely biases it towards certain risk groups. Since 10 April 2022, confirmation of a positive self-test was no longer required and infection notifications with detailed additional information were replaced by notifications with basic information (sex, date of birth, zip code and date of test result). Due to the reduction in information per notification and the termination of mandatory reporting as of 1 July 2023, non-traditional surveillance methods will become more prominent as a marker for spread of mild disease in the general population.
A possibly viable data source is work absenteeism, specifically sick-leave. Advantages of using sick-leave data for surveillance could be timeliness, availability, and versatility, as the data can be used to monitor many infectious diseases. COVID-19 surveillance was relatively timely based on detected infections in the community. Given that the notification of SARS-CoV-2 infections is no longer mandatory, it might take longer to detect changes in SARS-CoV-2 activity, for example through the monitoring of hospitalizations. Hospitalizations represent severe cases and are monitored, but their notification is not mandatory. Such hospital admissions lag behind symptom onset. 4 Sick-leave data are collected for business purposes, and this registration will be maintained even when SARS-CoV-2 testing decreases. [11][12][13][14] Sick-leave data also provide information not previously used in infectious disease surveillance in the Netherlands. In addition to its potential to reflect disease trends during an epidemic and to increase situational awareness, sick-leave trends could be used to estimate disease trends at the start of an epidemic, when laboratory testing for a new pathogen is not yet available.
To date, few studies have been published on the use of sick-leave data as a surveillance tool for infectious diseases. To understand the trend and timing of sick-leave relative to SARS-CoV-2 trends in the Netherlands, we compared weekly sick-leave to SARS-CoV-2 notification rates during the COVID-19 epidemic.

Design and setting
For this descriptive, ecological analysis, we used sick-leave data and notifications of laboratory-confirmed SARS-CoV-2 infections from the national COVID-19 surveillance database in the period of 1 June 2020 to 10 April 2022. The first months of the COVID-19 epidemic, March-May 2020, were excluded as laboratory testing capacity was limited.

Sick-leave data
Anonymized sick-leave data were made available by Human Total Care (HTC), a nationwide Dutch occupational health service. 15 HTC provides guidance and support to both employees and their employers during sick-leave and in the return to work. For these services, the employers report the sick-leaves of their employees to HTC. HTC does not receive data from their contracted employers on non-illness absenteeism of employees, such as holidays or care leave. Depending on the employer contract, some employees receive an additional triage questionnaire at the start of the sick-leave. In this questionnaire the reason for sick-leave is reported by the employee (Supplementary file S1). During the COVID-19 epidemic additional COVID-19 triage questions were added to this questionnaire. HTC covers approximately 11% of the Dutch working population as defined by Statistics Netherlands. 15 We used the date of the first day of sick-leave (without illness duration), work sector and absence cause. We categorized healthcare and education employees according to the Dutch Standard Industrial Classification. 16 Healthcare was of special interest due to the additional pressure of sick-leave on the sector, and education because of the societal impact of class and school closures that would result from sick-leave. All-cause sick-leave (AC-sick-leave) was defined as all sick-leave, while COVID-19-specific sick-leave (CS-sick-leave) was defined as reports with a direct or indirect laboratory-confirmed SARS-CoV-2 infection as the sick-leave reason (questionnaire subset). Weekly sick-leave rates per 100 000 persons were calculated by dividing counts by the coverage data (i.e. the total number of employees of the employers contracting HTC services).
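The rate calculation described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `weekly_rate_per_100k` and all numbers are hypothetical, and the coverage is assumed to be known for each week.

```python
def weekly_rate_per_100k(count, population):
    """Return a weekly rate per 100 000 persons.

    count: number of sick-leave reports starting in the week.
    population: number of covered employees in that week (assumed known).
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population * 100_000


# Hypothetical weekly counts and coverage, for illustration only.
weekly_counts = [1200, 1500, 900]
coverage = [800_000, 800_000, 810_000]

rates = [weekly_rate_per_100k(c, n) for c, n in zip(weekly_counts, coverage)]
```

The same function applies to the notification series when the labour force size is used as the denominator.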

COVID-19 surveillance data
SARS-CoV-2 notifications were defined as laboratory-confirmed positive SARS-CoV-2 tests. Tests were performed by (non-)commercial test facilities or healthcare facilities. Mandatory reporting of these infections was done to the regional PHS in the Netherlands and reported via OSIRIS (Online System for Infectious disease Reporting) to the National Institute of Public Health and the Environment (RIVM). 1 We based SARS-CoV-2 notifications on the date of the positive test result and selected persons aged 18-66 to best match the working population, as 66 is the retirement age in the Netherlands. 17 Persons reporting unemployment were excluded. Reports with unknown employment were included as they constitute a mix of employed and unemployed persons with missing data (less detailed employment registrations with time; Supplementary file S2) and groups such as students. Because of the aforementioned societal interest in the healthcare and education sectors, SARS-CoV-2 infected people were specifically asked whether they were employed in either profession. Consequently, availability of employment data was high for these sectors in OSIRIS. Persons working in '(health)care in hospital', '(health)care in nursing home/residential care facility for the elderly', '(health)care in other institution with 24-hour care', 'home (health)care' or 'other (health)care' were classified as healthcare workers. Those working in 'day-care', 'primary school or after-school care', 'secondary education', 'vocational college' and 'higher education' were classified as education workers. Weekly SARS-CoV-2 notification rates per 100 000 persons were calculated using quarterly labour force size data from Statistics Netherlands. 18

Data exploration
Data were explored using descriptive statistics and visual inspection of time series graphs; overall and separately for healthcare and education sectors.
To derive plausible correlations around the Christmas holidays in these retrospective analyses, given potentially different lags per period, we replaced the observed sick-leave rates during Christmas holidays with the mean of the sick-leave rates in the one to two weeks prior to and after the holiday for all categories of sick-leave (so as not to dilute correlations). Employees who are on holiday are less likely to report sick-leave, but are still fully counted in the coverage data (personal communication HTC). This type of interpolation was also previously done by others. 19 Additionally, for education, this was also done for all other official one- or two-week school holidays. 20 This was not done for the six-week summer holiday, as we assumed these interpolations to be less reliable for longer time periods. For one-week holidays, we interpolated the sick-leave rates as the mean sick-leave rates of the week before and after the holiday week. For two-week holidays, the first week was interpolated using the mean of the two weeks before and one week after the holiday; the second week was interpolated using the mean of one week before and two weeks after the holiday. The middle week of the extended three-week Christmas holiday in 2021 was interpolated using the mean of the single week directly before and after the holiday.
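The one- and two-week interpolation rules above can be sketched as below. This is an illustrative reading rather than the authors' implementation: it assumes that "the mean of the two weeks before and one week after" is a pooled mean over those three weekly rates, the function name `interpolate_holiday` is hypothetical, the example rates are made up, and the three-week Christmas holiday of 2021 is deliberately not handled.

```python
from statistics import mean


def interpolate_holiday(rates, start, length):
    """Return a copy of `rates` with the holiday weeks replaced.

    rates: weekly rates in chronological order.
    start: index of the first holiday week.
    length: holiday length in weeks (1 or 2).
    """
    rates = list(rates)
    if length == 1:
        windows = [(1, 1)]          # one week before, one week after
    elif length == 2:
        windows = [(2, 1), (1, 2)]  # first week, then second week
    else:
        raise ValueError("only one- and two-week holidays are handled")

    end = start + length - 1        # index of the last holiday week
    out = list(rates)
    for offset, (n_before, n_after) in enumerate(windows):
        if start - n_before < 0 or end + n_after >= len(rates):
            raise ValueError("not enough observed weeks around the holiday")
        before = rates[start - n_before:start]
        after = rates[end + 1:end + 1 + n_after]
        # Pooled mean of the observed non-holiday weeks (an assumption).
        out[start + offset] = mean(before + after)
    return out


# Hypothetical rates; weeks 2 and 3 are a two-week holiday.
two_week = interpolate_holiday([100, 110, 40, 50, 130, 120], start=2, length=2)
# Hypothetical rates; week 1 is a one-week holiday.
one_week = interpolate_holiday([100, 20, 120], start=1, length=1)
```

The values are always taken from the observed series, not from already interpolated weeks, so the order in which holiday weeks are processed does not matter.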

Statistical analysis
We calculated Spearman's rank correlation coefficients between the weekly sick-leave rates by first date of absence and SARS-CoV-2 notification rates by date of positive test result. To gain insight into the timing of the two time series relative to each other, we also calculated the correlation coefficients at different time lags (zero to four weeks). Positive lags indicate that sick-leave occurred after SARS-CoV-2 notifications, while negative lags indicate that sick-leave preceded SARS-CoV-2 notifications. We categorized correlation coefficients by four time periods based on dominant SARS-CoV-2 variants, reflecting differences in infectiousness, disease severity and policy changes. These periods were defined in line with periods previously defined by the RIVM, which considered variant dominance, hospital admissions and evolving factors like policy changes and vaccination rates. 21
Wildtype: 01 June 2020 until 31 June 2021
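The lagged correlation described in this section can be sketched as follows, using only the Python standard library. This is an illustration under stated assumptions, not the study's code: all function names are hypothetical, the example series is made up, ties receive average ranks, and confidence intervals are not computed. The sign convention follows the text: at lag −1, sick-leave in week t−1 is paired with notifications in week t.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rank correlation as Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def lagged_spearman(sick, notif, lag):
    """Correlate sick[t + lag] with notif[t]; lag < 0 means sick-leave leads."""
    n = len(sick)
    pairs = [(sick[t + lag], notif[t]) for t in range(n) if 0 <= t + lag < n]
    xs, ys = zip(*pairs)
    return spearman(xs, ys)


def optimal_lag(sick, notif, max_lag=4):
    """Return (lag, coefficient) for the highest correlation in -max_lag..max_lag."""
    results = {lag: lagged_spearman(sick, notif, lag)
               for lag in range(-max_lag, max_lag + 1)}
    best = max(results, key=results.get)
    return best, results[best]


# Made-up weekly rates: sick-leave is the notification series one week early.
notif = [5, 9, 4, 12, 7, 20, 11, 30, 16, 25, 8, 3]
sick = notif[1:] + [6]
best_lag, best_rho = optimal_lag(sick, notif)
```

In this constructed example the sick-leave series leads by exactly one week, so the search selects lag −1. With real data, nearby lags can give similar coefficients, so the selected lag is only an indication of relative timing.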
The average weekly AC-sick-leave rate was higher than SARS-CoV-2 weekly notification rate during all periods except the omicron period (Supplementary file S3).
The age distribution of sick-leave reports had two peaks around ages 30 and 50, while the age distribution of the SARS-CoV-2 notifications showed a higher proportion of young people (Supplementary files S4-5).

Overall time series and correlations
Visual inspection of the overall time series (all work sectors combined) showed some coherence between AC-sick-leave and SARS-CoV-2 weekly rates, seemingly more pronounced during the delta and omicron periods (figure 1A). Changes in AC-sick-leave and SARS-CoV-2 rates were often concurrent: the first two SARS-CoV-2 peaks in 2020 roughly coincided with increases in AC-sick-leave and the four SARS-CoV-2 peaks during the delta and omicron periods roughly coincided with more pronounced peaks in AC-sick-leave. In August 2020 and in August/September 2021, AC-sick-leave increased before the SARS-CoV-2 rate did. This is also reflected by the overall correlation being the highest (0.83, 95% CI [0.76, 0.88]) at lag −1, which means that AC-sick-leave preceded SARS-CoV-2 notifications by one week (table 2). Correlation coefficient size and optimal lag varied by variant period. The wildtype (0.87, 95% CI [0.76, 0.93]) and delta (0.82, 95% CI [0.63, 0.92]) periods showed higher correlations than alpha (0.72, 95% CI [0.42, 0.87]) and omicron (0.66, 95% CI [0.19, 0.88]). Optimal lags were negative for all periods except the omicron period (+1).
Overall CS-sick-leave rates followed an almost identical trend to the SARS-CoV-2 rate (figure 2A). In August 2020, the rise in CS-sick-leave visually preceded the rise in SARS-CoV-2 rates. During the rest of the study period, the time series seemingly coincided, as reflected by the correlations being highest at lag zero. This indicates no delay between CS-sick-leave and SARS-CoV-2 rates. CS-sick-leave showed higher correlation with the SARS-CoV-2 rate than AC-sick-leave in all periods (table 2). While CS-sick-leave peaks were very pronounced during the omicron period, the SARS-CoV-2 rate was even more pronounced.

Healthcare
The time series for the healthcare sector resembled the overall time series (figure 1B) at the same optimal lag of −1 but with lower correlations (0.71, 95% CI [0.59, 0.79], table 2).
The correlations between the CS-sick-leave and SARS-CoV-2 rates were similar to the correlations between AC-sick-leave and SARS-CoV-2 (figure 2B), with a higher correlation for all periods except delta. The optimal lags were zero (table 2).

Education
For the education sector, trends in AC-sick-leave and SARS-CoV-2 weekly rates coincided for shorter durations and not consistently over the study period (figure 1C). This was also reflected by the lower correlation (0.61, 95% CI [0.47, 0.72]) than overall and in healthcare. The optimal lag was −2 for the total study period (table 2).
The trend and peaks of weekly CS-sick-leave and SARS-CoV-2 rates were more similar to each other during all periods than AC-sick-leave was (figure 2C). The optimal lags were either zero or +1 (table 2).

Discussion
This study shows an association between weekly sick-leave and SARS-CoV-2 infection notification rates and thus gives an indication that sick-leave data are potentially a worthwhile additional data source for strengthening COVID-19 surveillance. This association also indicates that sick-leave data may be useful as an additional surveillance source for other respiratory infectious diseases as well as for pandemic preparedness.
Trends between sick-leave and SARS-CoV-2 notification rates were relatively similar and the correlations between the two were moderate to high, varying by SARS-CoV-2 variant.
AC-sick-leave, while showing a lower association with SARS-CoV-2 than CS-sick-leave, seems a more timely indicator of ensuing SARS-CoV-2 increases. AC-sick-leave data are also more widely available than CS-sick-leave data. Data from healthcare and education sectors provided neither increased timeliness nor stronger correlations.
In all periods except omicron, changes in AC-sick-leave preceded changes in SARS-CoV-2 notifications as judged from the optimal lag of the correlations. Sick-leave reports, which can be registered before individuals get tested or before waiting to receive test results, are likely to precede SARS-CoV-2 notifications. This result is in agreement with the findings by Gómez et al. that showed sick-leave among nursing home employees was more timely than COVID-19 cases. 7 Our results suggest that the less specific AC-sick-leave may be timelier than the more specific CS-sick-leave. One explanation is that at the start of a wave, fewer employees attribute their illness to COVID-19 and therefore do not get tested. This explanation is supported by behavioural research which indicates that per wave, the willingness to test increased as more SARS-CoV-2 infections were identified within the wave. 22 An explanation for CS-sick-leave showing a higher correlation with SARS-CoV-2 notifications is that the data are highly linked: 69.9% of CS-sick-leave was registered as due to a positive SARS-CoV-2 test. Additionally, for the remaining group suspecting COVID-19, this was based on a close contact's positive test or awaiting their test result.
While disease-specific sick-leave provides more precise information on disease spread, this requires laboratory testing, complicating the consistent acquisition of this information. CS-sick-leave data were less abundant than AC-sick-leave data: the sick-leave symptom or cause was available for only 11.9% of the study population, and 28% of that subset reported suspected or confirmed SARS-CoV-2. More importantly, the COVID-19 questionnaire has been discontinued since 01 April 2023.
With few reports on the potential timeliness of work sick-leave data for infectious disease surveillance, our study is the first to explore the lag time between the weekly sick-leave and SARS-CoV-2 rates. However, while we showed the results at the optimal lags, nearby lags often showed similar correlation coefficients. Therefore, the optimal lag only provides an indication of the timeliness of one time series relative to another.
Our results suggest that the degree of association between sick-leave and SARS-CoV-2 trends differs between SARS-CoV-2 variants. This disparity could be due to differences in virus characteristics, vaccine uptake and natural immunity in the population.

a: Lag in weeks; the optimal lag being the lag with the highest correlation coefficient (negative lags: sick-leave in the weeks preceding SARS-CoV-2 weekly notification rate, positive lags: sick-leave in the weeks after SARS-CoV-2 weekly notification rate).

The ratio of average CS-sick-leave to SARS-CoV-2 notification rates was lower for every new variant in the entire study population, especially during the omicron period. [24][25] Vaccine coverage and naturally acquired population immunity also impact the difference between periods. It has been shown that vaccination for SARS-CoV-2 has led to milder infections and less sick-leave in healthcare personnel. 26,27 Vaccine coverage was mainly a factor in the delta and omicron periods and this may have impacted the correlation between sick-leave and SARS-CoV-2 rates. For example, a slight increase in sick-leaves due to asymptomatic confirmed SARS-CoV-2 infections was found in fully vaccinated healthcare personnel compared to partially/non-vaccinated personnel. 27 As asymptomatic infections are harder to detect, vaccinated employees may provide less insight into the spread of SARS-CoV-2.

Figure 1
All-cause (AC-)sick-leave and SARS-CoV-2 weekly rates per 100 000. Red dashed line: AC-sick-leave is interpolated during Christmas holidays (all short holidays in the education sector). White and grey shading: the periods of SARS-CoV-2 variants are shown with alternating shaded planes (wildtype, alpha, delta, omicron). A. Overall, B. Healthcare sector and C. Education sector

Figure 2
COVID-specific (CS-)sick-leave and SARS-CoV-2 weekly rates per 100 000. Yellow dashed line: CS-sick-leave is interpolated during Christmas holidays (all short holidays in the education sector). White and grey shading: the periods of SARS-CoV-2 variants are shown with alternating shaded planes (wildtype, alpha, delta, omicron). A. Overall, B. Healthcare sector and C. Education sector

Table 1
Total number of registered sick-leave and SARS-CoV-2 notifications (aged 18-66) per labour sector during the study period a

Table 2
Highest Spearman correlation coefficients between sick-leave and SARS-CoV-2 weekly rate per 100 000 at optimal lags a during the study period