Filling the gaps in the characterization of the clinical management of COVID-19: 30-day hospital admission and fatality rates in a cohort of 118 150 cases diagnosed in outpatient settings in Spain

Abstract
Background: Currently, there is a missing link in the natural history of COVID-19, from first (usually milder) symptoms to hospitalization and/or death. To fill this gap, we characterized COVID-19 patients at the time at which they were diagnosed in outpatient settings and estimated 30-day hospital admission and fatality rates.
Methods: This was a population-based cohort study. Data were obtained from the Information System for Research in Primary Care (SIDIAP), a primary-care records database covering >6 million people (>80% of the population of Catalonia), linked to COVID-19 reverse transcriptase polymerase chain reaction (RT-PCR) tests and hospital emergency, inpatient and mortality registers. We included all patients in the database who were ≥15 years old and diagnosed with COVID-19 in outpatient settings between 15 March and 24 April 2020 (10 April for outcome studies). Baseline characteristics included socio-demographics, co-morbidity and previous drug use at the time of diagnosis, and polymerase chain reaction (PCR) testing and results. Study outcomes included 30-day hospitalization for COVID-19 and all-cause fatality.
Results: We identified 118 150 and 95 467 COVID-19 patients for characterization and outcome studies, respectively. Most were women (58.7%) and young-to-middle-aged (e.g. 21.1% were 45–54 years old). Of the 44 575 who were tested with PCR, 32 723 (73.4%) tested positive. In the month after diagnosis, 14.8% [95% confidence interval (CI): 14.6% to 15.0%] were hospitalized, with a greater proportion of men and older people, peaking at age 75–84 years. Thirty-day fatality was 3.5% (95% CI: 3.4% to 3.6%), higher in men, increasing with age and highest in those residing in nursing homes [24.5% (23.4% to 25.6%)].
Conclusion: COVID-19 infections were widespread in the community, including all age–sex strata. However, severe forms of the disease clustered in older men and nursing-home residents. Although initially managed in outpatient settings, 15% of cases required hospitalization and 4% died within a month of first symptoms. These data are instrumental for designing deconfinement strategies and will inform healthcare planning and hospital-bed allocation in current and future COVID-19 outbreaks.

Introduction
COVID-19 started as an outbreak in Wuhan, China, in December 2019 and rapidly developed into a global pandemic, causing a substantial morbidity and fatality burden, and straining healthcare systems worldwide. 1 In Europe, it was first reported in late January 2020. The first case in Spain was reported a month later, although one study has suggested that community transmission was already occurring by then. 2 By 26 July, Spain had reported the eighth-highest death toll of COVID-19 in the world.
As most COVID-19 patients present with influenza-like symptoms, including fever, dry cough, fatigue and sore throat, 3 they are eligible for outpatient or primary-care management in the first instance. However, studies to date have focused on the characteristics and prognosis of hospitalized 4 or intensive-care 5 COVID-19 patients, skewing current estimates of the morbidity and mortality of COVID-19 globally. It is essential to characterize patients from their first diagnosis to achieve a more complete understanding of the prognosis of this disease.
Universal healthcare systems, such as those in Spain and the UK, rely on general practitioners to act as gatekeepers, with all patients seen in primary care before admission to hospital. 6 Primary-care electronic health records from such healthcare systems, when linked with hospital emergency and inpatient and mortality registers, offer a unique opportunity to fully characterize the natural history of COVID-19.
We used outpatient and inpatient electronic medical records for a large number of COVID-19 cases, linked with reverse transcriptase polymerase chain reaction (RT-PCR) data and mortality registers, to describe the natural history of COVID-19 in Spain. We characterized COVID-19 patients' socio-demographics, co-morbidities and medicines used at the time of diagnosis. We then estimated the need for hospital admission and the all-cause fatality associated with COVID-19 in the month following first symptoms and outpatient diagnosis.

Study design and data sources
We performed a cohort study with prospectively collected data from the Information System for Research in Primary Care (SIDIAP; www.sidiap.org) in Catalonia, Spain. 7 SIDIAP contains anonymized primary-care electronic health records for >6 million people, covering a representative >80% of the Catalan population since 2006. It includes high-quality, validated diagnoses [International Classification of Diseases, 10th revision, Clinical Modification (ICD-10-CM)], medicine prescriptions, laboratory tests, and lifestyle and socio-demographic information. 8,9 For this study, SIDIAP was also linked to the region-wide population-based hospital and outpatient emergency register, 10 the bespoke central database of RT-PCR COVID-19 tests and the regional mortality registry.

Study participants and follow-up
We included all individuals aged ≥15 years with COVID-19 identified by a positive RT-PCR test for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and/or a clinical diagnosis (ICD-10-CM codes B34.2, B97.21, B97.29, J12.81) recorded in primary care from 15 March 2020 to 24 April 2020. We excluded individuals who were hospitalized with COVID-19 before their index date (prevalent cases). Patients with a prevalent diagnosis of pneumonia and/or a hospital admission for respiratory symptoms subsequently diagnosed with COVID-19 were also excluded to avoid misclassification. For outcome analysis (hospital admission or fatality), only those with an index date before 10 April 2020 were included to guarantee at least 14 days of follow-up from the index date.
Participants were followed from the index date, defined as the earlier of a first positive RT-PCR test or a clinical diagnosis, until death or the end of the study period (24 April 2020). Repeat RT-PCR tests were disregarded.
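The selection rules above can be illustrated with a minimal Python sketch. This is not the authors' code: the `Patient` record layout, its field names and the helper functions are hypothetical, and the exclusion for prevalent pneumonia or prior respiratory admissions is omitted for brevity.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

STUDY_START = date(2020, 3, 15)
STUDY_END = date(2020, 4, 24)
OUTCOME_CUTOFF = date(2020, 4, 10)  # outcome cohort: index date before this day


@dataclass
class Patient:
    """Hypothetical record layout, for illustration only."""
    age: int
    first_positive_pcr: Optional[date]
    clinical_diagnosis: Optional[date]
    hospitalized_before_index: bool = False


def index_date(p: Patient) -> Optional[date]:
    """Earlier of the first positive RT-PCR test and the clinical diagnosis."""
    dates = [d for d in (p.first_positive_pcr, p.clinical_diagnosis) if d is not None]
    return min(dates) if dates else None


def in_characterization_cohort(p: Patient) -> bool:
    """Aged >=15, index date inside the study window, not a prevalent case."""
    idx = index_date(p)
    return (
        idx is not None
        and p.age >= 15
        and STUDY_START <= idx <= STUDY_END
        and not p.hospitalized_before_index
    )


def in_outcome_cohort(p: Patient) -> bool:
    """Characterization cohort restricted to index dates before 10 April 2020."""
    return in_characterization_cohort(p) and index_date(p) < OUTCOME_CUTOFF
```

Under these assumptions, a patient diagnosed clinically on 8 April and testing positive on 12 April has an index date of 8 April and enters both cohorts, whereas a patient first diagnosed on 15 April enters only the characterization cohort.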

Baseline characteristics and co-morbidities
Socio-demographics were assessed at the index date: sex, age (in years), place of residence (community, nursing home), rurality (rural, urban) and nationality (Spain, other). Rural areas were defined as areas with <10 000 inhabitants and a population density of <150 inhabitants/km². We assessed socio-economic status using the validated MEDEA deprivation index, which has previously been linked to SIDIAP 11 and is calculated at the census-tract level in urban areas. The index was categorized in quartiles, where the first and fourth quartiles are the least and most deprived areas, respectively. Rural areas were categorized separately.
We defined pre-existing co-morbidities as the presence of a diagnosis code recorded at any time before the index date and still active at COVID-19 diagnosis for a prespecified list of conditions based on previous literature. [3][4][5] Lists of ICD-10-CM codes for each of these conditions are provided in Supplementary Table 1, available as Supplementary data at IJE online.
We characterized use of long-term medicines based on primary-care prescriptions active at the time of diagnosis (index date). A pre-specified list of medicines was created based on the same previous literature [3][4][5] and identified using Anatomical Therapeutic Chemical Classification System codes (Supplementary Table 2, available as Supplementary data at IJE online).

Outcomes
Two outcomes were studied, both within 30 days of the index date: COVID-19-related hospital admission and all-cause death. Hospitalizations were ascertained from linked hospital data covering emergency rooms and inpatient administrative data for the whole of Catalonia. COVID-19-related admissions were identified using a bespoke list of ICD-10-CM diagnostic codes recorded in hospital-discharge data (Supplementary Table 3, available as Supplementary data at IJE online). Date of death was obtained from linked regional mortality data.

Statistical methods
For descriptive analyses, the median (inter-quartile range) or number (%) is reported for each patient characteristic. The cumulative number (%) of hospital admissions and fatality were obtained from the data. Thirty-day probabilities were obtained from Kaplan-Meier estimates, stratified by sex, age (10-year bands), nursing-home-residence status and RT-PCR result, where available. The Kaplan-Meier estimates and log-rank p-values are shown for both study outcomes, stratified by sex and age group. All analyses were conducted using R version 3.5.1.
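To make the Kaplan-Meier step concrete, the following is a minimal pure-Python sketch of the product-limit estimator. It is an illustration only: the analyses in this study were run in R, and the function name, argument names and toy data below are assumptions, not taken from the study.

```python
from collections import Counter


def km_event_probability(times, events, horizon):
    """Kaplan-Meier probability of having had the event by `horizon` days.

    times:  follow-up in days from the index date (event or censoring time)
    events: 1 if the outcome occurred at that time, 0 if censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    survival = 1.0
    for t in sorted(event_counts):
        if t > horizon:
            break
        # Patients still under follow-up just before time t
        # (those censored exactly at t count as at risk).
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1 - event_counts[t] / at_risk
    return 1 - survival


# Toy data: six patients, two events (days 5 and 14), four censored.
times = [5, 10, 14, 20, 30, 30]
events = [1, 0, 1, 0, 0, 0]
print(f"{km_event_probability(times, events, 30):.3f}")  # prints 0.375
```

In the toy data, the patient censored on day 10 leaves the risk set before the second event, so the estimate (0.375) exceeds the naive proportion of 2/6. The same mechanism is why Kaplan-Meier estimates suit a cohort in which some patients have fewer than 30 days of follow-up.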

Ethics approval
This study was approved by the Clinical Research Ethics Committee of the IDIAPJGol (project code: 20/070-PCV). There was no patient or public involvement in this study.
In the month after diagnosis, 14 141 of the 95 467 patients were hospitalized for complications of COVID-19, equivalent to a cumulative incidence of 14.8% (14.6% to 15.0%). The incidence was higher among men [19.4% (19.0% to 19.8%)] than among women (p < 0.0001) (Figure 2A) and peaked at age 75-84 years, with a striking 40.5% (39.3% to 41.7%) admitted within a month of diagnosis (p < 0.0001) (Figure 2B).
Note: Data are for patients with a positive result on a reverse transcriptase polymerase chain reaction (RT-PCR) test for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and/or a clinical diagnosis recorded in primary care from 15 March to 24 April 2020.
Outcomes were dramatically different for those residing in nursing homes. Although their hospital-admission rate was only 1.5 percentage points higher than the whole population's average, at 16.1% (15.1% to 17.0%), their 30-day all-cause fatality was >6-fold higher, at 24.5% (23.4% to 25.6%).
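As a rough arithmetic check on the reported cumulative incidence of hospitalization, the sketch below computes the crude proportion with a normal-approximation (Wald) 95% interval. This is an assumption made for illustration; the intervals reported in this study may have been derived differently (for example, from the Kaplan-Meier estimates), and the function name is hypothetical.

```python
import math


def proportion_with_wald_ci(events, n, z=1.96):
    """Crude proportion and its normal-approximation (Wald) confidence interval."""
    p = events / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


# Counts taken from the text: 14 141 hospitalized among 95 467 patients.
p, lower, upper = proportion_with_wald_ci(14141, 95467)
print(f"{100 * p:.1f}% ({100 * lower:.1f}% to {100 * upper:.1f}%)")
# prints 14.8% (14.6% to 15.0%)
```

With a denominator this large the interval is narrow, so the crude calculation reproduces the published figure to one decimal place.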

Summary of key findings
To our knowledge, this is the first study of the characteristics and key health outcomes of COVID-19 among patients followed from first diagnosis in outpatient and primary-care settings. The existence of a universal tax-funded healthcare system with fully implemented electronic medical records covering out- and inpatient settings, linked to region-wide PCR testing and mortality registers, allowed us to fill a gap in the knowledge of the natural history of COVID-19.
We identified 118 150 people diagnosed with COVID-19 between 15 March and 24 April 2020 in Catalonia, Spain. Most of these people were managed in primary-care and outpatient settings, and fewer than 40% were formally tested for the SARS-CoV-2 virus. This diagnosis figure was more than double the official figure of 48 916 cases in Catalonia by 30 April 2020, based on RT-PCR-confirmed cases (https://covid19.isciii.es/). A much higher proportion of the population may therefore have been affected, with different levels of disease severity, and community transmission may have been much wider than previously estimated from official figures.

Findings in context
COVID-19 was spread throughout the studied community, with almost 60% of cases in women and most cases seen in young to middle-aged adults aged 35-65 years old. Infections were seen in all socio-economic strata in urban environments, and almost one in five patients lived in rural areas. However, more severe forms of COVID-19 disease were clustered in men and elderly people, who accounted for most of the hospital admissions and deaths. This result is in line with those from studies of hospitalized COVID-19 populations in the US, 12 China 4 and Europe, 13 and suggests that age and sex are key risk factors for poor outcomes among those infected with COVID-19.
Chronic non-communicable diseases were common among patients with COVID-19. Probably as a result, more than 1 in 10 affected patients used medicines such as analgesics, proton-pump inhibitors/antacids, statins and anti-hypertensives such as angiotensin-converting enzyme inhibitors (ACEi) and angiotensin receptor blockers (ARBs). Similar figures have been published elsewhere. Early reports of inpatient cases in China found that 7.4% of COVID-19 patients had diabetes and 15% had hypertension. 3 More recently, an international study of 6806 patients hospitalized with COVID-19 in the US and South Korea 14 found geographical variation in the prevalence of chronic co-morbidities. For example, 17.9% of COVID-19 patients in South Korea and 43.3% of US veterans with COVID-19 had diabetes, and 21.8% and 69.7%, respectively, had hypertension. Patients in our data set who were treated as outpatients appeared somewhat healthier than the US and South Korean inpatient cases, but had a higher prevalence of co-morbidities than those reported from China: 9.75% had diabetes and 24.26% had hypertension. Our data and those obtained from previous studies suggest that the profile of patients with COVID-19 varies geographically, with no consistent differences seen between outpatient and inpatient cases.
Almost 15% of the patients with COVID-19 in our study were admitted to hospital in the month after their diagnosis in primary-care/outpatient settings. A preprint of a recent study compared hospital admissions for influenza and COVID-19, finding that COVID-19-admitted patients were generally younger and healthier than those admitted for influenza. 14 Even though our sample included milder forms of COVID-19, 30-day fatality was still high at 4% on average and almost 5% in men. For comparison, estimated fatality rates for seasonal influenza ranged between 0.2% and 0.35% for the previous 13 influenza seasons in Catalonia. 15 Previous vaccination 16 and different age distributions are obvious contributors to influenza's 10-fold lower fatality rate.
Fatality was much higher (about one in four) in older populations and nursing-home residents than in the full study sample. A study of 101 COVID-19 residents of long-term care facilities in the US reported a fatality rate of 33.7%. 17 The spread and severity of COVID-19 shown in this population reflect the recognized vulnerability of nursing-home residents to respiratory infections and airborne epidemics, including previous coronavirus 18 and influenza outbreaks and pneumonia. 19,20
COVID-19 testing has been limited in primary-care settings, even among symptomatic patients. Fewer than 40% of the identified cases in our sample had received an RT-PCR test, probably representing the more severe cases. A positive test result was a clear marker of prognosis, with 43% subsequently hospitalized and >6% dying within the next 30 days, compared with 15% and <1%, respectively, among clinically diagnosed cases who tested negative for COVID-19. Despite WHO recommendations, there are currently huge disparities across countries in the implementation of testing. 21 Our study demonstrates the value of testing for planning healthcare-resource allocation, in addition to informing public-health decision-making. With the emergence of effective anti-COVID-19 therapies in the coming months, further testing will probably be needed for differential diagnosis. 22,23

Strengths and weaknesses of the study
Our study has limitations. Our large, routinely collected data set may have included misclassified cases, as other influenza-like illnesses or respiratory conditions could have been diagnosed as COVID-19 in the context of the current pandemic. In the subsample of patients with RT-PCR data available, almost three in four cases were positive, suggesting that the positive predictive value of outpatient diagnosis in our context approached 75%. As the other one in four patients had better health outcomes (fewer hospitalizations and lower fatality), it is likely that any misclassification led to underestimated risks of complications, admissions and fatality related to COVID-19 infection in our study.
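The testing figures behind this argument can be reproduced from the counts reported in the abstract. The short Python sketch below is illustrative only; the variable names are assumptions, and the positivity share is read as an approximate positive predictive value in the same hedged sense as in the text.

```python
# Counts reported in the abstract.
diagnosed = 118150      # all outpatient COVID-19 diagnoses
tested = 44575          # diagnosed patients with an RT-PCR result
positive = 32723        # of those tested, RT-PCR positive

share_tested = tested / diagnosed
share_positive = positive / tested  # approximate PPV of outpatient diagnosis

print(f"Tested with RT-PCR: {100 * share_tested:.1f}%")
print(f"Positive among tested: {100 * share_positive:.1f}%")
# prints 37.7% and 73.4%
```

Because the tested subsample probably over-represents more severe cases, the 73.4% figure need not apply to untested patients.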
Although we included milder forms of COVID-19 than previous characterization studies by including primary-care diagnoses, our sample will have missed most asymptomatic cases, as testing was not widespread in those without symptoms. The sample also probably missed many of those with mild symptoms: following official advice to stay home and self-isolate to avoid contact and spread in healthcare facilities, they may never have come into contact with their primary-care practice or a hospital to receive an official diagnosis.
Our study's strengths lie in our comprehensive data set. The use of routinely collected data allowed a large sample size. SIDIAP is a well-validated data source 9 that has been used in many previous studies. 24 By using primary-care records linked to comprehensive region-wide hospital, mortality and testing registers obtained from a universal tax-funded healthcare system, we were able to completely characterize the natural history of COVID-19 infection from symptom onset, while avoiding the recall bias and loss to follow-up typical of cohort studies.
Including COVID-19 cases treated exclusively in outpatient settings allowed the study of subpopulations that are less likely to be admitted to hospital and are known to be more susceptible to infection. For instance, we identified almost 11 000 nursing-home residents and >9000 people with a history of cancer who were diagnosed with COVID-19, and almost 600 women diagnosed with COVID-19 during pregnancy. Our study data set thus included the largest numbers of COVID-19 cases among these populations yet recorded. For comparison, previous studies of COVID-19 patients with prevalent cancer have included <30 participants. 25,26 This data set will be an invaluable resource for studying the effects of SARS-CoV-2 in these populations. Finally, linkage to test data and to comprehensive region-wide hospital and mortality registers allowed us to track COVID-19 infections through a universal healthcare system, whilst avoiding recall bias and loss to follow-up typical of cohort studies.

Conclusion
This is the first study to date on the characteristics, hospital admissions and fatality associated with COVID-19 disease diagnosed in outpatient settings. COVID-19 is often diagnosed and initially managed in outpatient clinics, with limited testing leading to underreported cases in official figures and overestimated fatality. Notwithstanding this, COVID-19 in our wide sample led to hospitalization in 15% of diagnosed patients and a 30-day fatality of 4%. Our data suggest that, although COVID-19 infection has spread throughout the community in Catalonia, Spain, most often in young and middle-aged women, severe forms of the disease and all-cause fatality cluster in older men and among nursing-home residents. This information is of key relevance for healthcare professionals, public-health authorities and commissioners, now and in future COVID-19 outbreaks.

Supplementary data
Supplementary data are available at IJE online.

Author contributions
All authors contributed to the design of the study and interpretation of the results, and reviewed the manuscript. E.C. and N.M. had access to the data, performed the statistical analysis and acted as guarantors. D.P.A., E.C., N.M., A.P.-U. and T.D.S. wrote the first draft of the manuscript. All authors critically revised the manuscript. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.
We want to recognize and remember all the patients who have suffered and died from COVID-19. We also want to thank all the healthcare professionals who have diagnosed and treated this disease in the Catalan healthcare system and in Spain, from primary-care centres to intensive-care units.
We acknowledge English-language editing by Dr Jennifer A. de Beyer of the Centre for Statistics in Medicine, University of Oxford.
In accordance with current European and national law, the data used in this study are only available for the researchers participating in this study. Thus, we are not allowed to distribute or make publicly available the data to other parties. However, researchers from public institutions can request data from SIDIAP if they comply with certain requirements. Further information is available online (https://www.sidiap.org/index.php/menu-solicitudes-en/applicationproccedure) or by contacting Anna Moleras (amoleras@idiapjgol.org).