Social vulnerability and rurality associated with higher SARS-CoV-2 infection-induced seroprevalence: a nationwide blood donor study, United States, July 2020 – June 2021

Abstract
Background
Most studies on health disparities during the COVID-19 pandemic focused on reported cases and deaths, which are influenced by testing availability and access to care. This study aimed to examine SARS-CoV-2 antibody seroprevalence in the U.S. and its associations with race/ethnicity, rurality, and social vulnerability over time.
Methods
This repeated cross-sectional study used data from blood donations in 50 states and Washington, D.C. from July 2020 through June 2021. Donor ZIP codes were matched to counties and linked with the Social Vulnerability Index (SVI) and urban-rural classification. SARS-CoV-2 antibody seroprevalences induced by infection and by infection and vaccination combined were estimated. Associations of infection-induced seropositivity with demographics, rurality, SVI, and its four themes were quantified using multivariate regression models.
Findings
Weighted seroprevalence differed significantly by race/ethnicity and rurality, and increased with increasing social vulnerability. During the study period, infection-induced seroprevalence increased from 1.6% to 27.2% and 3.7% to 20.0% in rural and urban counties, respectively, while rural counties had lower combined infection- and vaccination-induced seroprevalence (80.0% vs. 88.1%) in June 2021. Infection-induced seropositivity was associated with being Hispanic or non-Hispanic Black, and with living in rural or more socially vulnerable counties, after adjusting for demographic and geographic covariates.
Conclusion
The findings demonstrated increasing SARS-CoV-2 seroprevalence in the U.S. across all geographic, demographic, and social sectors. The study illustrated disparities by race/ethnicity, rurality, and social vulnerability. The findings identified areas for targeted vaccination strategies and can inform efforts to reduce inequities and prepare for future outbreaks.


INTRODUCTION
Health equity is when all people have the opportunity to attain their full health potential, and no one is disadvantaged from achieving this potential because of social position or other socially determined circumstances. However, members of racial and ethnic minority groups have long experienced health disparities due to inequities in social determinants of health (SDOH), including socioeconomic status (SES) and social and community context [1]. The COVID-19 pandemic has brought longstanding social and racial injustice and health inequity to the forefront of public health.
Compared to non-Hispanic Whites, some racial/ethnic minority groups, such as Hispanic and non-Hispanic Black, have experienced disproportionately higher rates of reported COVID-19 case incidence, associated severe illness, and death [2][3][4]. In addition, rural communities have experienced higher incidence and mortality rates than metropolitan communities beginning in early August 2020 [5].
Although multiple studies have analyzed trends in the race/ethnicity distribution of COVID-19 cases, most have used COVID-19 case data, which do not capture all infections because a large proportion of mild or asymptomatic cases are not detected or reported [6]. Additionally, national surveillance data often lack complete race/ethnicity information. As of July 2021, nearly 40% of reported U.S. COVID-19 cases and 20% of deaths did not have data for race or ethnicity [7].
Serological surveys have been valuable tools used to detect infections of SARS-CoV-2, the virus that causes COVID-19 [8,9]. Previous serosurveys have observed substantial disparities in SARS-CoV-2 infection prevalence among racial and ethnic minority groups [10][11][12], yet samples have been limited to special populations or specific geographic areas (e.g., individual cities or states) in the United States [13,14].
Since July 2020, CDC has been conducting a nationwide seroprevalence study in blood donors [15,16]. We analyzed data from blood donor specimens collected from July 2020 to June 2021. The objectives were: 1) to study the weighted SARS-CoV-2 seroprevalence in the United States over time by demographics, rurality, and social vulnerability as measured by CDC's Social Vulnerability Index (SVI) [17]; 2) to model the associations between the infection-induced seropositivity and key demographics, rurality, and social vulnerability.

Study Design
Antibody seroprevalence data were collected as part of a nationwide blood donor seroprevalence study, described in detail elsewhere [18]. In brief, data collection began in July 2020.
Catchment areas from 17 blood collection organizations were combined into 65 study regions defined by state and metropolitan borders, representing 50 states and Washington D.C. Each month, approximately 500-4,000 anonymous blood donation specimens from each study region were selected, after excluding donations made specifically to provide COVID-19 convalescent plasma and samples with missing donor demographic data (2.85% excluded for missing race/ethnicity). Donor demographic information collected by the blood centers included age, gender, race/ethnicity, and residential ZIP code. Blood collection organizations were not able to select specimens from specific racial or ethnic populations. To increase representation from racial and ethnic minorities, sample size was increased for regions with larger racial and ethnic minority populations.
The study was approved by CDC as non-research public health surveillance based on anonymization of data and routine consent for blood donation testing that includes use of residual samples for research purposes. The study did not require human-subject research review or clearance by the Office of Management and Budget and was conducted consistent with applicable federal law and CDC policy.
A chi-square test was conducted to examine demographic differences between this study's blood donor population and the general U.S. population. Because blood donor demographic characteristics differ from those of the U.S. population, monthly estimation weights were created based on the 2018 American Community Survey [23] estimates for age, gender, and race/ethnicity. Furthermore, monthly sets of 50 pseudo-replicate weights were created to compute weighted seroprevalence standard errors. We conducted descriptive analyses of the demographic characteristics of donors and the social vulnerability characteristics of all counties to which donors were matched. Following a repeated cross-sectional study design, monthly weighted combined (infection- and vaccine-induced) and infection-induced seroprevalences with 95% confidence intervals (95% CI) were calculated for the entire study population and stratified by age group, sex, race/ethnicity, region, rurality, SVI, and the four SVI themes.
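As an illustration of the weighted estimation step, the following minimal Python sketch computes a weighted seroprevalence and a replicate-weight standard error. It is a sketch under stated assumptions, not the study's procedure: the text does not say how the 50 pseudo-replicate weights were constructed, so the delete-a-group jackknife, the `scale` factor, the normal-approximation interval, and all function names and toy values below are hypothetical.

```python
import math


def weighted_prevalence(results, weights):
    """Weighted proportion of reactive specimens (results are 0 or 1)."""
    total_weight = sum(weights)
    return sum(w * r for w, r in zip(weights, results)) / total_weight


def replicate_standard_error(results, full_weights, replicate_weights, scale):
    """Standard error from replicate weights: sqrt(scale * sum of squared deviations).

    The correct scale depends on how the replicates were built (assumption:
    the caller supplies it; the source does not specify the method).
    """
    estimate = weighted_prevalence(results, full_weights)
    squared_deviations = sum(
        (weighted_prevalence(results, rw) - estimate) ** 2
        for rw in replicate_weights
    )
    return math.sqrt(scale * squared_deviations)


def normal_confidence_interval(estimate, standard_error, z=1.96):
    """Normal-approximation 95% CI, clipped to the [0, 1] range of a proportion."""
    lower = max(0.0, estimate - z * standard_error)
    upper = min(1.0, estimate + z * standard_error)
    return lower, upper


# Toy data (hypothetical): six donors in three jackknife groups.
results = [1, 0, 0, 1, 0, 0]
full_weights = [2.0, 1.0, 1.0, 1.0, 2.0, 1.0]
groups = [0, 0, 1, 1, 2, 2]
n_groups = 3

# Delete-a-group jackknife: zero out one group, rescale the remaining weights.
replicate_weights = [
    [0.0 if g == dropped else w * n_groups / (n_groups - 1)
     for w, g in zip(full_weights, groups)]
    for dropped in range(n_groups)
]

estimate = weighted_prevalence(results, full_weights)
se = replicate_standard_error(
    results, full_weights, replicate_weights, scale=(n_groups - 1) / n_groups
)
ci = normal_confidence_interval(estimate, se)
```

With real data, the same two functions would be applied once per month and per stratum, using that month's estimation weights and its set of replicate weights.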
To visualize the spatial-temporal distributions, we mapped the study area by SVI and infection-induced seroprevalence. For the maps, we estimated weighted county-level infection-induced seroprevalences over 3-month periods for counties with more than 10 donors. All maps were generated using Esri's ArcGIS Pro version 2.8.0.
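The county-level aggregation behind the maps can be sketched as follows. This is a hedged illustration only: the record layout, the function names, and the use of calendar quarters as the 3-month periods are assumptions, since the text does not state the period boundaries or whether the donor threshold applies per period.

```python
from collections import defaultdict

MIN_DONORS = 10  # counties need MORE than this many donors to be mapped


def three_month_period(year, month):
    """Map a calendar month to a (year, quarter) key (assumed period definition)."""
    return year, (month - 1) // 3 + 1


def county_period_prevalence(records, min_donors=MIN_DONORS):
    """Weighted infection-induced seroprevalence per (county, period).

    records: iterable of (county_id, period, reactive, weight), where
    reactive is 0 or 1. Cells with min_donors or fewer donors are dropped.
    """
    cells = defaultdict(lambda: [0.0, 0.0, 0])  # weighted positives, weight sum, donors
    for county, period, reactive, weight in records:
        cell = cells[(county, period)]
        cell[0] += weight * reactive
        cell[1] += weight
        cell[2] += 1
    return {
        key: positives / total
        for key, (positives, total, donors) in cells.items()
        if donors > min_donors
    }


# Toy example (hypothetical): county "A" has 11 donors, county "B" has 10.
period = three_month_period(2020, 7)
records = [("A", period, 1 if i < 3 else 0, 1.0) for i in range(11)]
records += [("B", period, 1, 1.0) for _ in range(10)]
county_estimates = county_period_prevalence(records)
```

In this toy example only county "A" is retained, because county "B" has exactly 10 donors and the threshold requires more than 10.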
Multivariate logistic regression models were applied to assess the association of infection-induced seropositivity with factors of interest (race/ethnicity, rurality, and social vulnerability), adjusting for all other available covariates that may be related to seropositivity (age, gender, and region). To track trends in associations, the regression models were applied on monthly data to produce monthly odds ratios (ORs) and 95% CIs. The first model used the overall SVI as a measure
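To make the monthly modelling step concrete, here is a minimal pure-Python sketch of a weighted logistic regression fitted by Newton-Raphson, with odds ratios obtained by exponentiating the coefficients. It is an assumption-laden illustration rather than the study's code: the function names, the single rural/urban covariate, and the toy counts are hypothetical, the weights are treated as simple frequency weights, and confidence intervals (which for survey data would need design-based variance estimation) are omitted.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_logistic(rows, outcomes, weights, iterations=25):
    """Weighted logistic regression by Newton-Raphson.

    rows: covariate vectors, each starting with 1.0 for the intercept.
    Returns the coefficient vector; exp(coefficient) is an odds ratio.
    """
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(iterations):
        gradient = [0.0] * p
        information = [[0.0] * p for _ in range(p)]
        for x, y, w in zip(rows, outcomes, weights):
            eta = sum(b * xi for b, xi in zip(beta, x))
            mu = 1.0 / (1.0 + math.exp(-eta))
            for j in range(p):
                gradient[j] += w * (y - mu) * x[j]
                for k in range(p):
                    information[j][k] += w * mu * (1.0 - mu) * x[j] * x[k]
        step = solve_linear(information, gradient)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Toy month (hypothetical counts): covariates are [intercept, rural].
# Rural: 30 reactive, 70 non-reactive. Urban: 10 reactive, 90 non-reactive.
rows = [[1.0, 1.0], [1.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
outcomes = [1, 0, 1, 0]
weights = [30.0, 70.0, 10.0, 90.0]
monthly_data = {"2020-07": (rows, outcomes, weights)}

# One model per month, as in the repeated cross-sectional design.
monthly_rural_or = {
    month: math.exp(fit_logistic(*data)[1])
    for month, data in monthly_data.items()
}
```

With a single binary covariate the fitted odds ratio reduces to the cross-product ratio of the 2x2 table, here (30 x 90) / (70 x 10). Adjusted models would add further columns to each row for age, gender, region, race/ethnicity, and SVI.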

RESULTS
The number of donor specimens with linked county from 50 U.S. states and Washington D.C. increased from 115,312 in July 2020 to 131,913 in November 2020, and remained at approximately 133,000 per month thereafter, totaling 1,555,745 specimens during the study period (Supplemental Table S1). Overall, donors were evenly distributed by sex, region, and social vulnerability, but were primarily 30-64 years of age (67.5%), non-Hispanic White (86.2%), and residents of urban counties (85.4%) (Table 1). Compared with the U.S. population aged 16 years and older, more blood donors were aged 50-64 years or non-Hispanic White, and fewer donors resided in socially vulnerable counties (Table 1).
The study area included 1,990 counties (63.3% of 3,142 counties in 50 states and Washington D.C.) that spanned the spectrum of social vulnerability for the overall SVI (Figure S1), its 4 themes, and 15 social factors (Supplemental Table S2). The median SVI in study counties was 0.44, slightly lower than that of all U.S. counties (0.5). The spatial-temporal distributions of SVI and infection-induced seroprevalence are visualized in seroprevalence maps (Figure S2) and bivariate seroprevalence-SVI maps (Figure 1). During the study period, seroprevalence increased significantly across all geographic, demographic, and social sectors (Table 2).
Before widespread vaccination, monthly weighted seroprevalence in the study increased from 3.5% (95% CI: 3.2%-3.9%) in July 2020 to 11.6% (95% CI: 11.3%-11.9%) in December 2020 (Figure 2, Table 2). Starting in January 2021, largely driven by the country-wide vaccination efforts, the combined seroprevalence increased rapidly, reaching 87.4% in the study population and ranging from 83.1% to 95.3% among age, gender, racial/ethnic, and region groups in June 2021 (Supplemental Table S3). The infection-induced seroprevalence also increased, reaching 20.7% overall in June 2021 (Table 2). There was no significant difference by sex, consistent with other U.S.-based seroprevalence studies [8]. Infection-induced seroprevalence was consistently higher in younger age groups, Hispanic people, and non-Hispanic Black people, although the racial/ethnic differences narrowed over the study period (Figure 3).
Although seroprevalence in rural counties (1.6%, 95% CI: 1.1%-2.0%) was less than half of [...] Table S5) found a consistent association between the SES theme and infection-induced seropositivity (ORs highest vs. lowest: 2.9 [1.9-4.4] in July 2020 to 1.5 [1.3-1.7] in June 2021), which was stronger than the associations between overall SVI and seropositivity in the first model.

DISCUSSION
This study found that rural counties and counties with higher social vulnerability had experienced higher burdens of SARS-CoV-2 infection, but had significantly lower combined infection- and vaccine-induced seroprevalence in June 2021, likely because of differing vaccination rates. After adjusting for all available demographic and geographic factors, higher infection-induced seropositivity was significantly associated with age, race/ethnicity, rurality, and social vulnerability.
Although improving health equity has been an important national goal for decades, health disparities were evident and exacerbated during the COVID-19 pandemic. To reduce health disparities, it is important to understand the social and geographic factors that contribute to differential risk and to identify which communities are most in need of enhanced public health interventions.
The racial and ethnic disparities in infection-induced seropositivity, even after adjusting for all available covariates, were consistent with other studies reporting higher infection rates and case rates among Hispanic and non-Hispanic Black groups [2,4,24]. Furthermore, our model highlighted that the racial/ethnic disparities could not be explained entirely by SVI, its themes, and other key factors (age and geographic differences), indicating additional structural factors that drive racial/ethnic differences in infection rates. While the relative importance of social vulnerability and race/ethnicity appeared to decrease over time, consistent with a previous study analyzing case rates [25], disparities persisted. Infection-induced seroprevalence increased substantially, and the observed decrease in relative risks appears to be related to large increases in counties with the lowest SVI and largely non-Hispanic White populations, rather than a decrease in incident infections in racial and ethnic minority populations. Focused research and efforts are needed to understand the underlying drivers of health disparities and optimize public health interventions that address environmental, place-based, occupational, policy, and systemic factors that impact health outcomes.
The SVI's SES theme incorporates measures of income, unemployment, poverty, and education. We found that individuals living in areas with high levels of socioeconomic vulnerability had higher odds of seropositivity across all months of the study, although, similar to race/ethnicity, the size of this association decreased over time. Part of this shift was likely due to the relative increase in seroprevalence in the Midwest over time, which had the lowest levels of socioeconomic vulnerability in our sample. People with lower income are more likely to take public transportation, work in roles that cannot be performed remotely, and live in densely populated areas or communal housing [26,27]. Each of these factors may have contributed to the higher levels of infection-induced seropositivity observed in our study. High socioeconomic vulnerability has also been associated with lower levels of COVID-19 vaccination uptake [28], which may further exacerbate disparities from the pandemic. Improving access to and uptake of COVID-19 vaccines among low-income populations is critical to reversing this pattern. Further, policies and practices aimed at reducing poverty levels may reduce the disparate burden of disease during future public health crises.
Since the onset of the pandemic in the United States, rural populations have been at higher risk of developing severe disease due to their disproportionately older age, lack of health insurance, lower SES, and higher risk of chronic diseases and disabilities compared to urban residents [5,29].
Rural residents have also been less likely to take preventive measures (e.g., COVID-19 vaccine administration and mask-wearing) that would curtail the spread of SARS-CoV-2 [29,30]. To address this differential risk, numerous efforts have been implemented to increase the availability of COVID-19 testing sites, availability of medical care during onset of severe COVID-19 illness, and availability of vaccine allocation sites in rural areas [29,30]. Still, our study highlights the rapid increase in infection-induced seroprevalence in rural counties during the study period. Other researchers have found similar evidence based on comparison of reported case incidence and mortality rates in rural vs. urban areas [5,31]. Further, this study found that the vaccination-induced seroprevalence in rural counties was lower compared to urban counties, reflective of the reported lower vaccination rates in rural areas [30]. Reducing disparities in COVID-19 vaccination (e.g., measures to reduce vaccine hesitancy) will be vital in reducing the effects of COVID-19 in rural communities.
Our study has several notable strengths. First, this nationwide serological survey with complete race/ethnicity and location information can provide a more thorough assessment of infection rate and distribution of past infections compared with case surveillance data. Second, the study used highly sensitive and specific assays for the serological testing [19], which maximized consistency in antibody testing across time and across geographic regions. Third, we applied two serological tests, one for anti-S antibody and one for anti-N antibody, to track the seropositivity induced by infection and by vaccination. Fourth, the study had a large sample size with over 1.5 million specimens.
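The logic of pairing an anti-S test with an anti-N test can be sketched as a simple classification rule. This is an illustrative sketch under stated assumptions, not the study's testing algorithm (which is described in the cited reference [19] and may order or combine the assays differently): it assumes that vaccines in use in the United States during the study period elicit anti-S but not anti-N antibodies, so anti-N reactivity is taken to indicate past infection. The function and field names are hypothetical.

```python
def classify_specimen(anti_s_reactive, anti_n_reactive):
    """Classify one specimen from two assay results (booleans).

    Assumed rule: anti-N reactivity marks infection-induced seropositivity,
    and reactivity on either assay counts toward combined (infection- and
    vaccine-induced) seropositivity.
    """
    return {
        "combined": bool(anti_s_reactive or anti_n_reactive),
        "infection_induced": bool(anti_n_reactive),
    }


# Hypothetical specimens: vaccinated only, previously infected, and naive.
vaccinated_only = classify_specimen(anti_s_reactive=True, anti_n_reactive=False)
previously_infected = classify_specimen(anti_s_reactive=True, anti_n_reactive=True)
naive = classify_specimen(anti_s_reactive=False, anti_n_reactive=False)
```

Under this rule a specimen that is anti-S reactive only counts toward combined seroprevalence but not infection-induced seroprevalence, which is why waning anti-N antibodies can bias the infection-induced estimate downward.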
However, our study also had several limitations. First, blood donors are usually healthier than the general population. Acutely ill persons, children aged <16 years, and others with exclusionary criteria cannot donate blood; elderly (aged >75 years) and institutionalized persons (from nursing homes, prisons, etc.) are unlikely to donate blood and are therefore under-represented in this study [32]. Racial/ethnic minority groups were also under-represented compared to the U.S. general population [32]. To address this, appropriate weights and standardization were applied to all participants to account for age and race/ethnicity to the extent possible. In 2022, CDC is planning to conduct a modified blood donor serosurveillance study that can oversample from racial and ethnic minority populations. Second, we transformed 29% of the records from ZIP code to county based on probabilities, which was subject to error. Third, the use of aggregated indices of rurality and social vulnerability at the community level reduces the capability of detecting true effects when compared with studies with individual-level data on participants' SES, housing, and related factors. Fourth, waning antibodies can influence prevalence estimates, especially those based on anti-nucleocapsid serologic tests [33]. Lastly, we could not include Puerto Rico and other U.S. territories in this study due to issues with data availability and compatibility. Information and research are needed for territories, as they may present unique challenges with regard to health equity.
Table 2. Weighted infection-induced seroprevalence rates (95% CI, in percent) in blood donors from 50 states and Washington D.C., U.S., July 2020-June 2021, by age, sex, race/ethnicity, census region, rurality level, social vulnerability (as measured by CDC's Social Vulnerability Index or SVI), and the four SVI themes.