Psychosocial predictors of COVID-19 infection in UK Biobank (N = 104 201)

Abstract
Background: Since the outbreak of COVID-19, data on its psychosocial predictors have been limited. We therefore aimed to explore psychosocial predictors of COVID-19 infection in the UK Biobank (UKB).
Methods: This was a prospective cohort study conducted among UKB participants.
Results: The sample size was N = 104 201, of whom 14 852 (14.3%) had a positive COVID-19 test. The whole-sample analysis showed significant interactions between sex and several predictor variables. Among females, absence of a college/university degree [odds ratio (OR) 1.55, 95% confidence interval (CI) 1.45–1.66] and socioeconomic deprivation (OR 1.16, 95% CI 1.11–1.21) were associated with higher odds of COVID-19 infection, while history of psychiatric consultation (OR 0.85, 95% CI 0.77–0.94) was associated with lower odds. Among males, absence of a college/university degree (OR 1.56, 95% CI 1.45–1.68) and socioeconomic deprivation (OR 1.12, 95% CI 1.07–1.16) were associated with higher odds, while loneliness (OR 0.87, 95% CI 0.78–0.97), irritability (OR 0.91, 95% CI 0.83–0.99) and history of psychiatric consultation (OR 0.85, 95% CI 0.75–0.97) were associated with lower odds.
Conclusion: Sociodemographic factors predicted the odds of COVID-19 infection equally among male and female participants, while psychological factors had differential impacts.


Introduction
Over 80% of COVID-19 cases are mild, yet such cases can still shed virus and transmit the disease, 1 posing a public health challenge. However, most studies focus on clinical factors associated with severe COVID-19 outcomes. 2,3 Studies published on psychosocial factors and COVID-19 reported on sociodemographic factors but are largely limited to severe COVID-19 outcomes. 4,5 Defined as multidimensional constructs that influence the well-being of a person, psychosocial factors such as anxiety, neuroticism, irritability, socioeconomic deprivation, level of education and ethnicity have been shown to influence health 6,7 and play a significant role in the development of both physical and mental illness. 8,9 Higher socioeconomic deprivation is reported to be associated with higher morbidity from both infectious and non-infectious diseases, 10 and negative psychological factors such as depression and anxiety are associated with a higher risk of developing medical conditions such as diabetes mellitus (DM) 11 and respiratory illnesses. 12 This study therefore aimed to explore the role of psychosocial factors in predicting COVID-19 infection in the UK Biobank (UKB). We hypothesized that sociodemographic factors (non-White British ethnicity, college/university degree and socioeconomic deprivation) and psychological factors (depression, anxiety, neuroticism, loneliness, miserableness, irritability and history of psychiatric consultation) predicted the odds of COVID-19 infection among the UKB participants, particularly around the first wave of infections in Spring 2020.

Methodology
Study design and population
This study utilized secondary data provided by the UKB, a prospective cohort study that recruited volunteer participants between 2006 and 2010 and aimed to investigate determinants of disease among those aged 40-69 years until death. 13 About 9 million adults whose contact information was gathered from the NHS were mailed invitations, of whom ∼500 000 were recruited. Data were generally collected using detailed touchscreen questionnaires.

Primary outcome variable
The primary outcome was the COVID-19 test result. COVID-19 data were provided by Public Health Scotland, Public Health England and Secure Anonymized Information Linkage for the Scottish, English and Welsh data, respectively, and linked to the UKB database. 14 The data contained the type of specimen collected and the date of collection, the laboratory that processed the samples, the origin of the samples (collected from an inpatient or not) and the COVID-19 test result as positive or negative. Test samples were collected from nose or throat swabs, and sometimes from the lower respiratory system in intensive care unit settings, and analysed by COVID-19 reverse transcriptase-polymerase chain reaction (RT-PCR). The data covered the period from 16 March 2020, when the results started being released to the UKB, to 31 December 2020. Among participants who had undergone multiple COVID-19 tests with different results, any positive test outcome was treated as a positive case. Participants were classified as negative if their single COVID-19 test was negative or, for those who underwent multiple tests, if all tests were negative. This study was carried out under UKB project number 17689.
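The case definition above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function name `classify_participant` and the representation of each test as a boolean (`True` for a positive RT-PCR result) are assumptions introduced here.

```python
def classify_participant(test_results):
    """Classify one participant from a list of test results.

    test_results: list of booleans, True meaning a positive test
    (hypothetical representation, not the UKB data format).
    Returns 'positive' if any test was positive, 'negative' if all
    tests were negative, and None if the participant was never tested.
    """
    if not test_results:
        return None  # no test result: outside the analysed sample
    return "positive" if any(test_results) else "negative"


if __name__ == "__main__":
    print(classify_participant([False, True, False]))
    print(classify_participant([False, False]))
```

Participants with no recorded test return `None`, mirroring the restriction of the analysis to those with a test result.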

Predictor variables
Psychosocial predictor variables were taken during the UKB baseline assessments in 22 centers in England, Wales and Scotland from 2006 to 2010. 13,15

Sociodemographic factors
Age and sex were self-reported, with sex categorized as male or female. Ethnicity and level of education were self-reported and subsequently derived as White British and non-White British, and with and without a college/university degree, respectively. Level of deprivation was measured using Townsend Deprivation Scores, a measure of district-level material deprivation in the UK comprising four variables: the percentage of people who do not own cars, live in overcrowded households, are unemployed and do not own homes. 16 Individuals were assigned scores corresponding to their area post codes and area deprivation index. Higher scores indicated more socioeconomic deprivation.

Psychological factors
Seven psychological factors were explored. Neuroticism was measured using the revised version of the Eysenck Personality Questionnaire. Scores were summed into a continuous scale, with higher values indicating more severe neuroticism. Depression, anxiety, loneliness, miserableness, irritability and history of psychiatric consultation were self-reported as 'Yes' or 'No'. [19]

Lifestyle factors
History of smoking was categorized as ever/never smoked, and history of alcohol intake as regular/non-regular drinker. Standing height was measured in centimeters using a Seca 202 device, while weight was measured in kilograms (kg). Body mass index (BMI) was then calculated by dividing weight by the square of height in meters (kg/m²) and classified as underweight (<18.5 kg/m²), normal weight (18.5-24.9 kg/m²), overweight (25-29.9 kg/m²) or obese (≥30 kg/m²) according to the World Health Organization (WHO) classification. 20
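As a worked illustration of the BMI derivation described above, the following Python sketch converts height from centimeters to meters, computes BMI and assigns the WHO category. The function names are hypothetical, and treating the category boundaries as half-open intervals (so that a value such as 24.95 kg/m² falls in the normal-weight group) is an assumption, since the text lists the ranges to one decimal place.

```python
def bmi(weight_kg, height_cm):
    """Body mass index in kg/m^2 from weight in kg and height in cm."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def bmi_category(value):
    """WHO BMI category; boundaries treated as half-open (an assumption)."""
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal weight"
    if value < 30.0:
        return "overweight"
    return "obese"


if __name__ == "__main__":
    example = bmi(70, 175)  # hypothetical participant
    print(round(example, 1), bmi_category(example))
```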

Other comorbidities and biomarkers
Comorbidities such as DM, cancer, heart disease and hypertension were self-reported and/or confirmed by the medications that the participants were taking during assessments, where such medications were available. Blood pressure (BP) was taken with an Omron 705 IT electronic BP monitor; two readings were taken and their average calculated. Where automated measurement was not possible, a manual sphygmomanometer was used. Hypertension was then defined as BP greater than or equal to 140/90 mmHg. 21 Lung function was expressed as forced expiratory volume in 1 second (FEV1). FEV1 is a measure of obstructive lung diseases such as chronic bronchitis/emphysema and asthma. 22 It was measured through spirometry using a Vitalograph Pneumotrac 6800 spirometer. Furthermore, the handling of blood samples and laboratory parameters such as C-reactive protein (CRP), cholesterol levels and adult glycosylated hemoglobin (HbA1c) are extensively described in a separate manual. 23
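The measurement-based part of the hypertension definition can be sketched as follows. This is a hedged illustration rather than the authors' procedure: the function name is hypothetical, and reading "140/90 mmHg" as mean systolic ≥140 or mean diastolic ≥90 is an assumption, since the text does not state how the two components are combined. Self-report and medication records, which the study also used, are not modelled here.

```python
from statistics import mean

SYSTOLIC_CUTOFF = 140   # mmHg
DIASTOLIC_CUTOFF = 90   # mmHg


def is_hypertensive(readings):
    """Flag hypertension from repeated BP readings.

    readings: list of (systolic, diastolic) tuples in mmHg.
    The readings are averaged first; the 'either component' rule
    below is an assumption, not a detail given in the text.
    """
    mean_systolic = mean(s for s, _ in readings)
    mean_diastolic = mean(d for _, d in readings)
    return mean_systolic >= SYSTOLIC_CUTOFF or mean_diastolic >= DIASTOLIC_CUTOFF


if __name__ == "__main__":
    print(is_hypertensive([(138, 88), (146, 86)]))  # hypothetical readings
```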

Statistical analysis
Statistical analysis was conducted using R statistical software version R-4.1.1. Continuous variables were described using mean/standard deviation (SD) for normally distributed data, and median/interquartile range (IQR) for skewed data. Categorical variables were described as frequencies/percentages. Different logistic regression models were used to explore the odds of COVID-19 infection. Model 1 involved univariate logistic regression. Model 2 was adjusted for baseline age and sex. Thereafter, multivariable analysis proceeded in two arms. For sociodemographic variables, model 3 was adjusted for psychological variables, while for psychological variables, model 3 was adjusted for sociodemographic variables. Models 4 and 5 involved adjustments for lifestyle factors and all covariates, respectively. The level of significance was set at P < 0.05, with 95% confidence intervals. Sensitivity analysis was conducted using both complete case analysis and the Hosmer-Lemeshow (HL) test of goodness of fit. Multicollinearity was tested using the variance inflation factor (VIF).
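For a single binary predictor, the odds ratio from a univariate logistic regression (Model 1) equals the cross-product ratio of the corresponding 2×2 table, and a Wald 95% confidence interval can be computed on the log scale. The Python sketch below illustrates this under stated assumptions: it is not the R code used in the study, the function name is hypothetical, and the counts in the usage example are made up for illustration and are not UKB data.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_with_ci(exposed_pos, exposed_neg, unexposed_pos, unexposed_neg,
                       level=0.95):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    All four cell counts must be greater than zero.
    Returns (odds_ratio, lower_bound, upper_bound).
    """
    odds_ratio = (exposed_pos * unexposed_neg) / (exposed_neg * unexposed_pos)
    # Standard error of log(OR): square root of summed reciprocal counts.
    se_log_or = sqrt(1 / exposed_pos + 1 / exposed_neg
                     + 1 / unexposed_pos + 1 / unexposed_neg)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    estimate, low, high = odds_ratio_with_ci(30, 70, 20, 80)
    print(f"OR {estimate:.2f} (95% CI {low:.2f}-{high:.2f})")
```

An interval that includes 1, as in this made-up example, corresponds to a non-significant association at the 0.05 level. The adjusted models (Models 2 to 5) require a full regression fit and are not reproduced by this shortcut.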

Results of the descriptive characteristics
Out of 502 490 UKB participants, 473 089 were alive on 16 March 2020. A total of 104 201 participants had COVID-19 test results for the study period, of whom 14 852 (14.25%) tested positive. The mean age of those who tested positive and negative was 52.86 (SD 8.41) and 57.53 (SD 7.96) years, respectively (Table 1).

Overall regression results
The whole-sample analysis revealed that, on univariate analysis, all sociodemographic variables examined were significantly associated with increased odds of infection. Among the psychological variables, higher baseline neuroticism, miserableness and irritability were associated with increased odds, while history of psychiatric consultation was associated with reduced odds (Supplementary Table 1). On multivariable regression, absence of a college/university degree and Townsend score remained significantly associated with increased odds, with odds ratios (ORs) of 1.56 (1.48-1.64, P < 0.0001) and 1.14 (1.10-1.17, P < 0.0001), respectively, while loneliness and history of psychiatric consultation were associated with reduced odds, with ORs of 0.90 (0.84-0.96, P = 0.0019) and 0.85 (0.78-0.92, P < 0.0001), respectively (Supplementary Table 2).
The overall multivariable analysis, however, showed evidence of poor fit, with P-values of 0.067, 0.057, 0.0063 and 0.0008 for HL groups 8, 10, 12 and 20, respectively. There were also significant interactions between sex and several predictor variables, with COVID-19 as the outcome: education degree (P = 0.0138), ethnicity (P = 0.0006), depression (P = 0.0110), neuroticism (P = 0.0060), miserableness (P = 0.0051) and irritability (P < 0.0001). Therefore, sub-analysis according to sex was conducted and the regression results are presented as female-only and male-only results.

Female-only regression results
Regarding sociodemographic variables, being without a degree and having a higher Townsend score were independently associated with increased odds of COVID-19 infection, OR 1.55 [95% confidence interval (CI) 1.45-1.66, P < 0.0001] and OR 1.16 (95% CI 1.11-1.21, P < 0.0001), respectively. Regarding the psychological factors, only history of psychiatric consultation was independently associated with reduced odds, OR 0.85 (95% CI 0.77-0.94, P = 0.0014) (Table 2). Descriptive characteristics and univariate analysis for the females are shown in Supplementary Tables 3 and 4.

Sensitivity analysis
Complete case analysis did not yield any statistically different results (Supplementary Tables 7 and 8). The HL test of goodness of fit showed that the regression models used were of good fit, with statistically non-significant P-values (Supplementary Table 9), and there was no multicollinearity, with all VIFs being <5 (Supplementary Table 10).

Discussion
Main findings of this study
This study showed that absence of a college/university degree and socioeconomic deprivation were independently associated with higher odds of COVID-19 test positivity, and this did not substantively vary by sex. Psychological variables, on the other hand, were associated with lower odds of COVID-19 test positivity, differentially according to sex. Notably, history of psychiatric consultation was associated with lower odds equally among female and male participants, whereas loneliness and irritability were additionally associated with lower odds among male participants only.

What is already known on this topic
Batty et al. 29 and Lee et al. 27 did not find any significant association between history of psychiatric consultation and COVID-19 infection. Taquet et al. 28 and Wang et al. 30 reported an increased risk, while Rozenfeld et al. 24 reported that a diagnosis of a serious persistent mental condition reduced the odds of COVID-19 infection by 23%, OR 0.77 (95% CI 0.65-0.92).

Sociodemographic factors
The finding that socioeconomic deprivation increased the odds of COVID-19 infection by ∼14% was congruent with other published studies. 5,24,33 Absence of a college/university degree was associated with an increase of >50% in the odds of COVID-19 infection. Notably, the odds found in this study were much higher than those reported in the previous literature. This could be attributed to the difference in sample sizes. Furthermore, available evidence shows that lower education and literacy are associated with low health literacy and poorer health status. 34,35 This may result in failure to understand health-related information 36 and negatively influence health-seeking behavior. 35 Notably, a lot of information regarding safety measures against COVID-19, such as social distancing and wearing of face masks, was released and evolved during the outbreak. This could make it difficult to keep up with such information, especially among people with a low level of education, hence increasing the risk of infection. A COVID-19 Snapshot Monitoring Study in Germany showed that people with a low level of education adhered substantially less to social distancing than those with a higher level of education. 37
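The percentage statements in this discussion follow from the odds ratios by simple arithmetic: the percentage change in odds is (OR − 1) × 100. A minimal Python sketch, using a hypothetical helper name and the ORs reported in this paper (1.14 for Townsend score and 1.56 for absence of a degree in the whole sample, and 0.77 from Rozenfeld et al.):

```python
def percent_change_in_odds(odds_ratio):
    """Percentage change in odds implied by an odds ratio.

    Positive values mean higher odds, negative values lower odds.
    """
    return (odds_ratio - 1.0) * 100.0


if __name__ == "__main__":
    for label, value in [("Townsend score", 1.14),
                         ("No college/university degree", 1.56),
                         ("Rozenfeld et al.", 0.77)]:
        print(f"{label}: {percent_change_in_odds(value):+.0f}% odds")
```

Note that this is a change in odds, not in risk; the two are close only when the outcome is uncommon.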

Psychological factors
History of psychiatric consultation, which was used as a proxy for a history of mental health conditions, was independently associated with reduced odds of COVID-19 infection equally among male and female participants.
It is possible that people with a history of psychiatric consultation did not freely associate with others, which protected them from contracting COVID-19.
With the current mixed evidence, further studies are still needed in this area. This, to the best of our knowledge, was the first study to explore the effect of loneliness in predicting COVID-19 infection. Noting that COVID-19 is mainly spread by close contact with people who are infected, 48,49 it is possible that people with loneliness withdrew from others, hence avoiding social contacts and resulting in the protective effect observed in this study.
Finally, this was also the first study to explore irritability and COVID-19 infection, showing a protective association.Previous literature suggests that social support can improve psychological resilience 50 and reduce the negative impacts of anger and irritability through positive anger disposition and coping mechanisms, 51 hence positively influencing health.Although social support was not explored in this study, it could be possible that social support modified the effect of irritability in this study, hence the protective findings, though further studies are still needed.

Sex differences
Although effect sizes of most psychological variables examined were higher among women than men in this study, more psychological factors were significantly protective among men than women. This was an interesting finding, noting that the risk of COVID-19 infection and severe outcome had been reported to be significantly higher among men than women in other studies. 3,5,25

Strengths and limitations
This study had the following strengths. First, to the best of our knowledge, it was the first study to examine the role of psychosocial factors in predicting the odds of COVID-19 infection in the general population, making it applicable as evidence on how psychosocial factors influenced COVID-19 infection during the first year of the outbreak. Second, the large sample size made it more generalizable. Third, utilizing data from the UKB prospective cohort study minimized the likelihood of reverse causality. Fourth, COVID-19 test results were obtained from linkage of the UKB data with the NHS. In addition, the samples were tested using COVID-19 RT-PCR. This meant that the outcome variable in this study was accurately confirmed, hence reliable. Although false positive cases could still be possible, this was not a major concern since RT-PCR is the diagnostic test for COVID-19 recommended by the WHO. 55

This study, however, had the following limitations. First, participants were mainly volunteers who tend to be healthier than the general population, which could lead to selection bias. Selection bias could also be induced by including only participants with a test result in this study. Second, participants represented only 5.5% of the invited population, with significant under-representation of non-White British ethnicity in particular, which affects generalizability. It is possible that there are differences in risk factor/COVID-19 associations in the non-White British sample, which this study was not statistically powered to test for. Third, baseline characteristics of the participants at the UKB were taken >10 years ago. Previous longitudinal data in UKB over a mean of 5 years suggest reasonable stability in individual differences such as cognitive health. 56 Noting that baseline characteristics such as age, socioeconomic status and education change over time, it is possible that some participants were not correctly classified in their respective categories, hence misclassification bias. Fourth, COVID-19 vaccination status of the participants was not controlled for in this study. Noting that the UK was among the first countries to roll out COVID-19 vaccination, failure to control for it could lead to bias. Finally, using COVID-19 test results meant that those who had undergone the test but were still awaiting the results, and those who died of COVID-19 between January 2020, when the first case of COVID-19 was reported in the UK, and 16 March 2020, when COVID-19 test results started being released to the UKB, were excluded from the analysis, which could underestimate the effect sizes.

Conclusion
Sociodemographic factors significantly predicted COVID-19 infection in the UKB. These findings are important for the development of primary prevention strategies. However, because these findings are new, especially with regard to the effects of the psychological factors, replication studies are needed before they are translated into policy.

Index Advisory Group in Scotland and Patient Information Advisory Group in England and Wales. Participants provided full informed consent to participate and for publication of the research findings. This study was also covered by the generic ethical approval for UKB studies from the NHS National Research Ethics Service (approval letter dated 17 June 2011, Ref 11/NW/0382). 57

Table 1
Whole-sample baseline characteristics
a In mg/L. b In mmol/mol. c In mg/L.

Table 3
Multivariable logistic regression outcomes of the males
Townsend score, Smoking Status, Regular Alcohol Intake, BMI Categories, Diabetes mellitus, Cancer, Bronchitis/Emphysema, Heart Disease, Asthma, Hypertension, CRP, HbA1c and Cholesterol.
e Model 3P: Model 2 + Non-white British Ethnicity, College/University Degree and Townsend score.