Host–microbiome associations in saliva predict COVID-19 severity

Abstract Established evidence indicates that oral microbiota plays a crucial role in modulating host immune responses to viral infection. Following severe acute respiratory syndrome coronavirus 2, there are coordinated microbiome and inflammatory responses within the mucosal and systemic compartments that are unknown. The specific roles the oral microbiota and inflammatory cytokines play in the pathogenesis of coronavirus disease 2019 (COVID-19) are yet to be explored. Here, we evaluated the relationships between the salivary microbiome and host parameters in different groups of COVID-19 severity based on their oxygen requirement. Saliva and blood samples (n = 80) were collected from COVID-19 and from noninfected individuals. We characterized the oral microbiomes using 16S ribosomal RNA gene sequencing and evaluated saliva and serum cytokines and chemokines using multiplex analysis. Alpha diversity of the salivary microbial community was negatively associated with COVID-19 severity, while diversity increased with health. Integrated cytokine evaluations of saliva and serum showed that the oral host response was distinct from the systemic response. The hierarchical classification of COVID-19 status and respiratory severity using multiple modalities separately (i.e. microbiome, salivary cytokines, and systemic cytokines) and simultaneously (i.e. multimodal perturbation analyses) revealed that the microbiome perturbation analysis was the most informative for predicting COVID-19 status and severity, followed by the multimodal. Our findings suggest that oral microbiome and salivary cytokines may be predictive of COVID-19 status and severity, whereas atypical local mucosal immune suppression and systemic hyperinflammation provide new cues to understand the pathogenesis in immunologically compromised populations.


Significance Statement
The oral mucosa is one of the first sites encountered by bacterial and viral infections, including severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).It consists of a primary barrier occupied by a commensal oral microbiome.The primary function of this barrier is to modulate immunity and provide protection against invading infection.The occupying commensal microbiome is an essential component that influences the immune system's function and homeostasis.The present study showed that the host oral immune response performs unique functions in response to SARS-CoV-2 when compared to systemic responses during the acute phase.We also demonstrated that there is a link between oral microbiome diversity, salivary cytokines and clinical severity.Additionally, the salivary microbiome was predictive of not only disease status but also host response levels.

Introduction
The oral mucosa is one of the first sites where infections are encountered and is one of the primary barriers maintaining a low-level inflammatory response to the resident commensal microbiome (1).This barrier effectively regulates immunity and protects the body against the breakdown of defense mechanisms leading to the development of oral or systemic diseases (1).This defense mechanism involves complex interactions between the immune system, the microbiome, and the physical barriers in the oral cavity, such as saliva and mucosal tissue (2).A balanced healthy microbiome is an essential component of the host and has often been considered an organ and protective component of the human body (3), which is involved in maintaining homeostasis, epithelial cell proliferation and differentiation, and in the regulation of immune response (4).Well-established evidence showed that the immune defense of the mucosal barrier is built independently of commensal colonization and that disruption of homeostasis between the oral microorganisms leads to certain diseases (1).The interplay between the microbiome and the host forms a complex ecosystem that serves as both a metabolic and physical barrier to outcompete external microorganisms (3), such as viral invaders.
The impact of the mucosal microbiome during coronavirus disease 2019 (COVID- 19) is an emerging area with ongoing research.Recent evidence suggests that the severity of COVID-19 was related to microbiome changes, as SARS-CoV-2 replicates in cells expressing angiotensin-converting enzyme 2 (ACE2) and transmembrane protease serines 2 receptors, both of which are found in the oral and nasal mucosa (4).These ongoing investigations are starting to reveal the relationship between the oral microbiome and disease pathogenesis.The decrease in oral microbial diversity was correlated to COVID-19 patients in intensive care unit (ICU) (5), while others found that no changes in the nasal microbiome in mildly symptomatic COVID-19-positive individuals compared to controls (6).Although evidence is emerging, there are no molecular cues with predictive potential to understand symptoms from severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and risk to severity.While systemic immune responses can be monitored through blood sampling, it does not depict the mucosal responses from a host-microbial perspective (7).Emerging new immune response evidence (present in fluids such as saliva and plasma) has started to demonstrate changes in saliva from subjects that experienced and resolved from SARS-CoV-2 infection (8)(9)(10)(11)(12), yet cytokine responses are sparse (8,9,11,12).It is consequently important to document molecular signatures in COVID-19 patients at the mucosal level to better define normal and pathogenic convalescence processes and to detect the possible initiation of aberrant innate immune activation, especially in fluids that are in direct contact with SARS-CoV-2.Although systemic immune responses can be monitored through blood sampling, understanding oral immune competence is complex (13).
Here, in a case-control study, we measured and compared SARS-CoV-2 oral mucosal responses in matched saliva and plasma samples and investigated how the host-microbiome in saliva can predict COVID-19 status and severity.We characterized host functional cytokines impaired by SARS-CoV-2 infection by comparing responses in acute COVID-19 subjects to healthy controls and by comparing responses in saliva to plasma.This case-control study aimed to evaluate whether host-microbiome in saliva can predict COVID-19 status in addition to severity.Overall, our findings highlight saliva as a stable and accessible biofluid to monitor SARS-CoV-2 host impact, including microbiome and cytokines associated with of disease severity.Saliva has the potential to reveal fundamental biological mechanisms of health and disease.

Results
Dysregulated systemic immune responses to SARS-CoV-2 infection impact COVID-19 disease severity, mortality, and the development of chronic diseases.Both hyperinflammatory systemic r and immune suppression responses have been found as viral evasion strategy.As body fluids are exposed to different antigens, we investigated how the body responds to SARS-CoV-2 in saliva compared to those in matched serum.Paired saliva and serum samples from COVID-19 donors in the acute phase were collected and subjected to comparative analyses among demographic factors (age, gender, and initial COVID-19 disease severity) (Table 1).
A total of 80 individuals were included in the study, consisting of 50 SARS-CoV-2 positive individuals confirmed by RT-PCR from nasopharyngeal swabs and 30 noninfected individuals.The severity of the disease was categorized as mild (hospitalized without oxygen therapy, n = 11), moderate (hospitalized with lowflow oxygen <10 L/min, n = 28), and severe (hospitalized with high-flow oxygen >10 L/min, n = 11).The analysis combined the mild and moderate groups into a "nonsevere" category.Of the participants, 55% were males and 45% were females.Among the SARS-CoV-2 positive group, 56% were older than 50, and 64% were older than 50, among the severe cases.Diabetes was present in 44% of the SARS-CoV-2 positive group, while only 9% of the subjects with severe symptoms had diabetes.Hypertension was present in 44% of SARS-CoV-2-positive individuals and 36% of those with severe symptoms.Obesity was observed in 37% of SARS-CoV-2-positive individuals and 40% of those with severe symptoms.All subjects with severe symptoms had a chronic cough.
Respiratory distress was reported by 82% of the SARS-CoV-2 positive group, while shortness of breath was reported by 82% of the severe cases at the time of data collection.Additionally, 45% of subjects with severe symptoms experienced persistent chest pain, 36% had sputum production, and 55% had a loss of taste and smell.Only 18% of subjects with severe symptoms had fever.
The demographic variables are presented in Table 1.

Microbial composition stratified samples by COVID-19 diagnosis better than cytokines
We evaluated the microbiome composition detected in saliva to elucidate the effect of disease on the microbiome.First, we conducted a principal coordinates analysis (PCoA) on Jaccard dissimilarity, which showed distinct clusters between control, mild/ moderate cases, and severe cases along the first PC-axis, indicating that COVID-19 affects the microbiome composition during the acute phase of infection (P < 0.05, ADONIS: Permutational Multivariate Analysis of Variance Using Distance Matrices on Jaccard distances with false discovery rate (FDR) correction, Fig. 1A).A similar pattern was not seen in either salivary or blood cytokines (P > 0.05, ADONIS, Fig. S1).To understand what drives this microbial-enhanced separation between the different conditions, we examined the alpha diversity.The control group had statistically higher diversity compared to the mild/moderate group, which in turn had statistically higher diversity than the severe group (P < 0.05, observed features after FDR correction, Figs.1B and S2).The abundance-aware Shannon diversity demonstrated that the relationship between the mild/moderate and control groups was not statistically significant (P > 0.05, Fig. 1C); therefore, the difference in observed amplicon sequence variant (ASV) was driven by rarer taxa.We next conducted a PCoA on Jaccard dissimilarity, which showed distinct clusters between control, mild/moderate cases, and severe cases along the first PC-axis, indicating that, indeed, COVID-19 affects the microbiome composition during the acute phase of infection (P < 0.05, ADONIS on Jaccard distances with FDR correction, Fig. 1A).To examine this effect further, we broke Jaccard dissimilarity into its two components; replacement of taxa with another (turnover) and whether one sample is a subset of another (nestedness) (15).Nestedness was statistically significant between control and mild/moderate cases, but not turnover (P = 0.012 and 0.299) respectively, ANOSIM (A non-parametric test based on the rank distances among sample units after FDR correction).On the other hand, the difference between mild/moderate and severe cases was due to turnover and not nestedness (P = 0.003 and 0.365).Composition-aware quantitative analysis of the microbiome also showed a similar pattern of clustering between control, mild/moderate, and severe cases (P < 0.05, ADONIS of PhILR distances after FDR correction,   1D).In addition, alpha rarefaction curves demonstrated that sufficient sample sequencing was achieved therefore the effect is not due to insufficient sequencing depth (Fig. S3).
To gain insight into the capability of the oral variables in distinguishing COVID-19 status, we used unsupervised clustering (Ward hierarchical clustering) on the samples using their salivary microbiota and the salivary cytokines.The results revealed that the salivary microbiota clustered most of the COVID-19-positive patients along the first tree bifurcation, while the salivary cytokines were not able to show the same distinction (Fig. S4).This finding was reflected in the predictive accuracy of each variable, where the microbiota outperformed the cytokines (Fig. S4).

Blood and saliva cytokines reflect different host responses to COVID-19
We profiled saliva and serum samples to characterize oral mucosal and systemic responses following SARS-CoV-2 infection more comprehensively.Out of a 65-cytokine assay, 15 salivary and 45 blood cytokines were significantly different between SARS-CoV-2 positive and negative patients (P < 0.05, Kruskal-Wallis test, Fig. 2).Only nine of these cytokines were shared between the two profiles Furthermore, the salivary hepatocyte growth factor (HGF) and fibroblast growth factor 2 were statistically significantly lower in COVID-19-positive patients compared to the controls (<0.05) but not in blood.Additionally, salivary tumor necrosis factor beta (TNF-beta), interleukin 10 (IL10), monocyte chemotactic protein-2/C-C motif chemokine ligand 8 (MCP2/CCL8), epithelial cell-derived neutrophil-activating peptide 78/C-X-C motif chemokine ligand 5 (ENA78/CXCL5), CD40 ligand (CD40L), interleukin 2 receptor, interleukin 12 (IL12p70), and HGF were found to be expressed less in the mild/ moderate condition compared to either severe or control (Figs.S5 and S6), possibly indicating an inverse relationship where the mucosal cytokine expression is suppressed.We then determined the influence of the immune subclusters in relation to each fluid by investigating correlations.To confirm this, we examined the relationship between blood and salivary cytokines in each condition, which confirmed that while the relationship in non-SARS-CoV-2 infected individuals was positive, the inverse relationship between salivary and blood cytokines is present in severe cases (Fig. 2C-F).
Our results indicate that measuring only serum levels during the COVID-19 convalescent phase does not provide a full picture of the host response mounted after SARS-CoV-2 infection.Indeed, while we confirmed that our acute COVID-19 subjects had mounted a systemic response in blood, our salivary analysis revealed novel immune signatures with downexpression in proinflammatory cytokines for severe subjects and not for the other disease and healthy control groups.With altered microbiome patterns, severe subjects showed clear separation between the healthy controls separating the groups just via the microbiome.Overall, we demonstrated that population-based investigations of saliva can be used to map global host-microbial responses to local mucosa and begin to integrate to systemic functions.

Microbial perturbation and cytokines serve as a robust classifier for COVID-19
We next investigated the feature selection algorithm (Clairvoyance ( 16)) in a hierarchical manner (Fig. 3A) to better understand the strength of the signal that the microbial perturbation and cytokine fluctuations are serving as a predictive/diagnostic tool for SARS-CoV-2.We examined the strength of the potential sources of salivary and blood biomarkers and their capability in four Fig. 2. Heatmaps of salivary and blood cytokines showing statistical significance and positive and negative correlations.A and B) Groups for control (black/left group), mild/moderate (pink/middle group), and severe (red/right group) show statistical significance by Kruskal-Wallis for blood (A) and saliva (B) cytokines.C-F) Correlogram between blood (rows) and salivary (columns) cytokines for controls (C), all COVID patients (D), mild/moderate category (E), and severe category (F).Significant correlations areas were colored postively (blue) and negatively (red) both with high contrast after the Spearman correlation Rho.Any correlation did not show statistical significance (P > 0.05) was removed and replaced with a white square (low contrast).See Table S4 for the full list of cytokines.different manners: salivary microbiome only, blood cytokine only, salivary cytokine only, and multimodal sample-specific perturbation networks (SSPN) including the three parameters stratified by COVID-19 status and severity.Feature selection analysis was performed on (y1) uninfected vs. infected (n = 80 samples); and (y2) mild/moderate vs. severe (n = 50 samples).Our feature selection ended up with 42 features in y1 and 39 features in y2 which are fewer than the number of samples used for modeling and, thus, not subject to bias due arising from the curse of dimensionality (17).The feature selection results are shown in Fig. 3B and  C. We demonstrated that the microbiome-only model had the highest classification capabilities, followed by the multimodal SSPN (microbiome: 99.4 and 100%; multimodal SSPN: 99.5 and 91.2% for submodal y1 and y2, respectively) compared to the salivary cytokine and blood cytokine alone models.The potential markers for the microbiome did not overlap between the two submodels, with y1 being represented by 10 ASVs, and y2 being represented by 6 ASVs (Table S2).Interestingly, most nodes in the SSPN were shared across the two submodels, inferring that the interactions (edges) between these nodes are drivers to the distinction between the two submodels.None of the microbial biomarkers were shared between submodals y1 (n = 10 ASVs) and y2 (n = 6 ASVs) while all the other modalities shared features between submodals y1 and y2 (Tables S2 and S3).Central to these interactions is the systemic cytokine CXCL10/induced protein 10 (IP-10).
A major advantage of analyzing edge perturbations using feature selection algorithms is the ability to reconstruct edges into a network providing an opportunity to reap the benefits of graph theory.In this context, we reconstructed our aggregate networks (AN) using the multiomic biomarkers and their fitted model weights (i.e.logistic regression coefficients) as edge weights with submodels y1 and y2 representing AN y1 and AN y2 , respectively (Fig. 3).As the edge weights represent model coefficients, they can be directly interpreted as their predictive capacity in classifying stepwise severity with positive coefficients translating to a higher likelihood of a sample to be classified as more severe and vice versa for negative coefficients.
Comparing overlap in network context can be performed at the node and edge level.For AN y1 and AN y2 , there was a higher proportion of nodes shared (57.6%) compared to shared edges (20.9%) between both submodels, indicating a common set of biomarkers useful for diagnosing COVID-19 phenotypes but the way those biomarkers interact change with severity.
In both AN y1 and AN y2 , the most notable node is the systemic cytokine CXCL10/IP-10 which has disproportionately high connectivity relative to the other nodes and is highly centralized (Figs. 4 and S7).AN y1 and AN y2 have similar network structures, but AN y1 node degree distribution is less homogenous than AN y2 with homogeneity indices of 80.7 and 88.3%, respectively.In other words, the predictive capacity of systemic cytokine CXCL10/IP-10 is disproportionately higher than the other nodes in classifying COVID-19 diagnosis (submodel y1) relative to COVID-19 severity (submodel y2).In submodel y2, the classification of COVID-19 severity is more evenly distributed across the other nodes (Fig. S7).Modality-specific biomarker overlap between submodels of B) abundance-based paradigms and C) inferred-interaction paradigms.In the inferred interaction paradigm, features are edges in the SSPN which can be decomposed into nodes which is not the case for the abundance-based paradigms in (B).Clairvoyance feature selection results for D) submodel y1 and E) submodel y2 color coordinated by modality.Each marker on the scatter plot represents a unique hyperparameter/feature set combination that yields a specific accuracy using 10-fold cross-validation repeated with 10 different random states.

Discussion
Increasing evidence indicates that the immune system of COVID-19 subjects may be compromised when compared to healthy controls (18).Most investigations have focused on systemic body compartments such as blood and lung, and emerging investigations are dissecting the nasal mucosal responses.We demonstrated that inflammatory response to viral infection is regulated by microbiome changes locally and hypothesize that systemic responses are not predictive of systemic changes.The present analysis demonstrated that the salivary microbial community is highly susceptible to perturbation in the presence of Fig. 4. Aggregate network representations of fitted submodels showcasing predictive capacity-aggregate network for submodel (A) y1 and (B) y2.The edge weights can be interpreted as predictive capacity for COVID severity.For y1, positive values indicate that an increase in perturbation results in an increased likelihood that a sample is classified as y2 (i.e.infected) relative to uninfected.For y2, positive values indicate that an increase in perturbation results in an increased likelihood that a sample is classified as severe relative to mild/moderate.Node size is proportional to weighted degree as a measurement of network connectivity and, by extension, biomarker importance.C) ASVs in the SSPN as proportions of presence in the various conditions, demonstrating that the majority of the ASVs are due to differences in abundances of common taxa across all conditions, and not rare species.SARS-CoV-2, leading to a distinct separation of the salivary microbiome between healthy and COVID-19-infected individuals.We also showed that the salivary microbiome was distinct between individuals with severe and mild/moderate symptoms.Our findings also demonstrate that abnormal inflammatory responses can be identified in both saliva of COVID-19 subjects, which was shown to be severity specific.This indicates that even when SARS-CoV-2 systemic responses are mounted, the local immune response does not necessarily define disease resolution.Here, we highlight saliva as an important and accessible fluid that can be monitored to identify not just antibody responses but also diverse host-microbial pathways, including mucosal immunity, and innate immune responses.
The variation of the microbial membership detected in individuals with severe and mild/moderate symptoms can be attributed to the replacement of taxa with other taxa (turnover).However, the separation between groups (health vs. disease) was mainly attributed to different phenomena; we found a decrease in the number of salivary bacterial members in the mild/moderate cases compared to the control cases, with the taxa of mild/moderate being a subset of the control.This decrease was not seen when adjusting for the overall abundance of the microbiome (Shannon diversity).Therefore, our findings suggest that the transition from control to mild/moderate disease status preserves most of the microbial membership and is mainly driven by members with higher abundance; that is, the loss of microbial membership in the mild/moderate cases occurred in less abundant taxa.This demonstrates that the salivary microbiome is generally welladapted to perturbations, where these bacteria can maintain the overall community structure despite the change in the mucosal inflammatory environment during the acute phase of infection.Conversely, the transition from mild/moderate to severe disease corresponds to a larger environmental stress, causing a further collapse in microbial richness, thus allowing for the replacement of some taxa with others (turnover).However, when accounting for the overall collapse in membership and richness, this replacement of taxa was not enough to reverse the trend of reduced alpha diversity in severe cases.
Carefully designed immune studies aimed at implementing cytokine testing by investigations of blood-derived fluids but not saliva.Although it is well-known that mucosal and systemic immunity responds differently to pathogens (19), this study adds to the evidence that each immunity system responds in a distinct manner to SARS-CoV-2 locally.Our findings showed that the oral response was distinct from the systemic response.The present study demonstrated that systemic cytokines showed higher levels in mild/moderate cases compared to control cases, which is expected.However, salivary cytokine levels were lower in mild/moderate cases compared to control cases.Even though the wealth of evidence agrees with the fact that cytokine production increases during the disease's status, it was suggested that reducing cytokine production can be a strategy to manage cytokine storms and other inflammatory reactions (19).Thus, oral mucosal surfaces might lower cytokines in certain situations to balance between their two opposing roles: fighting pathogens and immune surveillance.It is worth noting that while saliva and serum samples were collected at the same time point from individuals within 48 hours of qualitative PCR-confirmed nasal pharyngeal COVID-19 diagnosis, delays in processing samples can lead to changes on the immune content.A study showed, that sample processing delay of plasma or serum caused perturbations of numerous cytokines (range −10.8 to 43.5%), frequencies of peripheral blood cytokines and gene expression (10).In our study, while there is variability on the recruitment of subjects, we kept an uniform collection protocol of saliva and plasma with fast processing of samples prior to freezing and storage.When normalizing cytokine concentration, we did not detect major variance across the subjects within the groups.
Data collection was strategically conducted during standard working hours, from 8 AM to 2 PM, to minimize variations in circadian rhythms.The timing of our data collection was chosen to reduce variability and to specifically exclude night-time sampling, thereby enhancing the reliability of our findings.Both saliva and blood samples were collected at the same time, thereby capturing both local and systemic responses.Current evidence does not conclusively support the idea that circadian rhythms differentially affect cytokines in saliva compared to blood.Most research in this area has focused on blood samples and specific cytokines, such as TNF and IL-6, with less attention paid to saliva (20).
We further demonstrated that salivary IL8 responses to SARS-COV-2 could be involved in neutrophil interactions at the oral mucosal surface.Significant correlation between salivary IL8 with blood markers, such as IL23, IL2, IL15, GROalphaCXCL1, IL12p70, IL6, TNFalpha, IL18, SDF1alpha, IL13, and IFNalpha demonstrate the tight coupling required for immune surveillance where a variety of cytokines are expressed for immune-cellular recruitment and activation of innate immune cells.This relationship is not maintained during COVID-19 infections.Additionally, we showed that the oral mucosal response was not linear; for example, salivary interferon gamma-IP-10 demonstrates lower expression in mild/moderate conditions compared to healthy and severe cases.The importance of IP10, a proinflammatory chemokine for T-lymphocytes, monocytes, and NK cells, has been shown in other viral infections such as HIV and Hepatitis B (HBV) infections.For example, intra-organ (liver) IP10 mRNA expression and the IP10 blood levels are concurrent in patients with chronic HBV infections (21).While HIV disease progression is positively correlated with levels of IP10 in circulation (22).The positive relationship between infection level and IP10 was not found in our study, thus indicating a different response path to infection.Understanding the effects of these cytokines helps us understand the host response to the infection.IL-10 and CD40L are responsible for IgA class switching of immunoglobulins (23,24).Salivary IL12 was also reduced in mild/moderate cases compared to the severe group.IL12, a critical cytokine for the activation of dendritic cells, B-cells and macrophages, stimulates the differentiation of T cells into Th1 cells.IL-12 also promotes the production of interferongamma (IFN-γ).
Taken together, the salivary cytokine profile of mild/moderate individuals supports the notion that the infection has not switched to a large-scale adaptive immune response to the virus.Further, it was suggested that the innate immune system in the oral cavity acts as the first line of defense against invading microorganisms (25), and several components of the innate immune system are involved in oral immunity, including saliva, oral epithelial cells, and other oral immune cells (25).Saliva acts as a component of innate immunity with antimicrobial activity (1), and oral epithelial cells are actively implicated in immune regulation, promote defensive immune responses, and create a constitutive tolerogenic environment that maintains immune homeostasis (26).
The relationship between the oral cavity and systemic conditions is often neglected, despite ample evidence of its connection to a multitude of diseases in distant organs, such as pneumonia (27), low birth weight (28), obesity (29), diabetes (30), and coronary disease (31).The pharynx structurally bridges the oral and nasal cavities, both of which express the ACE2/TMPRRS2 receptors that play a fundamental role in the SARS-CoV-2's ability to infect these niches.In addition, it also contains pharyngeal tonsil tissues, an instrumental induction site for oral mucosal IgA response to respiratory viral infection.The salivary gland also plays a major role as an effector site to produce T-cell-dependent antigen-specific oral IgA response.ACE2 receptors on highly differentiated epithelial cells are also strongly associated with cytokine levels in these tissues (32,33).Furthermore, mouse models have demonstrated that the resident microbiota on mucosal surfaces can protect against rotavirus infections (34), and it has been suggested to impact SARS-CoV-2 infections (35).The state of the mucosal tissues, such as the oral cavity, by way of perturbation of the resident microbiome and local cytokine profile, reflects the status of the host and can be used to predict and monitor the overall response.These perturbations should be further understood to gain insight into the host response of the mucosal surfaces of the upper respiratory tract.
While periodontal disease is known to influence levels of salivary biomarkers (36), conducting a comprehensive periodontal examination during the COVID-19 pandemic posed significant challenges.To circumvent this, our study employed a proxy questionnaire designed to assess indicators of periodontal health, including bone loss, gum inflammation, and the presence of loose teeth.This approach enabled us to establish a binary variable, "periodontal status," which classified subjects based on their selfreported oral health conditions.However, upon conducting a chi-square analysis, as detailed in Tables S2 and S3, we found no significant differences in periodontal status between subjects infected with COVID-19 and those who were uninfected (P-value >0.05).Additionally, our analysis revealed no statistically significant variation in periodontal status across different COVID-19 severity groups.This finding suggests that, within the limitations of our study, periodontal disease as assessed by our questionnaire method may not be a differentiating factor in COVID-19 infection or its severity (36).
Future studies would benefit from requiring convalescent COVID-19 subjects to report possible chronic symptoms longitudinally.The use of saliva for health monitoring has received significant attention due to its ease and simplicity of use.This study sought to explore this concept in relation to the extreme case of an exogenous infection that has recently jumped into immunologically naive humans, making it the ideal scenario for such examinations.This is particularly relevant given the evidence that depending on cycle threshold (CT) values produced by RT-PCR reflect the host's capability to clear the virus and not the infectiousness potential of the shedding particles themselves (35,37,38).By closely studying this unique case, the potential of saliva monitoring as a diagnostic tool can be further understood and improved upon.
In summary, our study has found the salivary microbiome to be a promising source to predict COVID-19 status and severity and that the oral microbiome plays an important role in the immune response by stimulating or suppressing the immune system.Our study supports the notion that the salivary microbiome could be used as a potential diagnostic tool for COVID-19 and predict severity (39).While these studies are compelling, more research is needed to determine the clinical utility of using the salivary microbiome as a diagnostic tool for COVID-19.However, the potential for a noninvasive, rapid, and convenient diagnostic test using salivary microbiome analysis is exciting and warrants further investigation.

Limitations
This study had a small sample size and may limit our ability to detect the significant distinction between different groups of symptom severity.Our COVID-19 cohort included only symptomatic subjects with mild, moderate, and severe symptoms.Asymptomatic individuals with COVID-19 may exhibit different microbiome characteristics and might be included unintentionally in the control uninfected group.In addition, the salivary microbiome can be influenced by periodontal diseases that we did not control for in the present study.Our study had limitations and future studies are needed to continue this work.While our samples were collected longitudinally, this analysis was performed for the first time point to depict the acute phase.Microbiome and inflammatory responses were not adjusted by the different baselines of each individual intervisibility.While study subjects were able to report how many days, they manifested acute COVID-19 illness (asymptomatic, mild, or moderate/severe), clinical recovery was complicated by the use of national protocols and medications such as antibiotics and immunosuppressives.
It should be noted that the microbiome results may reflect the population's ethnicity and country of residence (40), thereby are not generalizable to all populations.The feature selection was performed on (y1) uninfected vs. infected (80 samples); and (y2) mild/moderate vs. severe (50 samples).Our feature selection ended up with 42 features in y1 and 39 features in y2 which are fewer than the number of samples used for modeling.To further explore this, a multimodal analysis was conducted to identify biomarkers that could distinguish between health/disease and disease severity, with the expectation that the host response to an external infection, as in COVID-19, should be mostly conserved across different populations, with differences related to quantity, rather than activation of completely different pathways.While this study offers preliminary biomarker results, it is important to consider that the salivary microbiome is only one of many factors that could influence the course of COVID-19 and further research, such as longitudinal studies, is needed to fully understand the impact of other factors such as age, underlying health conditions, and genetic susceptibility (41).

Materials and methods
This study was a collaborative joint study between the Harvard School of Dental Medicine (HSDM), J Craig Venter Institute (JCVI), the Ministry of Health in Kuwait, and the University of Alberta.The study was approved by JCVI, Harvard, the Kuwait Ministry of Health, and the University of Alberta.Informed consent was obtained from all enrolled participants (Kuwait Ministry of health ethics approval: #2020/1462 Harvard: IRB21-1492, University of Alberta: Pro00125245, JVCI: exempt due to secondary analysis of de-identified samples).

Study design
A convenient sample strategy was used to recruit patients admitted at multiple Covid-19 centers in Kuwait between 2020 July 24 and September 4. The data collection took place in multiple hospital sites at three hospitals in Kuwait: AlFarwaniyah Hospital, Jaber Al Ahmed Hospital, and Kuwait Field Hospital.Data were collected from those who provided positive consent and who tested positive for SARS-CoV-2 by RT-PCR from nasopharyngeal swabs (n = 50).Noninfected individuals were employees at these hospitals who were not in contact with any COVID-19 cases (n = 30).The hospital subjected them to a daily visual triage (temperature and symptoms check).However, they did not have a negative PCR test.Saliva and serum samples were collected at the same time point from individuals and within 48 h of PCRconfirmed COVID-19 diagnosis.Our data collection team conducted the study during working hours (8 AM-2 PM), minimizing variability, and excluding night-time sample collection.
The basic demographic and clinical information (including medical history, medication, periodontal health data, sleep data, weight, height, waist circumference, neck circumference, blood group, respiratory rate, and oxygen supplementation in liters for those on oxygen) of the study participants was obtained.

Saliva collection
Fifteen milliliters of plastic centrifuge tubes were prelabeled with the date and subject number.We then marked the 4 mL line of the tube.A parafilm was used to stimulate saliva.
Prior to sample collection, the saliva collection tube was placed in a cup with ice.A nurse, supervised by the research coordinator, would demonstrate how to provide saliva to the patient.Each subject was instructed to take a sip of water and rinse their mouth, swallow the water, chew the piece of parafilm, and then use their tongue to push saliva as it formed into the tube.They were then instructed to place the tube back in the cup with the ice cube while they waited for more saliva to form.The saliva was collected until it reached the line (4 mL) on the tube, considering the liquid region of the saliva sample (not the foamy regions).Once the patient finished providing the saliva sample, they notified the nurse.The nurse tightened the cap on the tube, wiped it with alcohol, placed the tube in the collection rack in the cooler with ice and discarded the other materials.

Blood collection
Serum and plasma samples were collected using standard venipuncture techniques in 7.5 mL BD Vacutainer Serum marbled topped tubes with clot activator for serum, and plasma was collected in a 5-mL lavender top tube.Samples were collected at the hospital for all samples.

Sample processing
The samples were transferred to the Jaber Alahmad Hospital laboratory in containers with dry ice.The laboratory technician received the samples and processed them on the same day of sample collection within no more than 3 h.Saliva samples were centrifuged at 2,000 × g for 5 min, and the supernatants were separated from the pellets and transferred into different tubes.Plasma and serum samples were allowed to sit upright in racks at room temperature for 30 min prior to centrifuging at 2,000 × g for 10 min at room temperature.All the samples were stored at −80°C and were transferred from Kuwait to JCVI.The samples were placed on dry ice during shipment with a monitoring device to ensure that the samples were frozen during the transfer.

DNA extraction, library preparation, and sequencing
DNA was extracted from samples using Qiagen's AllPrep Bacterial DNA/RNA/Protein Kit (Cat# 47054; QIAGEN, Hilden, Germany) according to the manufacturer's instructions.Step 3 was modified to use MP Biomedicals FastPrep-24 Classic Bead Beating Grinder and Lysis System (Cat# MP116004500; FisherScientific) for 1 min instead of vortexing for 10 min.All recommended products were used, and, at step 2, beta-mercaptoethanol (β-ME) was used instead of dithiothreitol (DTT).
Samples were deposited in NIH SRA under the accession number PRJNA948421.

Cytokine and microbiome statistical analysis
Alpha diversity (i.e.richness) was calculated by summing each sample's detected ASVs.Beta diversity (i.e.hierarchical clustering and PCoA) was calculated by computing the pairwise Aitchison distance, center log-ratio followed by Euclidean distance, of samples and using this distance matrix as input in the Agglomerative and PrincipleCoordinateAnalysis objects of Soothsayer (16) which are wrappers around SciPy (45) and Scikit-Bio, respectively.
PCoA with Jaccard and PhILR (46) alpha diversity, probability density function for Jaccard breakdown (betapart package in R) were conducted via the automated pipeline FALAPhyl: https:// github.com/khalidtab/falaphyl/.Cytokine Net MFI values were used for cytokine measurements.Values were transformed by log10 transformation, then converted into z-scores, and finally plotted with the R package ComplexHeatmap (47).Cytokines with negative MFI values (i.e.lower signal than standard) had the lowest value plus one added to all the cytokine measurements so that the lowest value equals one so that the values can undergo the same log10 transformation and z-score standardization above.Correlations between blood and saliva cytokines were done with Spearman correlation in R, and those found to be statistically significant were then plotted with the R package corrplot (48).
For the abundance paradigm, we ran feature selection models using the following transformations: (i) the quality assessed center log-ratio transformed ASV abundances (multiplicative replacement of (i); and (ii) the z-score normalized cytokine levels.The predictive features that were identified using Clairvoyance on the ASV and cytokine abundances were used to build SSPNs (described below).For the inferred interaction paradigm, the perturbation matrix that is made of all SSPNs was used as input into the final rounds of feature selection.
To avoid performing exhaustive feature selection using every combination of pairwise multimodal associations (∼176 k associations where the majority of associations would contain noninformative features), we identified biomarkers for each modality individually and used the resulting informative features (y1 + y2 feature sets for each modality) as input for the SSPN analysis to produce a perturbation matrix as implemented in Nabwera and Espinoza et al. (16).
Briefly, a perturbation matrix with edge weights of the fully connected SSPNs was fed into a feature selection algorithm to determine the minimal set of edges with the highest predictive capacity.That is biomarkers for COVID-19 as multimodal associations rather than the abundances of specific features.Once the minimal edges predictive of COVID-19 are identified, the Logistic Regression classification model is fit, and the coefficients representing predictive capacity are used as edge weights to construct an aggregate network (Fig. 3).Positive coefficients indicate that an increase in perturbation corresponds with an increased likelihood of a COVID-19 diagnosis, while negative coefficients indicate that an increase in perturbation corresponds with a decreased likelihood of a COVID-19 diagnosis.Ultimately, this procedure resulted in 8 feature selection runs (y1 and y2 classifications for each scenario) that could be run on a local machine: (1-2) ASV abundances; (3-4) Salivary cytokines abundances; (5-6) Systemic cytokines abundances; and (7-8) multimodal perturbations (i.e.SSPN).
For Clairvoyance, we used a logistic regression classifier with the following parameter space: penalty (32) and C {1e-10,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0} with the liblinear solver and 1,000 maximum iterations for convergence.The weights from the fitted classifiers are representative of a feature's predictive capacity, that is, the coefficient of the fitted linear model.Linear models have both positive and negative coefficients.In the case of our models, a positive coefficient for a feature indicates that an increased value of the feature corresponds with an increased likelihood of the query sample to be predicted as the following: (y1) infected relative to uninfected; and (y2) severe relative to mild/ moderate severity.Only feature sets with less than the number of samples (n Samples = 80) were considered for downstream analysis to prevent overfitting.

Multiomic SSPNs
SSPNs were created to identify which inferred interactions were perturbed by adding a query sample to a reference cohort.Cytokine z-score normalized data was propagated from previous analysis as they are not influenced by intra-sample features, while ASV abundance was retransformed using the feature subset identified by Clairvoyance which is influenced by intra-sample features.Pearson correlation was used for both continuous and multimodal interactions (i.e.cytokine-cytokine and cytokine-ASV), while rho proportionality was used for compositional interactions (i.e.ASV-ASV).The reference category was the uninfected class, and in the y1 scenario, we used a dummy class to obtain perturbation measurements that reflect how adding individual reference samples to the larger reference cohort perturbs the associations.SSPNs were implemented using the Sample Specific Perturbation Network object in the Ensemble NetworkX Python package (v2023.2.14) (16, 49) using 1,000 iterations and a sample size of 0.618 with median and median-absolute deviation as the edge reduction statistic and variability measurements, respectively.

Multiomic aggregate networks
The logistic regression classifier used for constructing the aggregate networks was fit using the perturbation of inferred interactions (i.e. the perturbation matrix from the SSPN analysis) with the predictive feature subset identified by Clairvoyance (i.e.multimodal edges in the perturbation matrix) and the COVID-19 severity as classes.As each feature in our classification models represented an edge, we reformatted our fitted models into undirected aggregate networks where each node represents either an ASV or cytokine, each edge represents an inferred interaction, and edge weight represents the predictive capacity of each edge when its associations are perturbed.Salivary and systemic cytokines were kept separate during all analyses.
Node degree is calculated as the sum of absolute values of edge weights connected to a node.Node degree homogeneity is a variant of Pielou's ecological index for evenness calculated via H/llog2 (Number of nodes) where H indicates Shannon entropy (50).

Fig.
Fig.1D).In addition, alpha rarefaction curves demonstrated that sufficient sample sequencing was achieved therefore the effect is not due to insufficient sequencing depth (Fig.S3).To gain insight into the capability of the oral variables in distinguishing COVID-19 status, we used unsupervised clustering (Ward hierarchical clustering) on the samples using their salivary microbiota and the salivary cytokines.The results revealed that the salivary microbiota clustered most of the COVID-19-positive patients along the first tree bifurcation, while the salivary cytokines were not able to show the same distinction (Fig.S4).This finding was reflected in the predictive accuracy of each variable, where the microbiota outperformed the cytokines (Fig.S4).

Fig. 1 .
Fig. 1.Analysis of salivary microbiome between the three groups, control (red), mild/moderate (blue), and severe (yellow).A) Principal coordinates analysis (PCoA) of Jaccard dissimilarity demonstrating the three clusters formed by the microbial constituents.Ellipsoids show the first standard deviation of spread of each group (P < 0.05, ADONIS).B) Violin plots of observed features diversity between the three groups (P < 0.05, Wilcoxon rank-sum test).C) Violin plots of Shannon diversity between the three groups (P < 0.05, Wilcoxon rank-sum test).D) Principal coordinates analysis (PCoA) of Euclidean distances of PhILR distances, demonstrating the three clusters between the conditions.

Fig. 3 .
Fig. 3. Hierarchical feature selection for identifying biomarkers predictive of COVID-19 severity.A) Hierarchical classification of COVID severity and the submodels with relative explained variance of each group derived from first principal component of principal component analysis (PCA).Modality-specific biomarker overlap between submodels of B) abundance-based paradigms and C) inferred-interaction paradigms.In the inferred interaction paradigm, features are edges in the SSPN which can be decomposed into nodes which is not the case for the abundance-based paradigms in (B).Clairvoyance feature selection results for D) submodel y1 and E) submodel y2 color coordinated by modality.Each marker on the scatter plot represents a unique hyperparameter/feature set combination that yields a specific accuracy using 10-fold cross-validation repeated with 10 different random states.

Table 1 .
Clinical characteristics and demographics of the study patients according to disease status and severity.