A Pilot Integrative Analysis of Colonic Gene Expression, Gut Microbiota, and Immune Infiltration in Primary Sclerosing Cholangitis-Inflammatory Bowel Disease: Association of Disease With Bile Acid Pathways

Abstract Background Although a majority of patients with PSC have colitis [PSC-IBD; primary sclerosing cholangitis-inflammatory bowel disease], this is phenotypically different from ulcerative colitis [UC]. We sought to define further the pathophysiological differences between PSC-IBD and UC, by applying a comparative and integrative approach to colonic gene expression, gut microbiota and immune infiltration data. Methods Colonic biopsies were collected from patients with PSC-IBD [n = 10], UC [n = 10], and healthy controls [HC; n = 10]. Shotgun RNA-sequencing for differentially expressed colonic mucosal genes [DEGs], 16S rRNA analysis for microbial profiling, and immunophenotyping were performed followed by multi-omic integration. Results The colonic transcriptome differed significantly between groups [p = 0.01]. Colonic transcriptomes from HC were different from both UC [1343 DEGs] and PSC-IBD [4312 DEGs]. Of these genes, only 939 had shared differential gene expression in both UC and PSC-IBD compared with HC. Imputed pathways were predominantly associated with upregulation of immune response and microbial defense in both disease cohorts compared with HC. There were 1692 DEGs between PSC-IBD and UC. Bile acid signalling pathways were upregulated in PSC-IBD compared with UC [p = 0.02]. Microbiota profiles were different between the three groups [p = 0.01]; with inferred function in PSC-IBD also being consistent with dysregulation of bile acid metabolism. Th17 cells and IL17-producing CD4 cells were increased in both PSC-IBD and UC when compared with HC [p < 0.05]. Multi-omic integration revealed networks involved in bile acid homeostasis and cancer regulation in PSC-IBD. Conclusions Colonic transcriptomic and microbiota analysis in PSC-IBD point toward dysregulation of colonic bile acid homeostasis compared with UC. This highlights important mechanisms and suggests the possibility of novel approaches in treating PSC-IBD.


Introduction
Primary sclerosing cholangitis [PSC] is characterised by progressive inflammation and fibrotic stricturing of the biliary tree. 1 The pathogenesis of PSC is poorly understood, with no medical treatments, but is presumed to result from the complex interplay of immune dysregulation, gut microbial dysbiosis, and changes in bile acid homeostasis, in genetically predisposed individuals. [2][3][4] PSC is highly comorbid with inflammatory bowel disease [IBD], which is ultimately diagnosed in approximately 75% of patients, primarily resembling ulcerative colitis [UC]. PSC-IBD is phenotypically different from UC, being more likely a pancolitis [especially right-sided] that follows a milder disease course but is associated with a significantly higher risk of colorectal cancer.
Genome-wide association studies [GWAS] have identified multiple shared and non-shared genetic loci that underlie the risk of developing PSC-IBD. [5][6][7] A substantial component of the genetic architecture of PSC-IBD is not shared with UC, and genetic correlation modelling generates a UC comorbidity rate of only 1.6% in patients with PSC. 5 Furthermore, patients with PSC-IBD have distinct gut microbial profiles compared with UC. [2][3][4] These differences have led to the proposal that colonic inflammation in PSC-IBD and in UC results from different pathways. The colonic mucosal gene expression profile [transcriptome] in PSC-IBD, has not previously been explored.
To appreciate the complex and interdependent mechanisms underlying the colonic presentation of PSC-IBD, we applied a comparative systems biology approach, capturing the colonic mucosal transcriptome, mucosally adherent gut microbiota profiles, and mucosal immunophenotype in patients with PSC-IBD, UC alone, and healthy controls [HC] in this pilot study. Through this approach, we modelled interactions between different biological systems to interrogate key processes driving the phenotypes observed in PSC-IBD and UC.

Study population and sample collection
Patients with PSC-IBD, with UC, and HC were recruited from clinic and endoscopy lists. PSC-IBD and UC were documented in keeping with European guidelines on diagnosis [EASL and ECCO]. Healthy subjects had no known comorbidities and normal colonoscopy [other than haemorrhoids] as part of investigation for rectal bleeding. Subjects were excluded if they had taken antibiotics and/or probiotics in the past 3 months. Only patients with large-duct PSC were recruited as part of the PSC-IBD cohort, and alternative aetiologies were excluded for all patients. Recurrent PSC was documented in transplant patients, and secondary causes were excluded by standard clinical workup. Colonic mucosal biopsies were taken from the sigmoid colon and collected on ice, in both Qiagen RNAlater tubes [for storage in -80 C] and Miltenyi Biotec gentleMACS C-Tubes containing complete RPMI media [for immediate processing]. Ethical approval was given by University of Birmingham Human Biomaterials Resource Centre [HTA Licence 12,358]. An overview of the methodology for this study is summarised in Supplementary  Figure 1, available as Supplementary data at ECCO-JCC online.

Differential gene expression analysis
Reads obtained were quality controlled with FastQC and Trimmomatic. 8,9 Contaminating ribosomal RNA reads were removed using Bowtie2, and reads were then mapped to the human genome sequence database [GRCh38] using STAR and quantified with featureCounts. [10][11][12] Genes were filtered, and differential gene expression was analysed based (false discovery rate [FDR] corrected, p ≤0.05] using edgeR. 13 Gene ontology and Kyoto Encyclopaedia of Genes and Genomes [KEGG]/Reactome pathway analysis was conducted using Camera for competitive gene set testing. 14 ClueGo was used to functionally group gene ontology and pathway annotation networks. 15

Computational cell deconvolution
xCell, a computational method for cell deconvolution, was used to compare cell types based on genetic signature and to assess for any differences in cell populations derived from mucosal tissue between the three cohorts. 16 18 Taxonomy was assigned against the Silva-132-99% OTUs database. 19 Differences in relative abundance of taxa between cohorts were analysed using linear discriminant analysis [LDA] effect size [LEfSe]. 20 Only taxa with LDA > 2 at a p-value <0.05 were considered significant. The functional profiles of microbial communities were inferred using PICRUSt2 derived relative MetaCyc and KEGG pathway analysis and assessed using STAMP. 21,22

Predictive and network analytics
We used the Random Forest [RF] machine learning ensemble method to obtain predictive performance of features from the transcriptomics, immunophenotype, and 16S rRNA microbial profiling datasets. 23 Selected genes were mapped to functional information from three databases: IntAct, KEGG, and TRRUST. Network analysis was performed using qgraph package. 24 A full description and explanation of this methodology are provided in the Supplementary Methods.

Patient demographics
Thirty patients were recruited into this exploratory study-10 healthy controls, 10 patients with PSC-IBD, and 10 with UC. The patients between the three groups were matched for age and gender ratio. Patients with PSC-IBD and UC were matched for colitis characteristics, with no significant differences in disease extent, Mayo endoscopic sub-score, and medication use. Three patients with PSC-IBD were post-transplant [with recurrence of PSC] and three were on ursodeoxycholic acid [UDCA]. Detailed demographics are shown in Table 1.

Quality control
An average of 38 million PE reads [+/-8.3 million] were generated per sample. Following quality control and in silico decontamination of ribosomal RNA reads, an average of 35 million reads were mapped for subsequent gene expression analysis.

Pathway analysis
We performed pathway analysis to assess for enrichment of particular gene annotation categories amongst the differential gene expression datasets [full list provided as Supplementary File 2, available as Supplementary data at ECCO-JCC online].

PSC-IBD compared with UC
Gene ontology analysis comparing PSC-IBD with UC demonstrated significant enrichment of 563 biological processes. Of these, 104 were upregulated in PSC-IBD and were associated with fatty acid metabolic processes, glucuronidation, bile acid and bile salt metabolism processes, and transport. Processes such as those associated with immunological response were downregulated compared with UC. KEGG/Reactome analysis revealed differential regulation of 238 pathways in PSC-IBD compared with UC [62 upregulated] with findings similar to gene ontology analysis. Additionally, pathways associated with bile acid homeostasis including PPAR signalling, nuclear receptor transcription, and bile acid recycling, were upregulated in PSC-IBD compared with UC [ Figure 3]. Pathways associated with DNA damage response, telomere maintenance, transition of cell cycle phases, and cell replication were downregulated in PSC-IBD. We explored the effects of specific confounders on DEGs in PSC-IBD. In patients with a liver transplant [n = 3] who were also on tacrolimus, only REG3A, DEFA5, DEFA6, SNORA66, and PRSS2 were upregulated and VSIG4 downregulated. In patients with UDCA [n = 3], only CYP3A4 was differentially expressed

Cell deconvolution
Computational cell deconvolution showed that only the dendritic cell subset was increased in PSC-IBD compared with HC

Predicted metagenomic pathways in PSC-IBD compared with UC
Metacyc pathway analysis of the microbiome functions inferred from 16S rRNA gene sequence profiles revealed an increase in pathways such as glycolysis and mannan degradation. The predicted KEGG pathways significantly enriched in PSC-IBD compared with UC included primary bile acid biosynthesis [associated with significantly higher expression of bile salt hydrolase and hydroxysteroid dehydrogenases] and pentose and glucoronate interconversions.

Integrating transcriptomics, immunophenotype, and 16S rRNA microbial profile
Genes differentially expressed between PSC-IBD and UC were selected based on AUC values, maximum connectivity, and biological relevance. 16S rRNA microbial profiling was selected based on linear discriminant analysis [LDA] effect size [LEfSe] more than log [10] value of greater than or less than 3, between PSC-IBD and UC. Th17 and IL17 were selected as they were significantly different in PSC-IBD and UC compared with HC. As shown in Figure 6, through multi-omics integration analysis we observed two network clusters, involving genes associated with bile acid homeostasis and genes associated with cancer regulatory pathways, which were significant between PSC-IBD and UC.

Discussion
It is well recognised by clinicians that the clinical phenotype of PSC-IBD is different from UC. This, along with differences in genetic risk, collectively support the prediction that biological pathways related to disease behaviour should differ. Our work previously demonstrated that the mucosally adherent gut microbial profiles were different between patients with PSC-IBD, those with UC, and healthy controls. Our current study presents data from an independent cohort of patients to explore the gene expression, immunophenotype, and microbial profiles in the colonic mucosa. Using an integrative systems biology approach in this pilot study, we demonstrate significant differences between the colonic mucosal landscape in patients with PSC-IBD and UC, thereby highlighting opportunities to harness distinctions in biology to inform future therapy. Our comparative analysis of mucosal transcriptome in colitic patients and healthy controls confirms that the colonic phenotype in both PSC-IBD and UC is primarily immune-mediated, with upregulation of pathways driving multiple facets of the innate, adaptive, and humoral immune response. Furthermore, genes and pathways associated with recognised biological mechanisms, including anti-microbial defense response and extracellular matrix remodelling, are also differentially upregulated in comparison with healthy controls. In our transcriptomic analysis, PSC-IBD had very distinct mucosal transcriptomic profile compared with UC. Only 939 genes had shared differential expression in PSC-IBD and UC compared with healthy controls. These genes amount to 70% of DEGs between PSC-IBD and HC and 22% of DEGs between UC and HC. Importantly, 1692 genes were differentially expressed between PSC-IBD and UC. Analysis of the genes and pathways associated with this differential expression highlights key differences in physiological processes, with enrichment of biological processes involved in bile acid homeostasis, glucuronidation, and specific metabolic functions in PSC-IBD compared with UC. 25 Bile acid regulation is mediated through a negative feedback mechanism as part of its enterohepatic recirculation, by hormones such as fibroblast growth factor [    As bile acids are inherently cytotoxic, these nuclear receptors-in particular FXR, a bile acid-activated transcriptional factor-may be activated in order to induce gene expression circuitry to protect against bile acid toxicity, by shutting down expression of genes that increase influx and synthesis of bile.  O1  O2  O3  O4  O5  O6  O7  O8  O9  O10  O11  O12  O13   c__Gammaproteobacteria  f__Enterobacteriaceae  o__Enterobacteriales  f__Prevotellaceae  f__Paraprevotellaceae  f__Myxococcales 0319-6G20  s__Myxococcales sp  c__Lentisphaeria  o__Victivallales  p__Lentisphaerae  s__Actinomyces  s__stutzeri  o__Alteromonadales   O14  O15  O15  O17  O18  O19  O20  IL17  Th17  g1  g2  g3  g4 g-Shewanella f-Shewanellaceae c-Bacilli s-Roseburia s-Parvimonas g-Parvimonas s-fragilis IL17  Th17  ABCB11  UGT1A10  UGT1A9  UGT1A7   g5  g6  g7  g8  g9  g10  g11  g12  g13  g14  g15  g16  g17   UGT1A6  UGT1A1  PPARG  SLC51A  GBA3  CXCL1  ABCG2  PITX2  TNIP3  PCSK1  FABP6  TUBB2A  ABCB1   g18  g19  g20  g21  g22  g23  g24  g25  g26  g27   CALU  TSTA3  ASS1  USP2  TRIM29  NR1H4  SLC51B  SULT1A2 CCL13 ABCC3 basolateral membrane. [26][27][28] 29 Although the expression of FGF19, a key hormone produced in response of FXR activation, was unchanged, the MEP1β gene [meprin A subunit-β] produced in response to FGF19 was highly upregulated in PSC-IBD. 30 This FXR/PPAR-mediated protective positive feedback effect, possibly as a consequence of increased intracellular bile acid concentrations, functions to accelerate neutralisation, binding, and its elimination in order to reduce intracellular colonic bile acid toxicity. 27,31,32 It has been shown that the faecal bile acid pool is significantly reduced in PSC-IBD, and it is unclear whether this is due to reduced delivery into the colon related either to cholestasis or increased bile acid clearance. 2 In this study, we have confirmed from mucosal analysis previous findings by our group and others, 2-4 that the gut microbiota profile is different between PSC-IBD, UC, and HC. The dysbiosis seen in PSC-IBD appears to be associated with changes in bile acid metabolic pathways, which is congruent with our transcriptomic and pathway analysis. Patients with PSC-IBD had significantly higher abundances of the species Bacteroides fragilis, Roseburia spp., Shewanella sp.p and Clostridium ramosum compared with UC patients. These bacteria express bile salt hydrolase [BSH], an enzyme that catalyses the deconjugation of bile acids in the gut. 33 These bacteria have also been shown to express hydroxysteroid dehydrogenases that are involved in biotransformation of primary to secondary bile acids. 34 Inferred metagenomics corroborated enrichment of BSH and hydroxysteroid dehydrogenases in PSC-IBD gut microbiota compared with UC [and HC] in our dataset. Vancomycin, an antibiotic that specifically targets Gram-positive bacteria [many of which are involved in dehydoxylation of primary bile acids into secondary bile acids] has been shown to induce remission of colitis in patients with PSC-IBD. 35 Furthermore, amine oxidase-expressing bacteria, Sphingomonas sp., were found to be upregulated in PSC-IBD compared with UC. This enzyme is associated with aberrant homing of gut lymphocytes to the liver, a proposed mechanism underlying the PSC-IBD gut-liver inflammatory axis. 36 Our study demonstrates for the first time that the colonic mucosal immune response in PSC-IBD is characterised by a significantly higher Th17 cell and IL-17-producing CD4 T cell population compared with HC. These findings were similar to those seen in the UC cohort; however, patients with PSC-IBD also have a lower Th1 and higher IL17/IFNγ-producing CD4 cell population compared with controls. IL17-producing Th17 cells are often present at sites of chronic tissue inflammation in multiple autoimmune diseases. 37 Th17 cells are critical drivers of inflammation in autoimmune diseases, and are likely to be induced by specific components of gut microbiota. 38 The role of bile acids in directly mediating host immunity by controlling Th17 response has been shown to be associated with Th17 expansion and IL17 production. 39 We found significant upregulation of CYP27A1 expression in PSC-IBD compared with UC. This enzyme generates oxysterols, which are key intermediates for bile acid synthesis and function as RORγt ligands to drive Th17 differentiation. 40 Genes and pathways associated with cancer regulation, including DNA damage repair and checkpoints, p53 signalling, mitosis transition, and APC/CCdc20 mediated Cyclin A degradation, were significantly downregulated in PSC-IBD compared with UC. Network analysis revealed the gene TUBB2A to be a potential key mediator in PSC-IBD compared with UC. We have previously shown methylation of a component of this gene [TUBB6] significantly positively correlated colonic dysplasia. 41 Interestingly, p-ANCAs in autoimmune liver diseases are reported to be directed against a further component of the beta-tubulin family, TUBB5, which cross-reacts with the bacterial protein FtsZ, reflecting an abnormal immune response to gut bacteria. 42 PRAC1 gene was found to be 8 log-fold downregulated in PSC-IBD, and expression of this gene is associated with reduced susceptibility for right-sided cancers. 43 One of the major strengths of our study is that this is the first multi-omics to date which has attempted to unravel disease mechanisms by integrating mucosal transcriptomics, immunophenotyping, and mucosal microbial profiling. Through this approach, we have demonstrated that consistent with clinical phenotype and GWAS datasets, the biological mechanisms underlying colonic pathology appear to be distinct in PSC-IBD compared with UC. We have made novel observations and proposed disease processes that need validation through e -vivo experiments. Through cell deconvolution, we were able to ascertain that the transcriptomic findings were not a result of significant differences in cell subset populations apart from a slight increase in dendritic cells in PSC-IBD compared with HC.
This technique does have limitations, and future studies should strongly consider using single cell RNA-sequencing. Moreover, a significant number of genes that we have explored are likely to have multiple unknown or less established functions, thereby limiting the pathway analysis approach. Power calculations for RNA-sequencing experiments are not fully established; however, between six and 12 biological replicates have been recommended. 44 Nevertheless, we accept that a lack of validation cohort as part of this analysis is a limitation of this study. We adopted strict exclusion criteria to address potential confounders and, despite our relatively small sample size, we were able to infer significant biological differences with high confidence between the three cohorts, including through predictive modelling. 45 As shown in Table 1 patients with PSC-IBD and UC were matched for IBD characteristics. All patients with PSC-IBD and eight patients with UC had pancolitis [and the remaining two had left-sided colitis]. Therefore, by analysing biopsies taken from the sigmoid colon, we attempted to control for biological changes associated with anatomical location and inflammation. Furthermore, in our previous study we found that the microbial profiles did not differ between different segments of the colon within patients with PSC-IBD. 4 Our data are pilot in nature and we must recognise the challenge of confounding from clinical heterogeneity. We did not find specific confounders, which included liver transplantation and immunosuppression including biologics, to alter differential gene expression results, in particular bile acid signalling genes. Through a subgroup analysis of pre-transplant PSC-IBD patients, as outlined in Supplementary Results, we were able to demonstrate that the key findings of changes in bile acid homeostatic and immunological pathways were not different in comparison with the PSC-IBD cohort that included three post-liver transplant patients with recurrence of PSC. However, we appreciate that these subgroup analyses are limited by the sample size. We recognise that UCDA use in three patients with PSC-IBD can modulate intestinal bile acid homeostatic pathways; however, only the CYP3A4 gene was differentially regulated in PSC-IBD. Furthermore, UDCA has not been shown to activate FXR expression, but rather may antagonise it as demonstrated by a pharmacodynamic study and FXR binding experiments. 46 It is therefore seems unlikely for UDCA to have had an impact on the gene expression profiles seen in PSC-IBD. Although the median bilirubin in PSC-IBD was 24.5μmmol/L, only two patients were jaundiced [bilirubin <80μmmol/L] one of whom had Child_Pugh A cirrhosis with no differential gene expression as a consequence of this. We also recognise that cause or effect of cholestasis would be difficult to establish from our data, as the very nature of PSC means cholestasis is its primary manifestation. However, identifying a control arm that consisted of UDCA-naïve patients with cholestatic liver disease and without colonic inflammation, undergoing colonoscopy, was in essence not feasible for this pilot dataset. Furthermore, it would be important to validate these findings through an independent cohort along with targeted gene expression analysis using a qPCR-based or 3'-seq-based approach. Finally, it remains to be established how the colon compares with ileal bile acid homeostatic mechanisms in health and disease, and the role of diet as a modifier of the PSC-IBD transcriptome, epigenome, and microbiome. Future studies should strongly consider dietary evaluation alongside investigating tissue, faecal, and serum bile acid profiles.
In conclusion, by using a systems biology approach, we demonstrate that the colonic inflammation in PSC-IBD, like UC, is immune-mediated when compared with healthy controls, and presents as a predominant Th17-and IL17-producing CD4 cell response. PSC-IBD, however, is transcriptomically distinct from UC, with dysregulation of genes associated with multiple bile acid homeostatic pathways, potentially mediated by the corresponding differences in gut dysbiosis demonstrated between the IBD cohorts studied. These findings support the hypothesis that colonic mucosal immune-mediated inflammation in PSC-IBD is contributed to by colonic mucosal bile acid toxicity. Further work is required to understand whether the dysregulated bile acid metabolism in PSC-IBD is driven by the gut microbiota or vice versa, and how host genetics plays into this interaction. This preliminary work thus informs potential future therapeutic approaches for PSC-IBD that include novel bile acid/microbial manipulation.