Two distinct metacommunities characterize the gut microbiota in Crohn's disease patients

Abstract The inflammatory intestinal disorder Crohn's disease (CD) has become a health challenge worldwide. The gut microbiota closely interacts with the host immune system, but its functional impact in CD is unclear. Except for studies on a small number of CD patients, analyses of the gut microbiota in CD have used 16S rDNA amplicon sequencing. Here we employed metagenomic shotgun sequencing to provide a detailed characterization of the compositional and functional features of the CD microbiota, comprising also unannotated bacteria, and investigated its modulation by exclusive enteral nutrition. Based on signature taxa, CD microbiotas clustered into 2 distinct metacommunities, indicating individual variability in CD microbiome structure. Metacommunity-specific functional shifts in CD showed enrichment in producers of the pro-inflammatory hexa-acylated lipopolysaccharide variant and a reduction in the potential to synthesize short-chain fatty acids. Disruption of ecological networks was evident in CD, coupled with reduction in growth rates of many bacterial species. Short-term exclusive enteral nutrition elicited limited impact on the overall composition of the CD microbiota, although functional changes occurred following treatment. The microbiotas in CD patients can be stratified into 2 distinct metacommunities, with the most severely perturbed metacommunity exhibiting functional potentials that deviate markedly from that of the healthy individuals, with possible implication in relation to CD pathogenesis.

metagenomic shotgun sequencing to provide a detailed characterization of the compositional and functional features of the CD microbiota, comprising also unannotated bacteria, and investigated its modulation by exclusive enteral nutrition. Based on signature taxa, CD microbiotas clustered into 2 distinct metacommunities, indicating individual variability in CD microbiome structure. Metacommunity-specific functional shifts in CD showed enrichment in producers of the pro-inflammatory hexa-acylated lipopolysaccharide variant and a reduction in the potential to synthesize short-chain fatty acids. Disruption of ecological networks was evident in CD, coupled with reduction in growth rates of many bacterial species. Short-term exclusive enteral nutrition elicited limited impact on the overall composition of the CD microbiota, although functional changes occurred following treatment. The microbiotas in CD patients can be stratified into 2 distinct metacommunities, with the most severely perturbed metacommunity exhibiting functional potentials that deviate markedly from that of the healthy individuals, with possible implication in relation to CD pathogenesis.
Keywords: Crohn's disease; gut microbe; metagenomics; exclusive enteral nutrition Background Crohn's disease (CD) is an inflammatory bowel disease (IBD) that may affect any part of the gastrointestinal tract. Gut microbes have recently gained much attention as plausible drivers of CD. This notion is supported by the fact that the intimate interaction between the gut microbiota and the intestinal mucosa constantly modulates and shapes the gut immune system [1], and departure from the normal homeostatic microbiome state likely triggers immune dysregulation via pro-inflammatory cues. Specific pathogens that possibly cause CD have been identified, such as adherent-invasive Escherichia coli [2] and Mycobacterium avium paratuberculosis [3]. However, these bacteria were detected only in a fraction of patients [2,3]. It is therefore assumed that the overall composition of the gut microbiota rather than specific microorganisms accounts for the inflammatory state in CD. Studies using 16S rRNA gene amplicon sequencing to characterize CD-associated microbiota abnormalities revealed an overall reduced microbial diversity in CD [1,4,5]. Moreover, a reduction in the relative abundance of Roseburia [6], Faecalibacterium [1,[5][6][7], Bifidobacteriaceae [1], and Clostridiales [5] and an increase in the relative abundance of the Enterobacteriaceae family members [1,[4][5][6][7] were reported in patients with CD. However, studies using 16S rRNA gene amplicon sequencing have limitations in taxonomic resolution and functional inference. Metagenomic shotgun sequencing can overcome these limitations, but only a few studies have applied metagenomic shotgun sequencing on CD microbiota, including the initial report on 4 CD cases (along with 21 ulcerative colitis cases) to illustrate the utility of the first gut microbial reference gene catalog [8], and a subsequent study on 23 pediatric CD patients [9]. The current incomplete understanding of the functional roles played by the gut microbiota has limited the efforts to devise more targeted treatments.
Conventionally, CD is treated with anti-inflammatory or immunosuppressive medications or by surgery if symptoms cannot be improved pharmaceutically [10]. However, side effects and complications such as infection and malnutrition accompany these treatments [11], which imperil the patient's life. Although not widely used, exclusive enteral nutrition (EEN) is a low-risk, noninvasive therapy for CD that involves exclusive ingestion of 100% liquid formula made up of either elemental or polymeric nutrients [12]. In pediatric CD, up to 85% remission has been achieved by EEN [12]. Nevertheless, in adult CD, EEN has not delivered desirable effectiveness, which to some extent may be attributed to nonadherence and interpersonal variations in clinical conditions [12]. The mechanism underlying the alleviation of CD by EEN also remains unclear, though nutritional improvement and microbial involvement possibly play a role [13]. Although previous studies have described the effects of EEN on the microbiota of pediatric CD [9,14], it is unclear how EEN modulates the adult counterpart.
Through metagenomic sequencing and data analysis, we herein provide novel insights into the CD microbiota at both compositional and inferred functional levels. We identified 2 metacommunity stages within CD patients that differed by the abundance of gram-negative pro-inflammatory bacteria and presence of genes involved in the production of antiinflammatory short-chain fatty acids. In addition, we investigated the effect of short-term EEN on the CD microbiota. Our study highlights the presence of 2 microbiota severity states related to gut microbiota dysbiosis in CD and indicates possible functional links between the microbiota and the underlying immunological dysbalance in CD.

Data Description
Forty-nine CD patients and 54 healthy controls (CTs) were enrolled in this study. Fourteen CD patients underwent EEN treatment (for the clinical profiles of CD patients, see Supplementary Table S1). Fecal samples were collected from all participants at baseline and from the EEN-treated patients after 2-week EEN treatment, totaling 117 fecal samples. After DNA extraction, a DNA library of an insert size of 350 bp was constructed and then sequenced on an Illumina HiSeq 2000 analyzer at BGI (Shenzhen, China) using 100 bp paired-end (PE) sequencing. In total, we generated ∼700 Gb raw data, and 672 Gb of the data remained after filtering out low-quality or host reads. The dataset is available from the EBI Database (Accession No. PRJEB15371) [15]. On average ∼55.65 million high-quality reads per sample were generated for further analyses. The proportion of high-quality reads among all raw reads from each sample was 95.98% on average. Using both de novo assembly and alignment against the integrated gene catalog geneset, 2 036 584 genes with an occurrence rate over 5% were obtained.

Clustering of CD microbiota into distinct metacommunities
When the gut microbiotas of CD patients were compared to their non-CD counterparts, both microbial gene counts (Supplementary Fig. S1a) and diversity ( Supplementary Fig. S1b) were considerably lower in CD patients than in CTs. For high-confidence taxonomic identification, co-abundant genes were binned into metagenomics species (MGS; harboring more than 700 genes) [16], which were thereafter used for taxonomic annotation. A total of 452 MGS were identified, with 151 of them being assigned to existing taxonomic entities (Supplementary Table S2).
To capture the principal differences between non-CD and CD microbiome structures, we adopted a combinatory approach  that started with sample clustering based on the dirichlet multinomial mixtures (DMM) model [17], followed by the identification of discriminative microbes using an adapted version of the linear discriminant analysis (LDA) effect size (LEfSe) method [18]. Based on Laplace approximation [17], we identified 3 clusters as exhibiting minimal negative log posterior ( Supplementary  Fig. S1c). Based on this, we clustered the microbiome samples of CD and CTs into 3 metacommunities (A, B, and C), which displayed intracommunity homogeneity and intercommunity dis-similarity (Fig. 1a). The membership of a metacommunity was associated with disease status (Fisher's exact test with BH adjustment, q < 0.01) (Supplementary Table S3). Metacommunity A was dominated by CT samples, and metacommunity C exclusively by CD samples, whereas metacommunity B contained both CT and CD samples (Fig. 1a). Based on a less stringent LEfSe method, 85 MGS were identified as discriminative microbes for the metacommunities or subgroups (CT and CD groups within metacommunity B) ( Fig. 1a and Supplementary Table S4). The majority of metacommunity A-enriched MGS were reduced in metacommunity B and further depleted in C, including shortchain fatty acid (SCFA)-producing bacteria such as Bifidobacterium species, Faecalibacterium prausnitzii, Alistipes shahii, and Roseburia species (Fig. 1a and Supplementary Table S4). Among others, SCFA-producing bacteria Bacteroides cellulosilyticus, Bacteroides xylanisolvens, and Clostridium nexile, a member of the immunomodulatory Clostridium cluster XIVa [19], were enriched in metacommunity B ( Fig. 1a and Supplementary Table S4). Another Clostridium cluster XIVa clade member, Clostridium symbiosum, and a number of opportunistic pathogens such as E. coli, Klebsiella pneumoniae, Streptococcus salivarius, and Clostridium bolteae were overrepresented in metacommunity C ( Fig. 1a and Supplementary Table S4), suggesting that subjects in this group had impaired ability to suppress colonization by pathogenic species in their gut. We also evaluated whether metacommunities differed in the degree of dysbiosis associated with CD through computing the microbial dysbiosis index (MD-index) [5]. CD microbiotas from metacommunity C had significantly higher values on the MD-index than those from metacommunity B (P = 7.63e-05) ( Fig. 1a and Supplementary Table S1), suggesting a more severe degree of dysbiosis in this CD subgroup. Combined, these compositionally distinct metacommunities recapitulate disparate configurations of the microbiota under normal and CD conditions.
The separation of microbiomes into metacommunities was confirmed by principal coordinate analysis (PCoA), which clustered samples by both metacommunity identity and disease status (Fig. 1b). We determined whether the variations in microbiome composition were accompanied with clinical phenotypes. In CD patients, 23 clinical variables together with age correlated with microbiome variation, with uric acid and blood leukocyte numbers being the top 2 covariates (effect size > 0.2) ( Supplementary Fig. S2b). When categorized into groups, various plasma biomarkers, including inflammatory markers, were the strongest classes of covariates (effect size > 0.2) (Supplementary Fig. S2c). However, despite the existence of microbiome variations and their correlation with clinical states, no significant differences were detected for these clinical variables between metacommunity B and C CD patients ( Supplementary Fig. S2d).

CD-and metacommunity-associated functional traits
We next analyzed the functional changes associated with disease status and differences in microbiome structure. We made pair-wise comparisons after performing functional annotation using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (KEGG, RRID:SCR 012773). A large number of CD-and metacommunity-related functional shifts were identified at the level of pathways and modules (Fig. 2a, Supplementary Table S5  and Supplementary Table S6). We observed consistent changes in CD microbiotas in all within-or between-metacommunity comparisons (in B-CD vs A-CT, C-CD vs A-CT, B-CD vs B-CT, and C-CD vs B-CT) (Fig. 2a). The composition of the microbiota of CD patients indicated consistent changes in the potential for carbohydrate utilization compared to the CT counterparts with a decreased abundance of pathways involved in starch and sucrose metabolism, and enrichment of pathways involved in simple carbon metabolism such as fructose, mannose, and galactose in the microbiotas of CD patients (Fig. 2a). In addition, we observed an enrichment of genes in pathways involved in glyoxylate, dicarboxylate, propanoate, and butanoate metabolism, as well as in pathways involved in transport of simple sugars (phosphotransferase system) (Fig. 2a). Interestingly, the reporter scores of numerous amino acid metabolic pathways exhibited marked decreases or increases in CD patients compared to CTs, suggesting possible significant changes in the amino acid metabolic profiles (Fig. 2a). Of note, the potential for methane metabolism was also diminished in CD patients (Fig. 2a). By contrast, microbes in CD patients exhibited enhanced potential for xenobiotic degradation (e.g., of toluene, fluorobenzoate, styrene, benzoate, dioxin, and xylene) and antioxidant defense (e.g., ascorbate, aldarate, and glutathione metabolism) (Fig. 2a). In parallel, a number of pathways associated with pathogenesis and virulence, including ABC transporters, bacterial secretion system, and general lipopolysaccharide (LPS) biosynthesis, exhibited an incremental enrichment from metacommunity A to C (Fig. 2a).
LPS, an inherent component of Gram-negative bacteria, is an endotoxin that can have opposing effects on the immune response [20]. Since pathway and module analyses showed an enrichment of general LPS biosynthesis in the CD microbiome (Fig. 2a), we took a novel approach and investigated the capacity among all Gram-negative bacteria to produce the pro-inflammatory hexa-acylated LPS, as compared to the antagonizing silencing penta-acylated LPS variant [21,22]. We listed bacteria with a potential for synthesizing each LPS variant (Supplementary Table S7) and compared the abundances of these bacteria (Supplementary Table S8). The hexa-acylated LPS-producing bacteria E. coli and Morganella morganii exhibited higher abundance in CD patients from metacommunity C compared to non-CD individuals from metacommunity A (Supplementary Table S7). Consistently, compared to metacommunity A (CT), microbes in metacommunity C (CD) tended to produce LPS in a higher hexa-to penta-ratio, suggested by the increase in abundance of bacteria with the hexa-over the penta-acetylated LPS variant (Fig. 2b), which in part may account for an increased inflammatory stimulation of the CD gut.
The abundances of Gram-positive bacteria were reduced in metacommunity C, and in metacommunity B as compared to CTs (Fig. 2b). These bacteria make up the largest reservoir for production of SCFAs. SCFAs are not only colonotrophic nutrients but also immunoregulatory molecules [23] that may reduce pro-inflammatory cues within the gut environment. We estimated the abilities of the metacommunities to produce the SC-FAs acetic acid, propionic acid, and butyric acid. This was done based on the presence of the genes encoding the last enzyme within the respective biosynthetic pathway, thereby providing an alternative method for predicting the capacity for biosynthesis of the bioactive end products than that used in Fig. 2a, which was based on presence of genes involved in overall metabolic pathways. Bacteria with a potential to produce SCFAs are listed in Supplementary Table S7. Evidently, CD microbiotas, particularly those in metacommunity C, showed a decreased abundance of key genes for SCFA production, including acetic acid, propionic acid, and butyric acid, when compared to the CT microbiota in metacommunity A (Fig. 2c). Concordantly, the abundance of many SCFA-producing bacteria differed between CT and CD samples (Supplementary Table S8). Thus, the gut microbiota in CD patients likely produces a suboptimal amount of SCFAs compared to the healthy state.

Disruption of the normal gut microbial ecosystem and bacterial growth rate in CD
The structure of a microbiota is the result of dynamic interactions between community members. We generated correlationbased microbial interaction networks using the SparCC algorithm ( Fig. 3 and Supplementary Fig. S3). Since   . (b, c) Statistical comparison by Wilcoxon test followed by a Benjamini-Hochberg correction for significance level; * q < 0.2; * * q < 0.1; * * * q < 0.05; * * * * q < 0.001.
(a) (b) Figure 3: Reconstruction of microbial interaction networks by CD. Co-occurrence (blue) relationships and co-exclusion (red) between taxa were estimated by SparCC algorithm, and correlation networks were compared between non-CD samples from metacommunity A (a, A-CT) and CD samples from metacommunity C (b, C-CD).
Only relationships with coefficients above 0.3 are visualized, and the thickness of lines denotes the strength of the correlation, as indicated in the legend. Node size represents mean taxon abundance in networks, and node color represents the growth rate of each species (gray indicates no detection). Taxa of the same bacterial phylum are encircled by dashed lines.
metacommunities A and C were representative of the typical CT and CD states, respectively, we first compared the microbiome networks of these 2 groups ( Fig. 3a and b). The control microbiota in metacommunity A was characterized by a complex network of interactions between different taxa, especially within or between the dominant phyla Bacteroidetes and Firmicutes (Fig. 3a). However, the vast majority of these relationships were no longer significant in the CD patients harboring metacommunity C (Fig. 3b). Among the strong interactions lost in the gut microbiota of the C-CD group were positive correlations (r > 0.5) of Bacteroides cellulosilyticus with Bacteroides thetaiotaomicron and Bacteroides sp., and of Ruminococcus bromii with Eubacterium ventriosum (Fig. 3). Only 1 new strong correlation was formed between 2 unidentified taxa in the C-CD group (Fig. 3). Thus, the CD microbiota of metacommunity C showed not only alterations in composition, but also reduced interrelationships. In comparison, CT and CD microbiotas from metacommunity B did not differ significantly in terms of network complexity, although numerous intertaxon relationships were altered ( Supplementary Fig. S3). Changes in bacterial growth rate may contribute to alterations in community structures. We calculated the growth rate from the number of sequencing reads covering the replication origin relative to reads covering the replication termination site [24]. Compared to CTs in metacommunity A, the growth rate of many beneficial taxa decreased in metacommunity C, including the SCFA-producing bacteria Alistipes finegoldii, Alistipes shahii, Eubacterium rectale, Roseburia intestinalis, and several Faecalibacterium prausnitzii strains ( Fig. 3 and Supplementary Table S9). Interestingly, certain pathogenic or opportunistic pathogenic bacteria exhibiting an increased abundance in the C-CD group showed high growth rates (E. coli, Klebsiella pneumoniae, Bacteroides fragilis, and Streptococcus salivarius) (Fig. 3, Supplementary Fig. S4, and Supplementary Table S9). Thus, differences in growth rate likely contribute to the alterations in the relative abundance of bacteria in CTs and CD patients since the observed increase or decrease in growth rates largely concurred with their changes in relative abundance in CD samples ( Supplementary  Fig. S4). The reduction of growth rates for most bacteria in the C-CD group may also be an indicator that this metacommunity structure is unlikely to shift toward increased diversity over time without specific intervention.

Limited remodeling of CD microbiota composition by short-term EEN
Fourteen patients in our cohort underwent EEN treatment after baseline sampling and provided fecal samples after 2 weeks of treatment. We assessed whether short-term EEN was sufficient to alter the microbiome structure in CD patients. For all patients but 1 (GZCD029, marked by * in Fig. 4b), such short time intervention proved insufficient to change their metacommunity identities (Fig. 4a), in accord with no significant change in MD-indices (P = 0.20) (Fig. 4a and Supplementary Table S1). However, moderate changes occurred, as illustrated by the shift in the relative position of microbiomes along the 2 principal coordinates within pre-identified clusters (Fig. 4b).
Despite the limited remodeling of the overall microbiota composition, 2 weeks of EEN did induce a variety of functional alterations (Fig. 4c, Supplementary Table S12, and  Supplementary Table S13). In a reverse manner to CD-associated shifts, functions such as LPS biosynthesis and the bacterial  Fig. 2c. (d, e) Statistical comparison by Wilcoxon test, followed by a Benjamini-Hochberg correction for significance level, showed no changes between groups.
secretion system became less enriched, while starch and sucrose metabolism and flagellar assembly were enhanced after EEN (Fig. 4c), suggesting a partial functional recovery. However, certain CD-driven changes, such as functions associated with ribosomes, 1 carbon folate pool, phosphotransferase system, and ABC transporters, were exacerbated after 2 weeks of EEN (Fig. 4c), indicating either side effects or temporal disease progression. Nevertheless, short-term EEN did not affect the abundances of LPS-or SCFA-producing bacteria (Fig. 4d and e and Supplementary Table S10) or their growth rates (Supplementary Table S11). However, network rewiring occurred (Supplementary Fig. S5). Rather than interacting with Firmicutes, bacteria from Bacteroidetes tended to interact with each other after EEN treatment (Supplementary Fig. S5). By contrast, a majority of Firmicutes in patients after EEN treatment presented more interdependences with Proteobacteria and unclassified species compared to those before treatment ( Supplementary Fig. S5). Overall, the CD microbiota appeared relatively stable and refrac-tory to 2-week EEN intervention. Future studies will need to determine if a longer intervention period with EEN will result in restoration of normal functional microbiota in CD patients.

Discussion
Comparative metagenomic analysis of fecal samples from CD patients and heathy controls revealed pronounced global alterations in the fecal microbiota of CD patients, characterized by 2 distinct CD metacommunities comprising gradually limited bacterial diversity, by functional aberrations toward a pronounced pro-inflammatory phenotype, and by structural derangements of ecosystem networks. Metacommunities constitute a robust means to distinguish microbiotas with different traits and of distinct natures. Suggested by their signature microbes (the leverage between beneficial bacteria or opportunistic pathogens) and supported by the MD-index, metacommunity A might be representative of the healthy gut, while metacommunities B and C likely represented a moderately imbalanced and a more proinflammatory state associated with CD, respectively. Since the commensal microbiota is closely linked to the health of the host, the classification of metacommunities is a novel promising tool to stratify patients based on their microbiome configuration.
Our study identified systematic functional alterations of the CD microbiome that reflected the stressful microenvironment of the CD gut and its predisposition to inflammation. In this respect, the decline in the potential for the biosynthesis of all SC-FAs, which may modulate the activation of the immune system and temper inflammation [25,26], and the appearance of microbes producing the pro-inflammatory hexa-acetylated LPS [20] are salient manifestations of the inflammation-prone nature of the CD microbiota. Although LPS has long been established as a pathogen-associated molecular pattern that triggers immune cascades [20], it was more recently established that only the hexa-acylated LPS variant is able to activate pro-inflammatory cues via TLR4 in humans [22], while the penta-acylated LPS variant acts as an antagonist [21]. Our finding that the CD microbiota of metacommunity C was enriched in microbes producing hexa-acylated LPS is consistent with previous observations of the increased abundance of the Enterobacteriaceae family members in CD [1,[4][5][6][7], which are known to stimulate inflammation [27]. Together, these changes may severely affect the host immune system, leading to an unchecked inflammatory state in CD. The reduction in the network complexity of the CD microbiota of metacommunity C reinforced the view that a globally disturbed microbial ecosystem may contribute to this disease. The loss of reciprocal and cross-inhibitory relationships may impair the survival of beneficial microbes and create favorable conditions for the blooming of pathogens. Likewise, it appears to limit the growth of many gut bacteria found in healthy individuals. In this regard, reconstruction of the normal ecosystem and not only the mere introduction of a single or several commensal microbes may be needed to curb CD. In the case of EEN, a longer term of treatment may be needed to achieve this goal. Analysis of the fecal microbiota is widely used as a proxy for studying the gut microbiota composition because of the easiness and noninvasive nature of fecal sampling, and it has through the years resulted in deepening the understanding of the relationship between the gut microbiota and IBD [1,28]. However, new avenues of sampling procedures provide more comprehensive insights into the role played by the intestinal location of microbial species (luminal or mucosal layer attachment to the small and the large intestine) that, in combination with metagenomic sequencing, would allow for deeper insights into the interindividual diversity in ecological dysbalance in CD patients in future studies.
Taken together, our metagenome-scale characterization of the CD gut microbiome supports the notion of a shift toward enhanced pro-inflammatory capacity, which is most pronounced in individuals harboring the severe-state metacommunity C. The level of details in this analysis, also encompassing yetunannotated bacteria, may pave the way for elucidating microbial disturbances predictive for CD by enabling the discovery of composite microbial CD biomarkers. In addition, it may allow for the identification of future therapeutic targets based on microbiota signatures, thereby implementing personalized medicine to CD patients based on the individual microbiome composition.

Study cohort, EEN treatment, and sample collection
Forty-nine CD patients and 54 healthy controls were enrolled in this study at the Sixth Affiliated Hospital of the Sun Yat-sen University, Guangdong, China. All patients met the diagnostic criteria for CD, according to the Montreal classification system [29]. Patients diagnosed with diabetes or tumor, cardiovascular, kidney, liver, and metabolic diseases were excluded from this study.
Among these participants, 14 CD patients underwent EEN treatment. ENSURE R (Abbott Laboratories, Abbott Park, IL, USA), PEPTISON R , NUTRISON POWDER R (NUTRICIA, Danone, Netherlands), and FRESUBIN R (Sino-Swed Pharmaceutical Corp. Ltd, China) were used as the standard oral polymeric formulas, and their ingredients are detailed in Supplementary  Table S14. Patients chose from these formulas, with 8 patients selecting ENSURE R and the others selecting a mixture of 2 or more formulas. Formulas were consumed at 30 kcal/kg per day as the sole nutrient source. Patients who adhered to EEN treatment had their lesion healed.
Fecal samples were collected from all participants at baseline (n = 103) and from the EEN-treated CD patients after 2 weeks of treatment (n = 14), totaling 117 samples. The fecal samples were immediately frozen and stored at -80 • C until being processed. DNA extraction was performed according to the protocols described previously [30].

Co-abundance gene group identification and functional annotation
Applying the metagenomic species (MGS) clustering method [16], we clustered genes according to their covariations in abundance across samples. A group of co-abundant genes was identified as an MGS if it contained 700 or more genes. These MGS were subjected to subsequent analysis. Taxonomic assignment of the mapped genes was performed according to the Integrated Microbial Genomes (v. 400) database using an in-house pipeline detailed previously [30], with 70% overlap and 65% identity for assignment to phylum, 85% identity to genus, and 95% identity to species. The relative abundance of a co-abundance gene group was calculated from the relative abundance of its genes.
Differentially enriched KO pathways or modules were identified according to their reporter scores [32], which were calculated from the Z-scores of individual KOs.
We assessed the production capacity for the 2 LPS forms based on the abundances of genes of the entire lipid A biosynthesis pathway, and separated them into penta-acylated LPS producers (harboring all lipid A pathway genes except for LpxM) and pro-inflammatory hexa-acylated LPS producers (all lipid A pathway genes). MGS with no lipid A pathway genes were assigned as Gram-positive bacteria.
Sequences of SCFA-producing enzymes were retrieved as previously described [33]. Genes in the reference gut microbiome gene catalog [8] were identified as these enzymes (best match α-Diversity and gene count α-Diversity (within-sample diversity) was calculated on the basis of the gene profile of each sample according to the Shannon index, as described previously [30]. The total gene count in each fecal sample was determined as in Le Chatelier et al. [34]. Genes with at least 1 mapped read were considered present.

PERMANOVA of the influence of clinical and lifestyle factors
Permutational multivariate analysis of variance (PERMANOVA) [30] was performed on the gene abundance profiles of the samples to assess the effects of each of the factors listed in Table 1. We used Bray-Curtis distance and 9999 permutations in R (3.10, vegan package, R Project for Statistical Computing, RRID:SCR 001905) [35].

Details of updated LEfSe algorithm
Differential abundance analyses were performed using a less stringent LEfSe algorithm to identify feature microbes whose abundances differed at least in 1 comparison [5]. Metacommunities and subgroups in metacommunity B were included for comparisons. The biomarker relevance was ranked according to bootstrapped (n = 30) logarithmic linear discriminant analysis scores of at least 2. The open source R code is available at [36].

Effect size analysis
Twenty-four metadata covariates and their combined effect size when pooled into the broader predefined categories (blood fat, coagulation, inflammation markers, and plasma biomarkers) were estimated with the bioenv function in the vegan R package, which selects the combination of covariates with the strongest correlation to microbiota variation (Pearson correlation between Gower distances of covariates and microbiome Bray-Curtis dissimilarity) (Supplementary Fig. S2A).

Correlation network inferred by phylogenetic marker genes
Eighty-five MGS, which were previously selected via the detection of microbial community clusters through DMM modeling, were subjected to compositionality data analysis using the SparCC algorithm [37]. Taxon-taxon correlation coefficients were estimated as the average of 20 inference iterations with a strength threshold of 0.25. Correlations with corresponding empirical P-values of less than 0.01 were retained, which was calculated via a total of 10 000 simulated data sets. This set of iterative procedures was applied separately to data from CTs and CD patients, and to patients' data before and after EEN to infer the correlation values. Correlation coefficients with magnitude of 0.3 or greater were selected for visualization in Cytoscape (v. 3.3.0, Cytoscape, RRID:SCR 003032).

Additional files
Supplementary Table S1: Clinical characteristics of participants and the dysbiosis index for their gut micorbiome.
Supplementary Table S2: MGS of the IBD cohort. Clusters containing >700 genes were annotated according to available bacteria and archaea genomes, as was described previously [30].
Supplementary Table S3: Association between metacommunity and CD status. P-values from Fisher's exact tests were adjusted by Benjamini-Hochberg step-up procedure.
Supplementary Table S4: Results of differential abundance analysis on signature MGS for metacommunities. An adapted version of the LDA effect size method was applied for selecting differential MGS. Those with an LDA score over 2 were visualized in Fig. 1 and Supplementary Table S4.
Supplementary Table S5: Summary of differential abundance analysis on KEGG pathways between subgroups. Differentially enriched KO pathways were identified according to their reporter scores.