Deterministic and stochastic processes generating alternative states of microbiomes

Abstract The structure of microbiomes is often classified into discrete or semi-discrete types potentially differing in community-scale functional profiles. Elucidating the mechanisms that generate such “alternative states” of microbiome compositions has been one of the major challenges in ecology and microbiology. In a time-series analysis of experimental microbiomes, we here show that both deterministic and stochastic ecological processes drive divergence of alternative microbiome states. We introduced species-rich soil-derived microbiomes into eight types of culture media with 48 replicates, monitoring shifts in community compositions at six time points (8 media × 48 replicates × 6 time points = 2304 community samples). We then confirmed that microbial community structure diverged into a few state types in each of the eight medium conditions as predicted in the presence of both deterministic and stochastic community processes. In other words, microbiome structure was differentiated into a small number of reproducible compositions under the same environment. This fact indicates not only the presence of selective forces leading to specific equilibria of community-scale resource use but also the influence of demographic drift (fluctuations) on the microbiome assembly. A reference-genome-based analysis further suggested that the observed alternative states differed in ecosystem-level functions. These findings will help us examine how microbiome structure and functions can be controlled by changing the “stability landscapes” of ecological community compositions.


Introduction
Understanding the mechanisms by which microbial community structure is organized is a major challenge in ecology and microbiology [1][2][3][4].In a general framework of community dynamics, selection, diversification (speciation), dispersal, and drift have been considered as fundamental components of community processes [5,6] (Fig. 1).Among the four component processes [5,6], selection and drift can be regarded, respectively, as purely deterministic and stochastic components [7,8], while dispersal and diversification are considered to include both deterministic and stochastic components [8].Elucidating how those deterministic and stochastic processes collectively organize ecological community assembly is the key to predict and manage microbiome dynamics in diverse fields of applications such as human-gut microbiome therapies [9][10][11][12][13] and agroecosystem microbiome control [14][15][16].
In empirical studies of ecological communities, consequences of deterministic and stochastic ecological processes are observable as "alternative states" of community structure [3,[17][18][19].Pioneering studies on human-gut microbiomes have shown that microbiome structure can be classified into semi-discrete (partially continuous) clusters despite potential numerous combinations of species or taxonomic compositions [20][21][22][23].Although clear classification of microbiome compositions is impossible [23][24][25][26], such observations of community structure highlight the presence of strong deterministic processes organizing community structure as represented by the "stability landscape" concept of species assembly [27].Meanwhile, the presence of alternative community compositions suggests that stochastic processes play some roles in the community differentiation.In other words, because no variation in community structure is expected under the strict assumption of deterministic processes, both deterministic and stochastic processes are necessary for the existence of categorizable community compositions [18,28] (Fig. 1).Thus, empirical studies on community structure give essential insights into ecological community assembly.Nonetheless, it is basically difficult to develop detailed discussion on ecological community processes based on existing microbiome datasets because potential inf luence of environmental conditions on community structural patterns cannot be fully understood in observational studies.
In this respect, experimental studies with fully controlled environmental conditions are expected to provide ideal opportunities for examining ecological community processes in light of theories on alternative states [6,8,19].By making a number of experimental microbial communities with defined Figure 1.Conceptual framework of community assembly; (A) stability landscape of community structure; community assembly is often discussed based on the schema of "stability landscapes," which represent the stability/instability of community states (i.e.species or taxonomic compositions); alternative states are defined as bottom positions within basins on the stability landscapes; (B) deterministic and stochastic processes in community assembly; our experiment was designed to test the contributions of deterministic and stochastic ecological processes; specifically, the results of our multireplicate microbiome experiments are expected to depend on the presence/absence of selection and drift; without drift (demographic f luctuations), all replicate communities will converge to a single community state (bottom panels); with drift, differentiation of community states can occur depending on the structure of stability landscapes; in particular, if there are two or more basins, differentiation into a small number of alternative community states will be observed as a consequence of both selection and drift (top right).
environmental conditions, we can perform strict tests of the emergence of alternative community states.In other words, the presence of multiple reproducible states in the same experimental treatment is interpreted in the framework of selection, diversification, dispersal, and drift.Despite the potential contributions of the experimental approaches to our fundamental knowledge of community assembly, few attempts (but see Estrela et al. [ 29]) have been made to explore alternative states of microbiomes with tens of replications.
In this study, we examined the presence of alternative community states by performing microbiome experiments under eight nutritional (medium) conditions with 48 replicates.We constructed experimental microbiomes using a forest-soilderived community of prokaryotes as a source microbiome and then kept the 384 microbial communities (8 treatments × 48 replicates) under a fully controlled temperature condition.The experimental system was designed to examine the roles of selection and drift in microbial community assembly.With the aid of an automated pipetting system equipped in a clean laboratory environment, we monitored changes in microbiome community compositions every 2 days for 12 days based on the DNA metabarcoding of 16S rRNA gene sequences.The analysis of more than 2000 community samples then allowed us to understand how deterministic and stochastic processes could generate alternative states of ecological community structure.By exploring statistical ways for quantitatively evaluating the distributions of community compositions within state space, we extended discussions on alternative microbiome states in terms of the interplay of deterministic and stochastic ecological processes.

Terminology
In analyzing the data obtained in this study, we need to use consistent terminology to minimize the risk of confusion and misunderstanding.It is necessary to confirm the definitions of the ecological processes whose meaning can change depending on contexts.Specifically, we use the terms "selection," "drift," "dispersal," and "diversification (speciation)" in ecological processes as conceptualized by Vellend [5].Among the terms, selection represents expected changes in local community compositions resulting from differences in mean fitness between species, constituting purely deterministic processes [7].Drift refers to demographic f luctuations that occur regardless of among-species difference in mean fitness, forming purely stochastic processes [7].On the other hand, dispersal itself is not a term representing stochasticity or determinism, but it merely describes processes by which organismal individuals or their propagules move between local communities.Diversification is defined as evolutionary differentiation of genetic variants and hence it represents both stochastic (e.g.nucleotide mutation) and deterministic (i.e.natural selection) phenomena [3,8].
Another important term we need to make clear before describing our microbiome experiment is "alternative states."In ecology, the term "alternative stable states" is often used to discuss the processes by which divergence of community compositions is caused [17,19,30,31].Meanwhile, it is generally difficult to know whether the observed community compositions are at stable states (i.e.equilibria) or they are in transient processes toward stable states [32,33].Stability of communities has been a central topic in the history of community ecology [34,35].In this study, however, we investigate the divergence of community compositions irrespective of the concept of community stability.In other words, our community dataset can include information of both transient and stable states.

Culture experiments of microbiomes
In the microbiome experiment, the source microbiome derived from the soil of the A layer (0-10 cm in depth) in the research forest of the Center for Ecological Research, Kyoto University, Shiga, Japan (34.972 • N; 135.958 • E).After sampling, the soil was sieved with a 4-mm stainless mesh and then 5 g of the sieved soil was mixed in 100 ml phosphate-buffered saline with cycloheximide (137 mM NaCl, 8.1 mM Na 2 HPO 4 , 2.68 mM KCl, 1.47 mM KH 2 PO 4 , and 200 μg/ml cycloheximide).In this process, we added cycloheximide in order to exclude eukaryotes from the source microbiome.The source prokaryote microbiome was cultured at 22 • C for 48 h.Experimental design; laboratory culture system; a source microbiome deriving from forest soil was precultured with cycloheximide under room temperature for 2 days in order to remove eukaryotes; the microbiome inoculum was then introduced into eight types of media (Table S1 ) with 48 replicates; a fraction of the culture f luid was sampled every 2 days, and equivalent volume of fresh medium was added to the continual culture system throughout the 12-day experiment (8 media × 48 replicates × 6 time points = 2304 community samples).
We introduced the inoculum microbiome into eight types of media with different constitutions of carbon sources with 48 replicate communities per medium type (in total, 8 media × 48 replicates = 384 experimental communities; Fig. 2).To make the compositions of the media as simple as possible, we used M9 medium with minimal inorganic additives and combinations of three types of the carbon resources, specifically, glucose, leucine, and citrate as detailed in Table S1.We selected the medium systems with different combinations of glucose, leucine, and citrate because all the three carbohydrates have six carbons in their molecules, facilitating the adjustments of molarity in the experimental treatments.In the microbiome culture experiment, each of the eight media was designed based on the concentrations or presence/absence of the three carbon sources: i.e. low or high concentration of glucose (G/HG), with or without leucine (−/L), and with or without citrate (−/C).Hereafter, these medium types were designated as Medium-G (low glucose, without leucine, without citrate), GL (low glucose, with leucine, without citrate), GC (low glucose, without leucine, with citrate), GLC (low glucose, with leucine, with citrate), HG (high glucose, without leucine, without citrate), HGL (high glucose, with leucine, without citrate), HGC (high glucose, without leucine, with citrate), and HGLC (high glucose, with leucine, with citrate), respectively (Table S1).
In each well of a 240-μl deep-well plate, 10 μl of the diluted source microbiome solution and 190 μl of medium were installed.Based on the quantitative amplicon sequencing detailed below, the source microbiome solution was estimated to contain 3.12 × 10 6 DNA copies of the 16S rRNA gene (SD = 8.37 × 10 5 ) (i.e. the density of the 16S rRNA gene in the diluted source inoculum was 3.12 × 10 5 copies/μl; Fig. S1).Note that a previous research estimated that a single cell of Enterobacteriaceae bacteria has an average of 7.3 DNA copies of 16S rRNA genes (SD = 0.9; calculated with rrnDB; [36]), although variation in 16S rRNA gene copy numbers among prokaryote taxa has been known [36].The deep-well plate was kept shaken at 200 rpm using a plate thermo-shaker BSR-MB100-4A (Bio Medical Sciences Co. Ltd, Tokyo) at 30 • C for 2 days.After 2 days of incubation, 190 μl out of the 200-μl culture medium was sampled from each of the 48 wells after mixing (pipetting) every 2 days for 12 days.All pipetting manipulations were performed with high precision using an automatic pipetting machine (EDR-384SR; BIOTEC Co. Ltd, Tokyo) placed in a laminar f low cabinet.In each sampling event, 190 μl of fresh medium was added to each well so that the total culture volume was kept constant.In total, 2304 samples (384 communities/day × 6 time points) were collected.
Note that our study was not designed to examine historical contingency (priority effects) because the microbial species were simultaneously introduced into the experimental media.Dispersal between replicate communities was prohibited, and diversification leading to speciation events was unlikely to occur in the 12-day experiment.Thus, our aim in this study was to examine the contributions of selection and drift to microbiome dynamics (Fig. 1).

DNA extraction
To extract DNA from each culture sample, 5 μl of the collected aliquot was mixed with 1 μl lysozyme solution (50 mg/ml lysozyme [Sigma], 20 mM Tris-HCl [pH 8.0], 2 mM EDTA), and the mixed solution was incubated at 37 • C for 2 h.After adding proteinase K solution (1/30 [v/v] Proteinase K [Takara], 20 mM Tris-HCl [pH 8.0], 2 mM ethylenediaminetetraacetic acid (EDTA)), the aliquot was incubated at 55 • C for 3 h and 95 • C for 10 min.The solution was then vortexed for 10 min to increase DNA yield.
We also extracted DNA from the inoculum aliquot to reveal community structure of the source microbiome.Because the source inoculum was expected to include high concentrations of soil-derived compounds, which can inhibit polymerase chain reactions (PCRs), a commercial DNA extraction kit optimized for soil samples was used.Specifically, after 50 μl of the inoculum sample was incubated with 350 μl SDS buffer with proteinase K

PCR and DNA sequencing
For the samples of the experimental microbiomes, prokaryote 16S rRNA V4 region was PCR-amplified with the forward primer 515f fused with 3-6-mer Ns for improved Illumina sequencing quality [37] and the forward Illumina sequencing primer (5 -TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG-[3-6-mer Ns]-[515f]-3 ) and the reverse primer 806rB fused with 3-6-mer Ns and the reverse sequencing primer (5 -GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA G [3-6-mer Ns]-[806rB]-3 ) (0.2 μM each).The buffer and polymerase system of KOD One (Toyobo) were used with the temperature profile of 35 cycles at 98 • C for 10 s, 55 • C for 5 s, and 68 • C for 1 s.To prevent generation of chimeric sequences, the ramp rate through the thermal cycles was set to 1 • C/s [38].Illumina sequencing adaptors were then added to respective samples in the supplemental PCR using the forward fusion primers consisting of the P5 Illumina adaptor, 8mer indexes for sample identification [39], and a partial sequence of the sequencing primer (5 -AAT GAT ACG GCG ACC ACC GAG ATC TAC AC-[8-mer index]-TCG TCG GCA GCG TC-3 ) and the reverse fusion primers consisting of the P7 adaptor, 8-mer indexes, and a partial sequence of the sequencing primer (5 -CAA GCA GAA GAC GGC ATA CGA GAT-[8-mer index]-GTC TCG TGG GCT CGG-3 ).KOD One was used with a temperature profile of 8 cycles at 98 • C for 10 s, 55 • C for 5 s, and 68 • C for 5 s (ramp rate = 1 • C/s).The PCR amplicons of the samples were then pooled after a purification process with the AMPureXP Kit (Beckman Coulter).Primer dimers, which were shorter than 200 bp, were removed from the pooled library by supplemental purification with AMPureXP: the ratio of AMPureXP reagent to the pooled library was set to 1 (v/v) in this process.This library was further purified with E-gel SizeSelect 2 (Invitrogen), and then ca.440-bp DNA fragments were selectively obtained.The sequencing libraries were processed in an Illumina Miseq sequencer (271 forward [R1] and 31 reverse [R4] cycles; 20% PhiX spike-in).
For the source microbiome sample, the prokaryote 16S rRNA V4 region was amplified as well.To estimate concentrations of 16S rRNA genes included in the inoculum, a quantitative amplicon sequencing platform was applied by introducing five "standard DNA" fragments with controlled concentrations to the PCR master mix solution of the first PCR process as detailed elsewhere [40,41].The standard DNAs were used for the in silico calibration of 16S rRNA gene concentrations in the target sample after sequencing as detailed in the previous study [40][41][42].

Bioinformatics
In total, 25 284 304 sequencing reads were obtained with the Illumina sequencing.The raw sequencing data obtained in the Illumina sequencing were converted into FASTQ files using the program bcl2fastq 1.8.4 distributed by Illumina.The output FASTQ files were then demultiplexed with the program Claident v0.2.2018.05.29.The sequencing reads were subsequently processed with the program DADA2 [43] v.1.18.0 of R 3.6.3 to remove low-quality data.In this process, the filterAndTrim function of DADA2 was used with the following criteria: minLen = 200, minQ = 10, PercLQ = 0.1, trancWindow = 5, maxEE = 3, truncR = 240, and avgqual = 15.Potentially chimeric sequences were removed by the removeBimeraDenovo function of DADA2.The molecular identification of the obtained amplicon sequence variants (ASVs) was performed based on the naive Bayesian classifier method [44] with the SILVA v.138.1 database [45].Based on the in silico calibration of the quantitative amplicon sequencing data [40,41], the number of DNA copies in 10 μl of source microbiome (inoculum) was estimated to be 3.12 × 10 6 copies as mentioned above (Fig. S1).

Community-level diversity
The rarefaction curves representing relationship between the number of sequencing reads and the number of ASVs were drawn using the vegan 2.6.4 package [46] of R. In the sequencing of experimental culture samples, the diversity of microbial ASVs reached plateaus along the axis of the number of sequencing reads (Fig. S2).Given the rarefaction curves, the dataset was rarefied to 5000 reads per sample with the rrarefy function of the R vegan package.Of the 2304 samples (8 treatments × 48 replicates × 6 time points), 2250 samples with >5000 reads were used in the following pipeline.After screening for the replicate communities for which sequencing data were available for all the six time points, 2094 samples were subjected to the following statistical analyses.In total, 718 prokaryote ASVs belonging to 19 phyla, 32 classes, 74 orders, 89 families, and 115 genera were detected.
For each sample, two types of α-diversity indices, ASV richness and Shannon H , were calculated.We then compared α-diversity among each medium using Student's t-test corrected by false discovery rate (FDR) by Benjamini-Hochberg method (Table S2).

Overview of the community structure
To visualize the diversity of the prokaryote community structure, an analysis of nonmetric multidimensional scaling (NMDS) was performed based on the Bray-Curtis metric of β-diversity.Likewise, to examine the dependence of community structure on medium conditions, a series of permutational multivariate analysis of variance (PERMANOVA) [47] was performed with 50 000 permutations.In each PERMANOVA model, glucose concentration (high or low; df = 1), the presence/absence of leucine (df = 1), or the presence/absence of citrate (df = 1) was included as the explanatory variable.An additional model including all the medium conditions and interactions between them (df = 7) was examined as well.The coefficients of determination (R 2 ), which indicated the proportion to which given explanatory variables explained variation in dependent variables (community structure), were shown for each of the above PERMANOVA models ("glucose concentration," "leucine," "citrate," or "all" model).Each PERMANOVA model was supplemented by permutational analysis of dispersion (PERMDISP; [48]) for potential effects of the medium condition on dispersion in community structure.The NMDS, PERMANOVA, and PERMDISP were performed for each dataset of the ASV-, genus-, and family-level compositions of prokaryote communities using the vegan 2.6.4 package [46] of R.

Observed community structural differentiation versus null-model expectation
To examine the differentiation of community compositions among replicate samples, the Bray-Curtis metric of β-diversity was calculated for respective pairs of samples collected on the same day in each of the same treatments (medium conditions).Note that the Bray-Curtis metric of β-diversity could range from 0 (identical community structure) to 1 (no overlap of microbes).A histogram of the community structural difference was shown for each day in each experimental treatment.If a small number of alternative states of community structure exist under a medium condition, there can be some peaks of community structural difference within a histogram.
To evaluate whether the obtained histograms of community dissimilarity (β-diversity) represented unimodal distributions, we performed an analysis comparing observed histograms to those expected under purely stochastic models.We obtained average community structure for each time point in each medium condition for each of the ASV-, genus-, and family-level datasets.Stochasticity in initial community compositions was simulated by randomly sampling microbes from the average community structure based on multinomial distributions: the size of sampling was set to 5000 (i.e. the number of sequencing reads per sample in our empirical data) in each of the 48 simulated replicate communities.The subsequent stochastic f luctuations of the community structure were then introduced based on random-walk simulations.Each run of the random-walk simulations was performed through the following three steps.First, the Bray-Curtis metric of β-diversity was calculated for respective pairs of the simulated communities, and the median of β-diversity values was obtained (Step 1).Second, if this median was lower than the median of the empirical data, then one of the 5000 entries (ASVs, genera, or families) in each of the 48 simulated replicate communities was randomly selected and this entry was duplicated (Step 2).Third, one of the 5001 entries was randomly selected and deleted (Step 3).If the median of β-diversity values exceeded that of the empirical data, this random-walk process was completed.Otherwise, the random-walk simulation was repeated from the Step 1.Throughout the null-model analyses, the simulated distributions of community dissimilarity with the same degree of variance as the empirical data were obtained.
We then compared the histograms of the simulated amongreplicate dissimilarity with observed histograms.If all the microbiome compositions within the state space are equally possible (i.e.lack of deterministic processes) and only stochastic events (initial stochasticity and subsequent random-walk dynamics) operate, unimodal distributions would be observed in the empirical histograms as in the simulated histograms.Alternatively, if the presence of both deterministic and stochastic processes results in the divergence of community states, multimodal distributions would be present in the empirical histograms of community dissimilarity among replicate samples.We quantified the deviation of the observed community dissimilarity distributions from the simulated ones based on the Kullback-Leibler metric of divergence as follows: where x denotes the Bray-Curtis β-diversity between replicate samples, P(x) was the observed distributions of Bray-Curtis βdiversity (i.e.empirical data), and Q(x) was the simulated distributions of Bray-Curtis β-diversity.The Kullback-Leibler divergences were calculated by using the philentropy 0.7.0 package [ 49] of R. The larger the Kullback-Leibler divergence is, the more it expresses that the two distributions are different.The analysis was performed for each day in each experimental treatment (medium condition) for each of the ASV-, genus-, and family-level community compositional datasets.

Likelihood ratio tests of dissimilarity distributions
We statistically examined whether the observed histograms of community dissimilarity could be explained by unimodal or multimodal distributions.In this analysis, we applied Gaussian mixture modeling [50].For a given number of normal distributions to be combined (number of components; G), the variance, mean, and relative weights to the other components of normal distributions are estimated.Let each overlapping normal distribution be f (x, θ ) (θ includes mean and variance), and let the distribution of n samples (here, a sample is each dissimilarity of the pairwise communities) be X = {x 1 , x 2 , . . ., x i , . . ., x n }, we can estimate the parameter that maximizes the following equation: where = {π 1 , . . ., π G−1 , θ 1 , . . ., θ G } are the parameters of the mixture model and (π 1 , . . ., π G−1 ) are the mixing weights.Note that π K is defined for each pair of mixture components: when G = 1, the total number of parameters is 2 (θ 1 including mean and variance); when G = 2, the total number of parameters is 5 (each of θ 1 and θ 2 includes mean and variance and mixing weight between components 1 and 2 is denoted by π 1 ; π 2 is defined as In our analysis of the empirical data, statistical models assuming unimodal distributions of community dissimilarity and those assuming two or more modes were compared based on the Bayesian information criterion (BIC).The parameter fitting was performed by the mclust 6.0.0 package [50] of R, and the parameters was obtained via the expectation-maximization (EM) algorithm in this package.We then performed likelihood ratio tests by assuming that deviance (i.e. 2 × log-likelihood-ratio) followed chi-square distributions.The tests were performed for each of the ASV-, genus-, and family-level datasets.

Silhouette method for inferring the number of clusters
We investigated whether the communities belonging to each time point and medium can be divided into a number of clusters using k-medoids clustering based on the Bray-Curtis distance between communities and cluster number selection using the average silhouette coefficient [51].In k-medoids clustering, a standard unsupervised learning approach, replicate microbiomes in each experimental treatment were divided into k clusters according to a given number of clusters k.For each k, the average silhouette coefficient were calculated, and the k that exhibits the highest average silhouette coefficient was selected as the optimal number of clusters.The average silhouette coefficient was calculated by using the cluster 2.1.4package [52].

Potential differentiation in community-level functions
To examine the potential differentiation of ecosystem functions among the replicate communities, we inferred the communitylevel functional profiles based on the phylogenetic estimation of gene repertoires with PICRUSt2 [53].The gene-repertoire data were then subjected to NMDS, PERMANOVA, and PERMDISP based on the Bray-Curtis β-diversity.In total, the relative functional compositions of 392 metabolic pathways/processes were examined as input data.Based on the inferred profiles of metabolic pathways/processes, histograms of the Bray-Curtis β-diversity among replicate samples were drawn at respective time points within respective treatments.
The metabolic pathway/process information was also used to compare potential functional properties among the clusters of microbial compositions.For each of the ASV-, genus-, and family-level data of community compositions, mean pathway abundances were calculated for each of the clusters inferred with the average silhouette coefficient.The metabolic pathways whose pathway abundances varied considerably among those community compositional clusters were explored based on the Shannon entropy metric: 30 metabolic pathways/processes with the lowest Shannon entropy scores were selected from the 392 metabolic pathways/processes.

Overview of the community structure
In each of the experimental treatments (medium conditions), taxonomic richness and Shannon's diversity index at the ASV, genus, and family levels decreased through time (Figs 3A, S3, and S4).The α-diversity indices varied depending also on medium conditions (Table S2).The addition of leucine or citrate, for example, significantly increased α-diversity of the experimental microbiomes (FDR: q < 0.01) (Fig. 3A; see Table S2).In all the experimental treatments, the decline of the number of ASVs reached plateaus until Day 8 (Fig. S4).
At the genus level, the microbiomes were dominated by the four genera, Klebsiella, Raoultella, Pseudomonas, and Cedecea, although there was substantial variation in the balance of these taxa among the medium conditions examined (Fig. 4).For example, in the media containing citrate (Medium-GC, GLC, HGC, and HGLC), the relative abundance of Citrobacter was higher than in other medium conditions.In the media containing leucine (Medium-GL, GLC, HGL, and HGLC), Cedecea were more abundant than in other medium conditions.Moreover, within each of the medium conditions, Serratia became dominant only in some replicate communities (Fig. 4).
At the family level, the communities were characterized by Enterobacteriaceae, Pseudomonadaceae, and Comamodaceae, with Enterobacteriaceae being particularly dominant (Fig. S6).Comamodaceae tended to appear at high proportions in the media with high concentration of glucose.The substantial amongreplicate variation in community compositions observed at the ASV-and genus-level analyses was also evident in the family-level analysis.Specifically, in the media containing leucine (medium-GL, GLC, HGL, and HGLC), Yersiniaceae, which includs Serratia, dominated only in some replicate communities within each experimental treatment.
At all-time points, differences in community compositions were significantly explained by differences in medium (Figs 3B, S7-S8 and Table S3).The results also indicated that the overall variance explained by medium conditions decreased from Day 4 to 12 at the ASV level (Fig. 3B).Among the components of the medium conditions, concentrations of glucose had the largest contributions to community structure at the family level (Fig. S7).Meanwhile, the presence/absence of leucine had the highest impacts on the ASV-and genus-level community structure until Day 6 (Figs 3B and S7).The community structure within the NMDS plot distributed depending on both time points and medium conditions (Fig. 3C).The PERMDISP indicated that the concentration of glucose and the presence/absence of citrate greatly inf luenced dispersion of family-level community structure among samples (Fig. S8 and Table S4).In contrast, the presence/absence of leucine was the major determinant of community structural dispersion at the genus-and ASV-level (Fig. S8).

Community structural differentiation
In some medium conditions (experimental treatments), large community structural differences among replicate samples were observed.For example, in the Medium-GLC treatment, substantial differences in community compositions were observed among replicate samples at the ASV, genus, and family levels, and these differences seemingly increased until Day 10 (Figs 5, S9, and S10).The community compositions seemed to be classified into some categories within the NMDS plot (Figs 5, S9, and S10) due to the dominance of Serratia in some, but not all, replicate samples (Fig. 4).
In fact, the histogram of community structural difference between samples (i.e.Bray-Curtis β-diversity between pairs of replicate samples) showed seemingly bimodal or multimodal distributions (Figs 6, S11, and S12).Such bimodal or multimodal distributions of β-diversity were observed in all the eight medium conditions (Fig. 6), although among-replicate differentiation in community structure on the NMDS surface was conspicuous in some treatments (Medium-G, GL, GLC, HGL, HGC, and HGLC) but not in others (Medium-GC and HG) (Fig. 5).
Within the deep-well plate used in the experiment, substantially different community compositions were observed between adjacent replicate wells in each experimental treatment (Figs S13 and S14).This fact suggests that potential fine-scale heterogeneity of temperature or humidity within the small culture plate does not fully explain the observed among-replicate divergence of community structure.

Observed community structural differentiation versus null-model expectation
The deviations of the observed distributions of community compositional dissimilarity from simulated distributions were then quantified based on the Kullback-Leibler divergence metric.The empirical data with clear multimodal distributions were highlighted by high Kullback-Leibler divergence (e.g. 1 ≤ Kullback-Leibler divergence; Figs 6, S11 and S12, and S15).

Likelihood ratio tests of dissimilarity distributions
The BIC-based model selection in the Gaussian mixture modeling showed that the observed distributions of community compositional dissimilarity (β-diversity) were better explained by models assuming two or more modes than those assuming unimodal distributions.The subsequent likelihood ratio tests further indicated the presence of multiple peaks within the distributions of β-diversity among replicate samples (FDR < 10 −9 for all of the 48 media × day combinations; Table S5).

Silhouette method for inferring the number of clusters
On average, 2.56 (SD = 0.85), 2.56 (SD = 1.11), and 2.81 (SD = 1.20) clusters were identified for the community compositional data at the ASV, genus, and family levels, respectively (Figs S16-S18; Table S5).Replicate samples belonging to respective clusters are shown on the 2D surfaces of NMDS (Figs S16 and S17 and S18).

Potential differentiation in community-level functions
As observed in the above analyses on the community compositions, functional profiles of the community samples significantly differed depending on the medium conditions (Fig. 7A and B).Moreover, among-replicate differentiation was evident in the functional profile dataset (Fig. 7C and D), although the extent of such differentiation varied among medium conditions (Figs S19  and S20).
An additional analysis further suggested that the clusters defined based on taxonomic compositions differed substantially in the inferred functional profiles.The metabolic pathways/processes whose abundances showed the greatest variation among community compositional clusters involved teichoic acid (poly-glycerol) biosynthesis, vanillin/vanillate degradation, and bacteriochlorophyll-a biosynthesis (Fig. S21).

Discussion
Based on the experimental design with many replicates in each of the multiple treatments, we examined how microbial community assembly was driven by deterministic and stochastic  S2; (B) dependence of community structure on medium conditions; in each PERMANOVA model of ASV-level community compositions, glucose concentration (high or low; df = 1), the presence/absence of leucine (df = 1), or the presence/absence of citrate (df = 1) was included as the explanatory variable; an additional model including all the medium conditions and interactions between them (df = 7) was examined as well; the coefficients of determination (R 2 ), which indicate the proportion to which given explanatory variables explain variation in dependent variables (community structure), are shown for each of the above PERMANOVA models ("glucose concentration," "leucine," "citrate," or "all" model); (C) overview of the community structure; the ASV-level community compositions are shown along the axes of NMDS (stress = 0.157).We then observed community compositional variation both among and within experimental treatments (Figs 3-5).Although our experiment is based on a single source microbiome and hence the generality of the findings should be strictly examined in future studies expanding the experimental approach, the results provide an opportunity for discussing how deterministic and stochastic processes operate in microbiome assembly.Hereafter, we discuss the ecological processes potentially underlying the microbiome variation in the framework outlined in Fig. 1.
In terms of deterministic ecological processes, the importance of selection has been discussed for decades in microbiology [7,8,[56][57][58][59][60]. Indeed, in our study, the community structure differed remarkably depending on the medium conditions (Fig. 3), indicating that selection [8,59] operated in the community processes of the experimental microbiomes.For example, Citrobacter, which has the ability to metabolize citrate, occurred almost exclusively in the media containing citrate (Fig. 4).Moreover, the observed distributions of the community compositional dissimilarity among replicate samples deviated quantitatively and qualitative from what was expected under purely stochastic processes (Figs 6 and S11 and S12).Thus, selection can be regarded as a fundamental mechanism determining community structure.
In addition to selection, drift was shown to organize the community structure of the experimental microbiomes.We found that community compositions could diverge into a small number of reproducible alternative states even under the same environmental (medium) conditions (Figs 3-6).This finding indicates that selection and drift could collectively generate alternative states of community structure in microbiome dynamics (Fig. 1).On a stability landscape of community structure, divergence into a small number of reproducible states never occur without selection (Fig. 1).In addition, stochastic processes like drift (demographic f luctuations) are necessary for such divergence because without stochasticity, identical consequences are always expected.Meanwhile, careful interpretation is required when we discuss the stochastic processes caused by drift.In our experiment, drift might inf luence the community assembly not exclusively at the (pre-)colonization stage (i.e.stochastic sampling effects in the pipetting of inoculum microbiomes) but also at the postcolonization stage (i.e."random walk" f luctuation of community compositions).The relative contributions of precolonization and postcolonization stochastic events need to be examined in future studies.In our present experiment, 3.12 × 10 6 DNA copies of the 16S rRNA gene were introduced into each replicate community at the precolonization stage, while 10 μl out of 200 μl of culture media was left for successive time points (i.e. 5% dilution rate in each transition between time points) at the postcolonization stage.Experiments comparing multiple source microbiome density and multiple post-colonization bottleneck levels will provide further crucial knowledge of how drift can drive community dynamics leading to alternative states.
In our results, community structural differences, which were evident at the ASV or genus levels (Figs 3-6), were less conspicuous at the family level (Figs S6, S9, and S11) partly due to the dominance of bacteria belonging to Enterobacteriaceae.Such convergence of community structure at higher taxonomic levels may stem from redundancy in functional compositions of microbial communities as has been reported in previous studies [9,61,62].However, in some experimental treatments (e.g.Medium-GLC), substantial divergence of taxonomic compositions among replicate communities was evident even at the family level (Fig S6).Thus, substantial functional differentiation could occur through the emergence of alternative states of microbiome structure.
In fact, a reference-genome-based analysis suggested that divergence into alternative community structure could entail functional differentiation of microbial communities.The inferred community-level gene repertoires were differentiated into some clusters within each experimental treatment (Fig. 7C), resulting in bimodal or multimodal distributions of pairwise community dissimilarity (Fig. 7D).We further found that the clusters identified in the community compositional analyses differed substantially in their metabolic pathway/process profiles (Fig. S21).This observation is of particular interest because knowledge of the processes driving divergence into functionally different microbiome states is essential in diverse fields of applied sciences.In medicine, for example, the structure of human gut microbiomes varies considerably among host individuals, exhibiting complex associations with host human health [20,22].Furthermore, recent studies have suggested that fishassociated and plant-associated microbiomes can be classified into several community compositional clusters potentially differing in physiological impacts on host organisms [14,63].To extend the discussion on functional divergence of microbiomes, the genomic information of the microbes constituting microbiomes need to be enriched with shotgun metagenomic analyses [64][65][66].
From a detailed inspection of the experimental results, we found that shifts between alternative states could occur in microbial community dynamics.Although differences in community structure were expanded from Day 2 to Day 10, transitions between alternative states seemed to occur within the NMDS plots in some experimental treatments (Medium-GL, GLC, and HGL; Fig. 5).This observation illuminates the ecological theory that transitions between alternative stable states are possible if demographic f luctuations of microbial populations within the communities are large enough to cross the "boundaries" splitting basins of stability landscapes [30,31].Alternatively, the observed dynamics may be interpreted as transient dynamics toward a large basin within a stability landscape [19,33].Although it is notoriously difficult to distinguish alternative transient states from alternative stable states based on current frameworks of empirical datasets, further feedback between theoretical and empirical investigations will promote our understanding of large shifts in community structure [42].Besides, quantitative evaluation of the degree to which community compositions were distributed within state space (Figs S16-S18) will help us establish a comprehensive conceptual framework beyond dichotomy between discrete and continuous community states.
The fact that patterns in the divergence into alternative states differed among environmental conditions give significant implications for the "controllability" of microbiomes.We found that the number and structure of alternative states differed depending on medium conditions (Figs 5 and S9 and S10).This result suggests that the shapes of underlying stability landscapes, which are formed by selection, can be changed by the addition of specific chemicals to the microbial ecosystems.Such changes in stability landscape structure have been intensively discussed in theoretical ecology [31] but explored in a few empirical studies [67,68].Thus, this study indicates that microbiome experiments under a series of environmental conditions provide ideal opportunities for investigating how microbiome structure can be managed by changing the structure of background stability landscapes based on the manipulation of environmental conditions.
Although the experiment using field-collected source microbiomes allowed us to explore broad state space of possible community structure, experiments based on explicitly defined sets of microbial species will provide complementary insights.A more mechanism-based understanding may become possible by conducting similar experiments on multispecies systems with known genomes and metabolic pathways.In this respect, experiments on synthetic communities (SynCom) [69][70][71] are expected to promote reductionistic understanding of microbial community processes [72,73].With the aid of genome-based metabolic modeling [74][75][76], for example, potential consequences of competitive and facilitative interactions between microbial species may be inferred at the community level [66].For further understanding of microbiome assembly, it is also important to apply empirical frameworks for describing complex time-series processes of ecological communities [42,77,78] in light of shifts between alternative states [42].Exploring potential contributions of phages to stochastic dynamics of bacterial communities is another intriguing direction of research.Interdisciplinary studies based on microbiology, genomics, and theoretical ecology will reorganize our knowledge of the stability and dynamics of microbiomes.analyzed the data.Ibuki Hayashi and Hirokazu Toju wrote the paper with HF.

Figure 2 .
Figure 2.Experimental design; laboratory culture system; a source microbiome deriving from forest soil was precultured with cycloheximide under room temperature for 2 days in order to remove eukaryotes; the microbiome inoculum was then introduced into eight types of media (TableS1) with 48 replicates; a fraction of the culture f luid was sampled every 2 days, and equivalent volume of fresh medium was added to the continual culture system throughout the 12-day experiment (8 media × 48 replicates × 6 time points = 2304 community samples).

Figure 3 .
Figure 3. Overview of the community structure; (A) α-diversity of the communities; in each experimental treatment (medium condition), Shannon's diversity index for ASV-level community compositions is shown for each day; the results of Student's t-test are shown in TableS2; (B) dependence of community structure on medium conditions; in each PERMANOVA model of ASV-level community compositions, glucose concentration (high or low; df = 1), the presence/absence of leucine (df = 1), or the presence/absence of citrate (df = 1) was included as the explanatory variable; an additional model including all the medium conditions and interactions between them (df = 7) was examined as well; the coefficients of determination (R 2 ), which indicate the proportion to which given explanatory variables explain variation in dependent variables (community structure), are shown for each of the above PERMANOVA models ("glucose concentration," "leucine," "citrate," or "all" model); (C) overview of the community structure; the ASV-level community compositions are shown along the axes of NMDS (stress = 0.157).

Figure 4 .
Figure 4. Variation in community structure among replicate samples; for each replicate community in each experimental treatment, changes in genus-level community compositions (relative abundance) are shown; the numbers shown at the top of the bar plots refer to time points (days); the replicate samples were ordered based on unweighted pair group method with arithmetic mean (UPGMA) analyses performed on Day 12 for respective experimental treatments; the order of replicate communities on successive days is the same as that on Day 12.

Figure 5 .
Figure 5. Time-series changes in community structure; for each replicate community in each experimental treatment, time-series changes in ASV-level community structure are shown with arrows on the NMDS surface defined in Fig. 3C.

Figure 6 .
Figure 6.Histograms of community structural differentiation; for each experimental treatment, difference in ASV-level community structure (Bray-Curtis β-diversity) between replicate communities is shown as a histogram for each day; the distributions simulated by assuming purely stochastic processes (stochasticity at colonization events and subsequent random-walk processes) and the distributions observed in the empirical data are shown; each panel is color coded according to the score of Kullback-Leibler divergence (KL divergence) between the simulated and observed distributions (Fig. S15); a higher KL divergence score represents greater difference between the simulated and observed distributions; see Figs S11 and S12 for results based on the genus-and family-level community structure.

Figure 7 .
Figure 7. Functional profiles of the microbiome; (A) overview of functional compositions; based on a phylogenetic inference of metabolic pathway compositions (a reference-genome analysis of constituent bacteria based on PICRUSTs2), community samples are plotted on a NMDS surface (stress = 0.125); (B) dependence of community functional profiles on medium conditions; in each PERMANOVA model of metabolic pathway/process compositions, glucose concentration (high or low; df = 1), the presence/absence of leucine (df = 1), or the presence/absence of citrate (df = 1) was included as the explanatory variable; an additional model including all the medium conditions and interactions between them (df = 7) was examined as well; the coefficient of determination (R 2 ) is shown for each day; (C) time-series changes in community functional profiles; the results on two experimental treatments (Medium-GLC and HGC) are shown; see Fig. S19 for full results; (D) histograms of community functional differentiation; the results on two experimental treatments (Medium-GLC and HGC) are shown; see Fig. S20 for full results.