FLYCOP: metabolic modeling-based analysis and engineering microbial communities

Abstract Motivation Synthetic microbial communities begin to be considered as promising multicellular biocatalysts having a large potential to replace engineered single strains in biotechnology applications, in pharmaceutical, chemical and living architecture sectors. In contrast to single strain engineering, the effective and high-throughput analysis and engineering of microbial consortia face the lack of knowledge, tools and well-defined workflows. This manuscript contributes to fill this important gap with a framework, called FLYCOP (FLexible sYnthetic Consortium OPtimization), which contributes to microbial consortia modeling and engineering, while improving the knowledge about how these communities work. FLYCOP selects the best consortium configuration to optimize a given goal, among multiple and diverse configurations, in a flexible way, taking temporal changes in metabolite concentrations into account. Results In contrast to previous systems optimizing microbial consortia, FLYCOP has novel characteristics to face up to new problems, to represent additional features and to analyze events influencing the consortia behavior. In this manuscript, FLYCOP optimizes a Synechococcus elongatus-Pseudomonas putida consortium to produce the maximum amount of bio-plastic (PHA, polyhydroxyalkanoate), and highlights the influence of metabolites exchange dynamics in a four auxotrophic Escherichia coli consortium with parallel growth. FLYCOP can also provide an explanation about biological evolution driving evolutionary engineering endeavors by describing why and how heterogeneous populations emerge from monoclonal ones. Availability and implementation Code reproducing the study cases described in this manuscript are available on-line: https://github.com/beatrizgj/FLYCOP Supplementary information Supplementary data are available at Bioinformatics online.


COMETS (Computation Of Microbial Ecosystems in Time and Space)
combines different metabolic models to design and simulate interactions in microbial consortia in a dynamic context (time and/or space). COMETS is a hybrid model (Perez-Garcia et al. , 2016) that simulates dynamic of metabolite and biomass of multiple strains in the same physical space. The temporal dynamic of this hybrid model is simulated with the updates of the shared media metabolite concentrations and the update of biomasses, according to the metabolic activity of the organisms in the community. COMETS allows to update the metabolite concentrations with different algorithms. The default approach is to update each metabolite multiplying the uptake by the biomass of each strain secreting or consuming that metabolite. The biomass of each strain within the consortium is predicted following the same algorithm in COBRA, i.e. Flux Balance Analysis (FBA), although with some differences. For instance, it runs along several consecutive time slots in a similar way to a dynamic FBA, but it uses a dynamic computation of the bounds of the exchange reactions taking into account the updated concentration of the corresponding metabolite in the media in each time slot, which could be lower than the original bound in the model. Many other parameters could be configured, following the available documentation in the COMETS website ( http://www.bu.edu/segrelab/comets ). SMAC (Sequential Model-based Algorithm Configuration) (Hutter et al., 2011) allows to automate a model parameterization with an iterated local search, guided by ad-hoc fitness. Stochastic local search is one of the most efficient and widely used methods for solving combinatorial problems, which are characterized with an exponential or higher (such as NP) complexity, where approximate solutions are often useful and easier to compute. It allows exhaustive exploration of big state spaces to be avoided, finding good solutions more efficiently than systematic search; although these kinds of algorithms could be incomplete, this means they do not guarantee to find a solution even if one exists. There are different stochastic local search methods (simulated annealing, tabu search, etc.), where the best are the hybrid ones, i.e. combinations of several simple search strategies (such as iterated local search or evolutionary algorithms).
At the beginning, SMAC establishes an initial (default) configuration as a solution. The best solution found is preserved until a better one is reached (i.e. lower fitness, since SMAC minimizes the fitness value) or the algorithm ends; when the pre-defined number of iterations or timeout is reached. In each iteration, random modification of some of the parameter values are applied; first, with small modifications, searching in the neighborhood of the new solution. Where a better configuration is not achieved after several iterations, the algorithm then applies large changes in the values, trying to go out from a local minimum, and exploring a distant area in the solution space. Specific details about condition-specific SMAC setups can be found in the proper Results sections. The problems solved with FLYCOP described in this manuscript applied SMAC v2.10. (Zomorrodi and Maranas, 2012) and d-OptCom (Zomorrodi et al. ,2014)). OptCom allows modeling or evaluation of consortia, not suggesting automatic configuration as FLYCOP does. OptCom requires several executions with the initial compound concentration in each time slot, versus FLYCOP capturing temporal changes. This dynamic limitation was solved with its improved version framework d-OptCom. (d)OptCom optimizes in two levels, the inner level (single strains) with the parameter optimized in the outer level (community). However, FLYCOP optimizes consortia configuration at different levels at the same time, both at single strain level (i.e. uptakes of carbon sources or cross-feeding metabolites) and community level (i.e. initial ratio biomasses and after n hours, total biomass at different time points, metabolite concentration in the media by time slot). Besides, FLYCOP offers f lexibility in the parameters to configure (integer, real or even nominal ones). d-OptCom also uses dFBA to optimize fluxes. Regarding simplicity, FLYCOP does not configure any MILP problem with new reactions for new interactions among strains, nor to known the type of interactions, as (d)OptCom requires. Our approach only needs the individual models and the media composition, and the required interactions will emerge by themselves based on metabolism codified in models. The complementary 'Descriptive d-OptCom' is equivalent to include parameters about the media and initial biomasses in FLYCOP. Regarding experimental kinetics data requirement, one d-OptCom input is the uptake kinetics parameters independent per metabolite (or they assume direct transfer of metabolite between input and output through fluxes), while in FLYCOP it is optional, according to COMETS configuration (which allow define the same kinetics parameters for all metabolites). Thus, FLYCOP could work with less experimental data that d-OptCom, that requires uptake kinetics parameters independent per metabolite. Besides, we could represent the problem in order to find some optimal uptakes in FLYCOP. Finally, d-OptCom is defined as a 'comprehensive computational framework', not providing any software that allows reproducibility or new applications as FLYCOP does.

FLYCOP vs optimizing community parameters (OptCom
FLYCOP vs optimizing stability (SteadyCom (Chan et al. , 2017)). The main difference with FLYCOP is related to flexibility in objective and use-cases, focusing SteadyCom on consortia where the GR must be the same in all members in the community, to ensure co-existence. FLYCOP could optimize consortia with that objective of parallel GR (through defining a fitness function to minimize the difference in GR in exponential phase) among other multiple ones. SteadyCom joins all reactions in a common model, being more similar to lumped network than multi-compartment modeling approach. Apart from FBA, SteadyCom is compatible with FVA. Both ones provide with a similar contribution in the cross-cutting task of predicting composition on microbial consortia. Other similarity is the ability to predict changes in strain relative abundances in response to external perturbations (such as changes in media metabolites) although through distinct approximations: SteadyCom by adding constraints on the model and FLYCOP by including static or dynamic modifications in the media in COMETS configuration file, along the time (or in some specific time point) or running a new optimization with a new media composition or media update. FLYCOP intrinsically explores multiple combinations of model parameters, making them available for further analysis; while SteadyCom requires an external procedure when they want to analysis random consortium configurations. (MultiPlus (Julien-Laferrière et al.,2016)). In contrast to the previous approaches, MultiPlus does not optimize a set of parameters of the metabolic models or community properties as FLYCOP and the remaining ones do, but optimize the fragmentation of a pathway and distribution of the reactions among the species within the community, even selecting the subset of strains. This means it is constrained to be applied in only one use-case (distributing pathway), only one optimization goal (minimizing new reactions and minimizing exchanged metabolites), not flexible in both as FLYCOP. Pathway distribution approaches are usually stoichiometric-independent, simplifying GEM to a representation in an hyper-graph, with just reactions and metabolite relations. It means a drawback versus other approaches because it results in unchecked consortium configurations which could not growth together when it is constrained to stoichiometry, requiring post-processing steps to compute yield (proposing to use a posteriori FBA) and to get stability of the consortium or even including not defined cross-feeding relations; whilst FLYCOP reports multiple values for measuring the performance of the community (GR, increase in biomass, yield, etc.), without requiring additional post-processing, and additionally providing the optimized GEMs as an output, allowing their run together in a microbial community descriptive microbial community system such as COMETS, showing common grow and production of metabolites, and checking if some cross metabolite is danger to the other strains. It is similar to FLYCOP in terms of searching a solution among different configurations. The same as MultiPlus, FLYCOP also could optimize the production of endogenous metabolite, not only exogenous. Regarding to optimizing distribution of reactions, maybe it could be better to apply some of the existing regardless-stoichiometric graph-based methods (Eng and Borenstein, 2016;Julien-Laferrière et al.,2016) to reduce the huge number of possibilities, and between the best outputs, to apply FLYCOP taking the stoichiometry and the consortium dynamic (updates of biomass and metabolite concentration) into account to finally define the optimized distribution of tasks among the strains in the consortium.

LTEE case study: comparing FLYCOP with inverseFBA and evoFBA
Inverse FBA (Zhao et al., 2016) also was applied to this LTEE domain to determine the objective function. Their most relevant conclusion for FLYCOP is that there are infinite objective functions could give the observed fluxes as one of the optimal solutions of an FBA, and only maximizing growth rate is not included in those solutions. Many solutions tend towards strains with higher respiration flux (~= glucose specialist) and other solutions with lower respiration (~= acetate specialist). They did not infer a separate set of fluxes of both strains in the Ara-2 population, but only one set, assuming it is a monoclonal population, ignoring the stable polymorphism. In fact, the fluxes for Ara-2 are more similar to glucose specialist (with slight higher growth than ancestor) than acetate specialist. evoFBA (Großkopf et al., 2016) is an alternative work about LTEE, randomly generating different niches of specialized species along the time, assuming different phenotypes in the same strain, the same as we consider . evoFBA runs the evolution and finds a solution consortia at the end, resulting from intermediate mutations along the evolution. Whereas FLYCOP defines different consortia configuration a priori (it means to apply some mutations apriori) and checks whether its viability and stability along time, finding the best consortium configuration for a predefined goal. evoFBA uses a reduced E.coli metabolic model with 95 reactions, while we use the whole model with thousands of reactions, although the uptake mutations is only in one or a few, we do not remove the remaining model reactions because all are involved in the E.coli metabolism. evoFBA only runs isolated strain models, not simulating co-culture of the acetate-glucose specialists as FLYCOP does.
Both methods predict the cross-feeding among two E.coli specialists as a stable end-point after evolution. Both systems have two optimization levels: a) FBA and b) evolutionary time-scale (in FLYCOP with SMAC). At the evolutionary one, evoFBA constraints the overall input uptakes (carbon and oxygen) per organism, to simulate in a simplified way the trade off observed in in-vivo studies; however, FLYCOP allows flexible "mutations" in the model depending on the parameters you want to determine, given an optimization goal. evoFBA predicts a completely loss of taking acetate in L strains, although invivo data it is really a down-regulation, as FLYCOP allows. evoFBA could determine when the mutant strain appears, while FLYCOP pre-defines it exists, without knowing when it will appear during the evolution. FLYCOP conditions/mutations are only defined at the beginning of the consortia growing point, without changes in uptakes during time, as evoFBA allows. Their mutations are in each time steps along the evolution, not just as the beginning like we do.
A further results comparison is unfeasible, because evoFBA does not simulate co-culture of the acetate-specialist with glucose-specialist, but they only can run dFBA for isolated strains/models.