MESMER: minimal ensemble solutions to multiple experimental restraints

Motivation: Macromolecular structures and interactions are intrinsically heterogeneous, temporally adopting a range of configurations that can confound the analysis of data from bulk experiments. To obtain quantitative insights into heterogeneous systems, an ensemble-based approach can be employed, in which predicted data computed from a collection of models are compared to the observed experimental results. By simultaneously fitting orthogonal structural data (e.g. small-angle X-ray scattering, nuclear magnetic resonance residual dipolar couplings, double electron-electron resonance spectra), the range and population of accessible macromolecule structures can be probed. Results: We have developed MESMER, software that enables the user to identify ensembles that can recapitulate experimental data by refining thousands of component collections selected from an input pool of potential structures. The MESMER suite includes a powerful graphical user interface (GUI) to streamline usage of the command-line tools, calculate data from structure libraries and perform analyses of conformational and structural heterogeneity. To allow for incorporation of other data types, modular Python plugins enable users to compute and fit nearly any type of quantitative experimental data. Conformational heterogeneity in three macromolecular systems was analyzed with MESMER, demonstrating the utility of the streamlined, user-friendly software. Availability and implementation: https://code.google.com/p/mesmer/ Supplementary information: Supplementary data are available online.


Introduction
Changes in macromolecular structure over a range of timescales are intrinsic to many biological processes, and by definition produce structural heterogeneity within a defined timeframe (Henzler-Wildman and Kern, 2007). Such structural heterogeneity plays a role in functions as diverse as molecular recognition (Boehr et al., 2009; Lange et al., 2008), catalysis (Clare et al., 2008; Eisenmesser et al., 2005), transport (Rout and Aitchison, 2001), regulation (Tang et al., 2007) and allostery (McElroy et al., 2002). This heterogeneity presents unique challenges for structural characterization, as analytical tools that probe molecular structure involve measurements over defined time windows, thereby providing at the extremes either snapshots reflecting the uniqueness of the populations, or a weighted average of the existing configurations. A further complication is that many functionally important states are short-lived or sparsely populated, existing amidst a diverse background of other states and intermediates (Andersson et al., 2009; Henzler-Wildman et al., 2007; Krukenberg et al., 2008). Therefore, in order to understand the mechanistic details of biological processes, it is often necessary to appreciate the range of structural configurations adopted by the system.
One approach to describing these configurations and their abundance is to fit experimental structural data against the predicted values obtained from ensembles of possible structures. Ensemble-based techniques identify one or more pools of structures, typically generated by sampling potential conformations, that are able to recapitulate experimental data (Bernado et al., 2007; Bertini et al., 2010; Boura et al., 2011; Pelikan et al., 2009). This approach has been applied to a range of structural problems using one of several publicly available software packages, such as BILBOMD (Pelikan et al., 2009), EOM (Bernado et al., 2007) and BSS-SAXS (Yang et al., 2010), all of which were developed for analyzing small-angle X-ray scattering (SAXS) or neutron scattering data. Due to the intrinsic limitations in temporal and spatial resolution of structural methods, improved accuracy in modeling can often be obtained by simultaneous fitting of data obtained from orthogonal experimental techniques. Globally fitting such data plays an important role in increasing sensitivity, removing technique-specific bias and cross validation (Beechem, 1992; Marsh and Forman-Kay, 2012). Global-fit ensemble modeling has been implemented in a number of command line-driven software packages, including EROS (Rozycki et al., 2011) and SES (Berlin et al., 2013), and some with user-friendly interfaces such as flexible-meccano (Ozenne et al., 2012) and ENSEMBLE (Marsh and Forman-Kay, 2012), which are specifically designed for incorporation of nuclear magnetic resonance (NMR) restraints to characterize intrinsically disordered proteins.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
We describe MESMER, a software package for identifying and analyzing ensembles that fit bulk-average experimental data obtained from a range of experimental techniques such as SAXS, NMR (residual dipolar couplings, RDCs; pseudocontact shifts, PCS) and double electron-electron resonance (DEER) spectra. Because fitting is handled through modular plugins, MESMER can be readily extended to incorporate nearly any kind of experimental structural data, provided that a quantitative comparison can be made between the experimental and predicted data generated from components of an ensemble. The MESMER package also includes a graphical user interface (GUI) that streamlines the process of generating necessary input data, running calculations and interpreting results.

Methodology
MESMER facilitates the analysis of heterogeneous experimental data and interpretation by:
• Calculation of all predicted data from input structures without significant user intervention
• Simultaneous fitting of multiple data types and experimental conditions
• Selection of multiple ensemble solutions
• Visualization of structural attributes and relationships
MESMER takes as input a collection of Protein Data Bank (PDB) files from which predicted experimental data are computed and saved as 'component' files, one per candidate structure. These precomputed data are rapidly accessed from memory as necessary during fitting. The overall workflow is illustrated in Figure 1. MESMER simultaneously fits experimental data obtained from one or more experimental conditions by collecting the datasets into 'target' files that are to be recapitulated from the provided component structures. This enables users to investigate the effects of experimental variables, such as temperature or ligand concentration, on the prevalence of ensemble components. During fitting, the predicted observable of the ensemble $\langle Y_{pred} \rangle$ for each type of data is computed from the predicted observations $\langle Y_i \rangle$ of each component present in the ensemble using the coefficient $c_i$ to describe the relative contribution (i.e. concentration) of each component (1):
$$\langle Y_{pred} \rangle = \sum_{i=1}^{M} c_i \langle Y_i \rangle \qquad (1)$$
In practice, the number of independent degrees of freedom is increased by expanding the ensemble size M until additional components no longer result in a significant improvement to fit quality. An implicit assumption of the minimal ensemble method is that exchange between states that contribute to the observed signal occurs at a rate slower than the time resolution of the experimental measurement; otherwise, a single population-weighted average structure would generate acceptable fits to the experimental data.
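The population-weighted average described above can be sketched in a few lines of Python. This is only an illustration, not MESMER's own code: the function name `ensemble_average`, the list-of-lists data layout and the normalization of the weights to unit sum are all assumptions made for the example.

```python
def ensemble_average(component_profiles, weights):
    """Population-weighted average of per-component predicted data.

    component_profiles: list of M equal-length lists (one per component)
    weights:            list of M relative contributions c_i
    """
    if len(component_profiles) != len(weights):
        raise ValueError("one weight is required per component")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Assumption: contributions are normalized so that they sum to 1.
    fractions = [w / total for w in weights]
    n_points = len(component_profiles[0])
    return [
        sum(f * profile[j] for f, profile in zip(fractions, component_profiles))
        for j in range(n_points)
    ]


# Two hypothetical components contributing equally.
profiles = [[10.0, 4.0, 1.0], [6.0, 2.0, 1.0]]
y_pred = ensemble_average(profiles, [0.5, 0.5])
```

Because the per-component data are precomputed, evaluating a candidate ensemble reduces to this inexpensive weighted sum, which is what makes scoring many ensembles per generation tractable.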
Fig. 1. From each of a starting pool of Z structural candidates, the predicted counterpart data for available experiments is calculated and compiled into component files as input. A total of N ensembles, each containing M components are then assembled, initially randomly, from the pool of available components. The ability of each ensemble to recapitulate the available experimental data dictates its relative fitness for the cyclic genetic algorithm, ensuring components that contribute to accurate fits persist for future iterations (Color version of this figure is available at Bioinformatics online.)
E.C.Ihms and M.P.Foster
If more than one target (i.e. collection of datasets corresponding to a unique set of conditions) is provided, these are fit using the same ensembles, but the relative concentrations of components in each ensemble are optimized separately. During development and testing, we found that a Newtonian conjugate-gradient type optimization (Nash, 1984) of $c_i$ rapidly converges to acceptable solutions for small ensembles (M < 10), while stochastic optimizers (Gentle et al., 2004) provide rapid yet acceptable results for larger ensembles. These and several additional minimization algorithms are available in MESMER. The overall score of each ensemble, $S_{total}$, is obtained by summing the goodness-of-fit metrics that quantify the agreement between the predicted and experimental data acquired at T different experimental conditions for each of the K data types (2):
$$S_{total} = \sum_{t=1}^{T} \sum_{k=1}^{K} w_k \, S_{k,t} \qquad (2)$$
where $S_{k,t}$ is the goodness-of-fit metric (for example, a reduced $\chi^2$) for data type k under condition t. The weighting factor $w_k$ is used to avoid overemphasis of one data type at the expense of the others, except for cross-validation purposes. Because various structural techniques have different intrinsic resolving power, weights are best assigned according to a priori knowledge of the precision of the measurement or rigorous cross validation.
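As a minimal sketch of the weighted scoring described above, the snippet below sums per-restraint goodness-of-fit values over conditions and data types. The function name `total_score`, the dictionary layout and the numerical values are hypothetical and chosen only for illustration; they are not taken from MESMER.

```python
def total_score(fit_scores, weights):
    """Sum weighted goodness-of-fit values over conditions and data types.

    fit_scores: list with one dict per experimental condition,
                mapping data-type name -> goodness-of-fit value
    weights:    dict mapping data-type name -> weighting factor w_k
    """
    return sum(
        weights[data_type] * score
        for condition in fit_scores
        for data_type, score in condition.items()
    )


# Two hypothetical conditions, each with a SAXS and a DEER restraint.
scores = [{"SAXS": 1.2, "DEER": 0.8}, {"SAXS": 1.0, "DEER": 1.1}]
weights = {"SAXS": 1.0, "DEER": 2.0}
s_total = total_score(scores, weights)
```

Setting a weight to zero leaves the corresponding restraint out of the score entirely, which is one simple way to hold a data type back for cross validation.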
In practice, we have found it valuable to initially fit each dataset individually, thereby determining the quality of fits to the unweighted experimental data, as well as to inspect the resulting features of the selected ensemble components.

Algorithms
MESMER implements a genetic algorithm in order to iteratively refine ensembles via their predicted agreement with experimental data. Before fitting begins, MESMER generates N ensembles of M components drawn at random from the available pool of components (Fig. 1). Selecting an appropriate total number of ensembles (N) is important, as a small ensemble population may converge too quickly, while an excessively large population will demand excessive computer resources. In practice, this parameter is adjusted according to the population heterogeneity upon convergence, which is described later. Once the component concentrations of each ensemble in the 'parent' generation have been optimized and the resulting fitness score for each ensemble has been obtained, the pool is duplicated to form a 'child' generation. The child generation is diversified through two mechanisms: crossing, in which a percentage of an ensemble's components are exchanged with the components of another ensemble, and mutation, in which a single component of an ensemble is replaced with a different component, either from components existing in the ensemble pool, or from the pool of unused components. This process is illustrated in Supplementary Figure S1. The frequencies of mutation and crossing are handled stochastically, with a user-specifiable rate for each process. After crossing and mutation are complete, the component concentrations for each of the child ensembles are optimized to minimize their respective scores, and the N best-fitting ensembles from the 2N combined parent and child pools are used as the parent population for the next generation (Supplementary Fig. S1).
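The generation cycle described above (duplicate, cross, mutate, keep the N best of 2N) can be illustrated with a toy example. This is a simplified sketch under stated assumptions and not MESMER's implementation: components are plain integers, the `score` function is a hypothetical stand-in for a fit to experimental data, crossing swaps a single slot rather than a percentage of components, mutation draws only from the full pool, and the concentration optimization step is omitted (all components are weighted equally).

```python
import random


def evolve(parents, pool, score, rng, p_cross=0.5, p_mutate=0.5):
    """One generation: copy parents, diversify the children, keep the N best."""
    n = len(parents)
    children = [list(e) for e in parents]        # duplicate the parent pool
    for child in children:
        if rng.random() < p_cross:               # crossing: swap one slot with another child
            other = rng.choice(children)
            i = rng.randrange(len(child))
            child[i], other[i] = other[i], child[i]
        if rng.random() < p_mutate:              # mutation: replace one component from the pool
            child[rng.randrange(len(child))] = rng.choice(pool)
    combined = parents + children                # 2N candidate ensembles
    return sorted(combined, key=score)[:n]       # lower score means better fit


rng = random.Random(0)
pool = list(range(100))                          # hypothetical component values
target = 42.0                                    # hypothetical "experimental" observable


def score(ensemble):
    # Toy fitness: distance between the equal-weight ensemble mean and the target.
    return abs(sum(ensemble) / len(ensemble) - target)


population = [rng.sample(pool, 3) for _ in range(20)]   # N = 20 ensembles, M = 3
best_start = min(score(e) for e in population)
for _ in range(50):
    population = evolve(population, pool, score, rng)
best_end = min(score(e) for e in population)
```

Because the parents compete with their own children, the best score in the population can never get worse from one generation to the next; the same selection pressure is also what gradually reduces the diversity of the pool.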

Convergence
For an oversampled and structurally diverse pool of starting components, there may be hundreds or even thousands of component combinations that provide fits of comparable quality to the experimental data. Rather than allowing the genetic algorithm in MESMER to drive to a single solution that is not justified by the limited precision of the input data, a pool of comparable ensembles, each containing M components, is retained by stopping the algorithm when either the poorest-fitting ensemble has a total score $S_{total}$ within the certainty in the experimental data (e.g. if a priori knowledge is available about what value constitutes an accurate fit), or when the relative standard deviation (RSD) of scores for the ensembles in the pool is below a specified value. We have found a 1% RSD to provide ensembles with practically indistinguishable fits to the experimental data. In fact, the number of unique solutions may provide valuable information about the sampling of conformation space, as poor diversity in the final ensemble pool can be diagnostic for poor diversity in the starting structural library. In practice, oversampling of conformation space may be difficult, especially in the case of molecules with multiple flexible domains or large unstructured loops; thus, the diversity of the input structural pool should be assessed before interpreting the results of any ensemble refinement.
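The two stopping criteria above can be expressed as a small check on the list of ensemble scores. The sketch below is illustrative only: the function name `has_converged` and its parameters are hypothetical, the RSD is taken here as the standard deviation divided by the mean, and the use of the population (rather than sample) standard deviation is an assumption.

```python
import statistics


def has_converged(scores, rsd_cutoff=0.01, score_cutoff=None):
    """Return True when the ensemble pool satisfies either stopping criterion."""
    # Criterion 1: the poorest-fitting ensemble is already within a known acceptable score.
    if score_cutoff is not None and max(scores) <= score_cutoff:
        return True
    mean = statistics.mean(scores)
    if mean == 0:
        return True  # non-negative scores with zero mean are all zero
    # Criterion 2: the spread of scores, relative to their mean, is below the cutoff.
    rsd = statistics.pstdev(scores) / abs(mean)
    return rsd < rsd_cutoff


tight_pool = [1.000, 1.005, 0.995]
loose_pool = [1.0, 1.5, 2.0]
```

With the default 1% cutoff, `tight_pool` counts as converged and `loose_pool` does not, unless a score cutoff of at least 2.0 is supplied.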
Although the genetic algorithm can rapidly identify solutions from a large number of possibilities, it can also lead to an underestimation of structural diversity, as the prevalence of ensembles that fit the data well, and the components that comprise those ensembles, will become greater as the algorithm progresses. This may be addressed by either increasing the total number of ensembles being improved by the algorithm, or by executing MESMER repeatedly and comparing the most prevalent components upon convergence.

Analysis
The MESMER GUI includes tools to analyze the input and resulting ensembles, by generating plots of ensemble scores, component correlations, shared attributes and fits to experimental data (Supplementary Fig. S2). Optionally, MESMER can also perform bootstrapped error analysis to estimate the confidence intervals of the best-fitting ensemble component concentrations; this is typically performed for only the best-fit ensemble, due to its computational expense (Efron and Tibshirani, 1986). Components can be selected from the original coordinate library and saved to a multi-model PDB along with attribute lists for UCSF Chimera (Pettersen et al., 2004) or scripts for PyMOL (http://www.pymol.org/) that provide options for visualizing average component concentrations and their prevalence in the ensemble pool. More advanced options for these analytical tools and others are provided by a collection of command-line utilities.
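To illustrate the idea behind bootstrapped error analysis of component concentrations, the following sketch estimates a percentile confidence interval for the fraction of one component in a two-component ensemble. It is a generic nonparametric bootstrap over data points, written under assumptions: the closed-form least-squares fit, the function names and the synthetic data are hypothetical, and MESMER's own resampling scheme may differ.

```python
import random


def fit_fraction(y_exp, comp_a, comp_b):
    """Least-squares fraction c of component A in y = c*A + (1 - c)*B, clipped to [0, 1]."""
    num = sum((y - b) * (a - b) for y, a, b in zip(y_exp, comp_a, comp_b))
    den = sum((a - b) ** 2 for a, b in zip(comp_a, comp_b))
    return min(1.0, max(0.0, num / den))


def bootstrap_interval(y_exp, comp_a, comp_b, n_boot=1000, alpha=0.05, seed=0):
    """Percentile confidence interval for c from resampled data points."""
    rng = random.Random(seed)
    n = len(y_exp)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # resample points with replacement
        if all(comp_a[i] == comp_b[i] for i in idx):
            continue                                  # resample carries no information on c
        estimates.append(fit_fraction([y_exp[i] for i in idx],
                                      [comp_a[i] for i in idx],
                                      [comp_b[i] for i in idx]))
    estimates.sort()
    lower = estimates[int(alpha / 2 * len(estimates))]
    upper = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lower, upper


# Synthetic data: 30% component A, 70% component B, plus small fixed perturbations.
comp_a = [10.0, 8.0, 6.0, 4.0, 3.0, 2.0]
comp_b = [7.0, 6.5, 5.0, 3.0, 2.0, 1.5]
noise = [0.05, -0.04, 0.03, -0.02, 0.01, -0.03]
clean = [0.3 * a + 0.7 * b for a, b in zip(comp_a, comp_b)]
y_exp = [y + e for y, e in zip(clean, noise)]
best = fit_fraction(y_exp, comp_a, comp_b)
low, high = bootstrap_interval(y_exp, comp_a, comp_b)
```

Each bootstrap replicate requires a full refit of the concentrations, which is consistent with the analysis being restricted to the best-fit ensemble in practice.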

Results
To test and validate MESMER, we analyzed a range of experimental data types obtained from three biological systems selected for their known or suspected structural heterogeneity. For each system, MESMER analysis indicated that ensemble representation is appropriate, identifying ensembles with greatly improved agreement to the experimental data over that of any single structure or configuration.

ESCRT-I complex
Endosomal Sorting Complexes Required for Transport-I (ESCRT-I) is a multiprotein complex required for sorting and trafficking of cell receptors and other proteins (McDonald and Martin-Serrano, 2009). The binding domains of the various proteins comprising this complex are connected to the central 'stalk' and 'head-piece' of the complex core by flexible linkers ranging from 13 to 60 residues in length (Fig. 2A). Because these linkers permit structural variability that could allow for concerted opening or closing of the complex by the ubiquitin E2 variant (UEV), C-terminal (CTD) or N-terminal predicted helix (NTH) domains, ESCRT-I is an attractive test system for ensemble-based analysis of macromolecules that are conformationally restricted, but structurally variable. Previously (Boura et al., 2011), SAXS and double electron-electron resonance (DEER) experiments between spin-probe pairs were used to describe the variable structure of this complex, and in this work formed the basis for testing the utility of MESMER.
First, we generated 50 000 structural models (conformers) of the ESCRT-I complex using Xplor-NIH (Schwieters et al., 2003) by treating the UEV, CTD, NTH and core domains as rigid bodies, while the linkers and unstructured C-terminal residues of the Mvb12 protein in the complex core were subjected to Monte Carlo randomization of backbone torsion angles (Supplementary Fig. S4B). Linkers were defined as previously described (Boura et al., 2011): residues 159-217 of VPS23, residues 118-147 of VPS28 and residues 33-46 of VPS37. Geometric violations and steric clashes were removed through a brief energy minimization. SAXS data for the structural models were calculated with FoXS (Schneidman-Duhovny et al., 2013), which, in comparison with profiles calculated by CRYSOL, provided more accurate fits to the experimental data.
Because the MTSL spin labels used to collect the DEER data experience a significant amount of motion about their tethers (Galiano et al., 2009), the program MMM (Polyhach et al., 2011) was used to sample conformation space around the tether for each labeled domain. The resulting spin-probe labeled domains were then superimposed onto the matching unlabeled domains for each of the Monte Carlo models, resulting in structures of the complex decorated with possible configurations of the spin probe (Fig. 2A). The predicted DEER spectrum for each label-label pair was then calculated as previously described (Boura et al., 2011). The modulation depth λ was optimized for DEER data and a consistent scaling factor was applied to the SAXS $Y_{pred}$ to minimize the reduced $\chi^2$ score for each ensemble to their respective experimental observations (3). Relative weighting between the SAXS and each of the three DEER restraints was obtained by fitting each restraint separately, and assigning a weight to the restraint that would provide a final best-fit fitness score of 1.0 upon convergence (Supplementary Table S2). When each set of DEER data was fit independently, ensembles with two components contributing in roughly equal proportions were necessary in order to fit each set of experimental data. This result is consistent with the bimodal distribution of interlabel distances previously observed for each pair, with distributions centered at 25.9 and 48.4 Å for VPS23, 21.9 and 40.7 Å for VPS28, and 24.7 and 42.7 Å for VPS37 (Boura et al., 2011) (Supplementary Fig. S4C, Table S1). However, when the three sets of DEER data were fit simultaneously, two-component ensembles could no longer accurately recapitulate the experimental data, requiring that the ensemble size be increased to six, the same number reported by Boura et al. (2011).
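As an illustration of scoring a scaled prediction against experimental data, the sketch below computes a reduced chi-square after applying the least-squares optimal scaling factor to the predicted profile. The specific form used here, including normalization by N - 1 and the analytic scale factor, is a common convention and an assumption for this example, not necessarily the exact expression used by MESMER; the function name is hypothetical.

```python
def scaled_reduced_chi2(y_exp, y_pred, sigma):
    """Reduced chi-square after scaling y_pred by its least-squares optimal factor.

    y_exp, y_pred, sigma: equal-length lists of experimental values,
    predicted values and experimental uncertainties.
    """
    # Error-weighted least-squares scale factor that best maps y_pred onto y_exp.
    num = sum(e * p / s ** 2 for e, p, s in zip(y_exp, y_pred, sigma))
    den = sum(p * p / s ** 2 for p, s in zip(y_pred, sigma))
    scale = num / den
    chi2 = sum(((e - scale * p) / s) ** 2 for e, p, s in zip(y_exp, y_pred, sigma))
    # Assumption: one fitted parameter (the scale), hence N - 1 degrees of freedom.
    return chi2 / (len(y_exp) - 1), scale


# A prediction that differs from the data only by a factor of two fits perfectly.
chi2, scale = scaled_reduced_chi2([2.0, 4.0, 6.0], [1.0, 2.0, 3.0], [0.1, 0.1, 0.1])
```

A reduced chi-square close to 1 indicates agreement within the experimental uncertainties, which is why weights that bring each restraint's best-fit score to 1.0 put the data types on a comparable footing.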
This result implied that our library was deficient in pairs of models in which all four sets of constraints could be simultaneously satisfied. This could in principle reflect the requisite diversity of the structures or, alternatively, inadequate sampling of the variable domain positions. Indeed, inspection of the conformational properties of the randomly generated library revealed that the majority of conformations in the library possessed interprobe distances far from those consistent with the experimental DEER data (Supplementary Fig. S4B,C).
To better sample conformations with the potential to satisfy the experimental data, a second library of 10 000 ESCRT-I structures was generated by Monte Carlo randomization of linker torsions, but with the added condition that VPS23, VPS28 and VPS37 inter-label distances were independently restrained to those observed in the individual DEER fits (Supplementary Table S1, Fig. S4D). Using this enhanced library, a converged fit (total fitness score RSD <1%) identified 19 unique two-component ensembles, which exhibited satisfactory global fits to the combined DEER and SAXS data (Fig. 2B-E, Supplementary Table S2). A modest deviation of the ensemble fit to the SAXS data at high scattering angles may be due to incomplete modeling of solvation around unstructured regions (Schneidman-Duhovny et al., 2013) or because of the partial structuring of one or more linkers; ensembles of three or more components resulted in minor improvement to SAXS fit quality (Supplementary Fig. S5), but did not improve fits to the DEER data or change conclusions about positioning of the variable domains. The structural attributes of model components present in the selected ensembles were consistent with previous findings (Boura et al., 2011) that featured the UEV domain of VPS23 close to the core, despite its 59-residue long linker (Fig. 2F; Supplementary Figs S3, S4E). The configuration of the VPS28 CTD domain displayed a slightly greater variability, but it was typically localized around the core's headpiece domain. Such variability implies a larger extent of motion for this domain, possibly reflecting equilibrium between an open and a closed state. We found a preponderance of structures with the NTH domain of VPS37 adopting two roughly collinear poses, which are typically oriented away from the VPS23 UEV.
While the hypothesis of cooperative or concerted positioning of the ESCRT-I domains is consistent with the finding that the experimental data can be fit with minimal ensembles containing only two, equally contributing poses of each domain, the data cannot rule out a more complex model. Moreover, the variability in the orientations of the VPS23 and VPS28 flexible domains with respect to the complex's core indicates that the SAXS data lack the precision required to unambiguously position these domains (Fig. 2F; Supplementary Figs S3, S4E). These findings demonstrate the utility of MESMER for diagnosing and revealing structural features from highly variable systems restrained by a collection of experimental data. The iterative approach described here also demonstrates how the selective fitting of individual restraints in MESMER can be used to rationally design component libraries, enhancing sampling of reasonable conformations.

Calmodulin
The calmodulin calcium-binding family of signal messenger proteins consists of two rigid domains separated by a flexible linker of seven amino acids, which in some crystallographic studies has been shown to form a long, uninterrupted alpha helix (Chattopadhyaya et al., 1992; Grabarek, 2005). NMR experiments have indicated that this linker is largely unstructured in solution (Zhang et al., 1995), permitting significant reorientation of the two domains with respect to one another (Anthis et al., 2011; Bertini et al., 2004) (Fig. 3A). This dynamic behavior, and the potential for populating one or more favored conformational states, makes calmodulin an attractive candidate for ensemble-based analysis of conformational heterogeneity. The N60D mutant of calmodulin has been shown to preferentially bind lanthanide metal ions in the N-terminal domain, allowing the collection of paramagnetic pseudocontact shifts (PCS) and RDCs by NMR (Bertini et al., 2004). These observations provide a bulk-average measure of the relative orientation between the two domains. In addition to paramagnetic restraints, SAXS data for the protein are also available (Bertini et al., 2010).
The program RanCh (Bernado et al., 2007), which treats two domains as rigid bodies and randomizes the connecting linker backbone phi and psi angles, was used to generate a library of 50 000 model structures from the atomic coordinates of the N- and C-terminal domains in the Maximum Occurrence example dataset (Bertini et al., 2010); we used RanCh in this case instead of Xplor-NIH because of lower computational cost and enhanced sampling. As CRYSOL resulted in better fits to the SAXS data than FoXS for calmodulin, it was used to calculate the predicted SAXS profiles for each structure, with 128 points per simulated profile to a maximum scattering vector magnitude (4π sin(θ)/λ) of 0.35 Å⁻¹. Solvent electron density was set to 0.334 e/Å³, and a hydration shell contrast of 0.03 e/Å³ was used. The magnetic susceptibility tensor orientation and anisotropies for terbium and thulium were obtained by fitting the N-terminal domain PCS values using the program Numbat (Schmitz et al., 2008) and matched previously published values (Bertini et al., 2004). These tensors were subsequently used to calculate predicted PCS and RDCs for the model structures using a MESMER plugin based on the PyParaTools Python library (M. Stanton-Cook, X.-C. Su, G. Otting, T. Huber, http://comp-bio.anu.edu.au/mscook/PPT/). The agreement score between the predicted PCS and RDC values and the experimental results was obtained from the quality score Q, as previously defined (Cornilescu et al., 1998) (4):
$$Q = \sqrt{\frac{\sum_i \left(Y_{exp,i} - Y_{pred,i}\right)^2}{\sum_i Y_{exp,i}^2}} \qquad (4)$$
Relative weighting of the SAXS, PCS and RDC restraints was determined in the same manner as described for the ESCRT-I system restraint types (Supplementary Table S3). As RDC values are sensitive to small structural perturbations (Zweckstetter and Bax, 2002), initial fits were made with only the SAXS and PCS data (Fig. 3A-C). An ensemble size of six components was found to provide accurate fits, with no significant improvement in fit quality obtained using larger ensembles.
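The quality score Q of Cornilescu et al. (1998) is the root-mean-square deviation between observed and calculated values divided by the root-mean-square of the observed values. A minimal Python sketch follows; the function name `q_factor` and the example values are illustrative and not part of MESMER or PyParaTools.

```python
import math


def q_factor(observed, calculated):
    """Quality factor Q = rms(observed - calculated) / rms(observed)."""
    num = sum((o - c) ** 2 for o, c in zip(observed, calculated))
    den = sum(o ** 2 for o in observed)
    # The 1/N factors of the two rms terms cancel, leaving a ratio of sums.
    return math.sqrt(num / den)


perfect = q_factor([1.0, -2.0, 3.0], [1.0, -2.0, 3.0])   # identical values
no_signal = q_factor([3.0, 4.0], [0.0, 0.0])             # prediction of all zeros
```

Q is dimensionless: it is 0 for perfect agreement and 1 when the prediction is uniformly zero, which makes it convenient for comparing restraints such as PCS and RDCs that are measured in different units.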
Components present in ensembles obtained from an RSD <1% pool of solutions exhibited C-terminal domains adopting a wide variety of poses with respect to the N-terminal domain, consistent with previous reports (Fig. 3D and E), while certain configurations with otherwise sterically permitted geometries (gray dots in Fig. 3D) were not observed in the final pool of selected ensembles (blue dots in Fig. 3D), although they were present in the component pool and starting ensembles. Because of the single, significantly shorter linker, and the fairly uniform coverage of interdomain distances and alignments (Fig. 3D), undersampling was judged to be significantly less problematic than with the larger ESCRT-I complex.
However, when RDC data were included as a restraint, the ensemble size M necessary to generate reasonable fits increased to at least eight components (Supplementary Fig. S9), and significantly narrowed the diversity of structures that populate converged ensembles (Fig. 3D, red dots), typically resulting in a single unique ensemble of eight structures (Fig. 3F; Supplementary Fig. S10). As multiple independent MESMER runs selected different ensembles upon convergence (yet all with similar scores), this was judged to be a side effect of the number of ensembles used during the fitting, which is limited by computer memory, and not inadequate sampling of structural variability in the starting structure library. This is consistent with the increased resolution of RDC restraints compared to PCS and SAXS data, and is supported by cross-validation plots of PCS and RDC restraints indicating that structures obtained by fitting RDC data alone can be used to accurately recapitulate PCS data, but not vice versa (Supplementary Fig. S11A and B). Overall, these results are consistent with the present understanding of the structural plasticity of calmodulin, including the presence of favored configurations and the absence of disfavored ones (Anthis et al., 2011; Barbato et al., 1992; Bertini et al., 2004). Such observations confirm the utility of MESMER in the ensemble analysis of protein structural variability.

TRAP and AT hetero-oligomerization
The homo-oligomeric TRAP and Anti-TRAP (AT) proteins participate in a polydentate condensation process that leads to the formation of heterogeneous chains or clusters of the components (Ihms et al., 2014). Thermodynamic and structural analysis of the chaining process indicates that multiple AT trimers (AT₃) bind to sites on the surface of the ring-shaped TRAP undecamer (TRAP₁₁). Thus, at limiting AT₃ ratios, clusters are formed as increasing numbers of TRAP₁₁ rings bind to AT₃. However, as the AT₃:TRAP₁₁ ratio increases, additional AT₃ trimers bind to TRAP₁₁, resulting in smaller, AT-saturated clusters. This condition-dependent modulation of hetero-complex abundance offers a unique opportunity to examine the utility of MESMER to detect the configuration and relative abundance of species present in a heterogeneous solution.
As previously described (Ihms et al., 2014), bead models of AT₃ + TRAP₁₁ clusters were assembled from models of the components obtained from ab initio shape reconstruction. The SAXS profiles for the models were calculated by CRYSOL with 256 points per simulated profile to a maximum scattering vector magnitude (4π sin(θ)/λ) of 0.32 Å⁻¹. Solvent electron density was set to 0.334 e/Å³, and a hydration shell contrast of 0.03 e/Å³ was used. Nine separate SAXS curves from samples at three AT₃:TRAP₁₁ ratios (0.5, 1.0 and 1.5) and three TRAP concentrations (3, 5 and 10 mg/ml) were obtained at pH 8 (Ihms et al., 2014). Under basic conditions, complexes exist in rapid equilibrium with each other and the free components, resulting in an average complex size smaller than those present at neutral pH (E. Ihms, unpublished results). The SAXS curves for each condition were simultaneously constrained with an additional, equally weighted stoichiometry restraint $S_{ratio}$ to ensure agreement between the AT₃:TRAP₁₁ ratios in the ensembles and in the experimental sample (5), where the ratio of each component ($R_i$) is calculated from either the concentration of each protein present in the sample ($C_i$), or the number of TRAP or Anti-TRAPs present in the ensemble ($n_i$) (6). Ensembles with four components were sufficient to fit each of the SAXS datasets with an average chi-square <1.5 (Fig. 4A-C, Supplementary Fig. S12; Table S4), while increasing the ensemble size beyond six components provided little additional reduction in chi-square (Supplementary Fig. S12). A preponderance of extended configurations, with multiple TRAP₁₁ rings stitched together by AT₃ (Fig. 4D), was required to fit the data. The only component present in all of the selected ensembles was free TRAP₁₁, although its contribution to the ensemble did not exceed 5% at the lowest concentration of TRAP and AT (Fig. 4E).
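To make the idea of a stoichiometry restraint concrete, the sketch below computes the AT₃:TRAP₁₁ ratio implied by an ensemble from the composition and relative concentration of each component, and compares it with the ratio in the sample. The concentration-weighted counting and the simple difference used for comparison are assumptions made for illustration; they are not the published expressions for the restraint, and the component compositions shown are hypothetical.

```python
def ensemble_ratio(components, concentrations):
    """AT3:TRAP11 ratio implied by an ensemble.

    components:     list of (n_AT3, n_TRAP11) tuples, one per component
    concentrations: relative concentration of each component
    """
    at = sum(c * n_at for c, (n_at, _) in zip(concentrations, components))
    trap = sum(c * n_trap for c, (_, n_trap) in zip(concentrations, components))
    return at / trap


# Hypothetical four-component ensemble: free TRAP11, AT3-2TRAP11,
# 1AT3-1TRAP11 and a complex with two AT3 on one TRAP11.
components = [(0, 1), (1, 2), (1, 1), (2, 1)]
concentrations = [0.25, 0.25, 0.25, 0.25]
r_ensemble = ensemble_ratio(components, concentrations)
r_sample = 1.0                                  # AT3:TRAP11 ratio in the hypothetical sample
deviation = r_ensemble - r_sample               # a restraint would penalize this mismatch
```

Without such a restraint, an ensemble could reproduce a scattering curve with a mixture of species whose overall composition is incompatible with the amounts of protein actually present.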
Examination of the ensemble component contributions as a function of the AT₃:TRAP₁₁ ratio reveals several interesting trends. The free TRAP₁₁ component did not significantly contribute in ensemble fits to either the 1 or 1.5 AT₃:TRAP₁₁ data, and the contribution of the AT₃-2TRAP₁₁ component also decreases dramatically in fits to data obtained at higher ratios. The 1AT₃-1TRAP₁₁ component only significantly contributes in fits to data with ≥1 AT₃:TRAP₁₁ ratios, suggesting that multiple TRAPs cooperatively bind AT, but not necessarily vice versa (Fig. 4E).
The relatively small concentration range tested has little apparent effect on the relative contributions of the components, consistent with the fact that the affinity of TRAP₁₁ for AT₃ is well above the concentrations tested. However, the contribution of the most AT-saturated complex is directly proportional to both the AT₃:TRAP₁₁ ratio and increasing component concentration, while the contribution of the less saturated AT₃-TRAP₁₁ complexes concomitantly decreases (Fig. 4E).
These results demonstrate the utility of MESMER for describing the structure and population of species present in heterogeneous solutions, and how these properties vary under experimental conditions. This utility may find value in investigating how ligands or other variables modulate conformational or oligomerization equilibria.

Discussion
Obtaining meaningful results from ensemble-based approaches is contingent upon various factors. These include the discriminating power of the input data, the proper sampling of conformational space and the selection of an appropriate ensemble size. If the ensemble size is too small, insufficient structural variability will typically result in poor fits, while overly large ensembles run the risk of over-fitting the experimental data and potentially including unreasonable components. Indeed, the ensemble size embodies the difference between 'minimal ensemble' and other ensemble-based techniques. Some methods that employ larger ensembles and thus have greater computational requirements, such as EROS (Rozycki et al., 2011), typically avoid over-fitting by using statistical methods, such as maximum entropy, which restricts excessive optimization of ensemble component concentrations/contributions. However, even these approaches are subject to significant pitfalls, most notably that insufficient or inaccurate sampling of conformational space may result in unnecessarily large ensembles. At the other extreme, it is possible that an artificially small set of component structures may provide acceptable fits to experimental data while insufficiently describing the system's true heterogeneity. Both concerns may be partially addressed by cross validation with different types of experimental data, and examination of the variability present in multiple converged solutions.
MESMER facilitates the process of finding structural ensembles that can recapitulate data obtained from heterogeneous solutions. We have validated MESMER by investigating a variety of case studies previously shown to exhibit structural heterogeneity. Despite this heterogeneity, the averaged experimental data available from these systems were sufficient to report on specific structural characteristics, providing new insights into the systems' function. MESMER introduces several innovations that may find utility in similar ensemble-based methods, streamlining much of the necessary workflow while retaining flexibility. MESMER's GUI is intended to enable a wide user base to readily and intuitively investigate complicated structural heterogeneity, while its extensible and modular nature enables advanced users to incorporate new experiments and analytical techniques.