For comprehensive understanding of precise morphological changes resulting from loss-of-function mutagenesis, a large collection of 1 899 247 cell images was assembled from 91 271 micrographs of 4782 budding yeast disruptants of non-lethal genes. All the cell images were processed computationally to measure ∼500 morphological parameters in individual mutants. We have recently made this morphological quantitative data available to the public through the Saccharomyces cerevisiae Morphological Database (SCMD). Inspecting the significance of morphological discrepancies between the wild type and the mutants is expected to provide clues to uncover genes that are relevant to the biological processes producing a particular morphology. To facilitate such intensive data mining, a suite of new software tools for visualizing parameter value distributions was developed to present mutants with significant changes in easily understandable forms. In addition, for a given group of mutants associated with a particular function, the system automatically identifies a combination of multiple morphological parameters that discriminates a mutant group from others significantly, thereby characterizing the function effectively. These data mining functions are available through the World Wide Web at http://scmd.gi.k.u-tokyo.ac.jp/ .
To study the global regulation of cell morphological characteristics, a number of groups have recently reported genome-wide screening data for yeast mutants with abnormal morphology (1–5). Despite the relatively simple ellipsoidal shape of yeast cells, cell morphology researchers have in the past processed information on cells manually. These time-consuming, entirely subjective tasks motivated us to develop image-processing software called CalMorph (6), which automatically extracts yeast cells from micrographs and processes them to measure morphological characteristics such as cell size, roundness, bud neck position angle, nuclear position and actin localization. Using our software, we have retrieved 1 899 247 cells from 91 271 micrographs of 4782 mutants, which cover almost all of the yeast non-essential gene deletion strains available from EUROSCARF. All cell images, micrographs and quantitative values of morphological parameters are freely available from the SCMD database (7), which presents information that is complementary to the existing sequence and gene-expression databases (8–12).
CELL IMAGE PROCESSING
Our software processes micrographs of cells stained with fluorescein isothiocyanate–Concanavalin A (FITC-ConA) to identify the cell wall, with DAPI to localize nuclei and with rhodamine–phalloidin (Rh-ph) to visualize the actin distribution. Figure 1A shows three images, one for each dye. Figure 1B presents the result of superimposing the three images of the cell wall, nuclei and actin for individual cells.
Figure 1C displays image-processing results. Our image-processing software first identifies the cell wall, attempts to fit an ellipse to each mother cell or bud, and colors the cell wall green. The yellow lines show the long and short axes of the fitted ellipses. Bud necks, which separate mother cells from buds, are marked by two red bullets. Once the cell wall is identified, the positions of nuclei and actin patches can be determined relative to it. In Figure 1C, nuclei and actin patches are represented by yellow and light blue bullets, respectively.
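The ellipse-fitting step can be illustrated with a minimal sketch. It is not CalMorph's actual procedure, which the text does not specify; it assumes the cell outline is given as a list of (x, y) points sampled evenly around an ellipse-like contour, and it estimates the axes from the second moments of those points. The function name `fit_ellipse_axes` is hypothetical.

```python
import math

def fit_ellipse_axes(points):
    """Estimate semi-axis lengths and orientation of an ellipse-like outline.

    Assumption: `points` sample the outline evenly in the ellipse's
    parametric angle, so the variance along each principal axis equals
    (semi-axis length)^2 / 2.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Second central moments (covariance matrix entries) of the outline.
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Closed-form eigenvalues of the symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    long_axis = math.sqrt(2 * (half_trace + root))
    short_axis = math.sqrt(2 * max(half_trace - root, 0.0))
    # Orientation of the long axis, in radians, relative to the x axis.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return long_axis, short_axis, angle
```

The ratio of the short to the long axis gives one simple roundness-like measure; real outlines are noisy and unevenly sampled, so a production fit would need a more robust method.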
Figure 1D shows the primary morphological parameters of cells. The values of these parameters vary slightly from cell to cell. To perform rigorous statistical analysis of the significance of morphological changes, we need to know the distribution of each parameter over individual cells, which requires collecting an ample number of image-processed cells and their parameter values. More than 200 image-processed cells were therefore collected for each mutant, using as many micrographs as necessary. Then, ∼500 morphological parameters were calculated for each mutant.
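As a rough illustration of how per-cell measurements might be condensed into per-mutant values, the sketch below computes the cell count, mean and standard deviation of each parameter and refuses to summarize a mutant with too few cells. The function name `summarize_mutant`, the input layout and the default threshold of 200 cells (taken from the text above) are assumptions for illustration, not the SCMD pipeline itself.

```python
from statistics import mean, stdev

def summarize_mutant(cell_values, min_cells=200):
    """Summarize per-cell parameter values for one mutant.

    cell_values: {parameter name: [one value per image-processed cell]}
    Returns {parameter name: {"n": count, "mean": ..., "sd": ...}}.
    """
    summary = {}
    for param, values in cell_values.items():
        if len(values) < min_cells:
            # Too few cells to describe the distribution reliably.
            raise ValueError(
                f"{param}: only {len(values)} cells, need at least {min_cells}"
            )
        summary[param] = {
            "n": len(values),
            "mean": mean(values),
            "sd": stdev(values),  # sample standard deviation
        }
    return summary
```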
Because the database holds ∼500 parameters for each of 4782 mutants, the system provides tools that assist users with data mining tasks.
Morphological data should be useful for identifying the morphological changes in particular mutants. Users can query a yeast mutant of interest using its open reading frame name or its gene name. They can also browse average shapes of the mutant, average morphological parameter values, raw and processed micrographs and lists of individual cells associated with morphological parameter values. Users can also provide a typical morphological shape or a particular mutant as a query and ask the system to search for mutants that are similar in shape to the query. This function is called ‘morphology search’ (7).
Teardrop view—juxtaposition of morphological parameter distributions
To show which morphological parameters of a particular mutant are abnormal, the system displays, for each parameter, the distribution over all mutants and highlights the value of the focal mutant in pink (see Figure 2). The distributions of all parameters are juxtaposed in parallel, so that users can grasp the overall distributions and the abnormal parameters at a glance. Parameters are colored blue or pink if their changes are statistically significant in terms of their distributions.
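The idea behind this view can be sketched in a few lines. The text does not state which significance test the system applies, so the example below assumes a simple z-score of the focal mutant against the distribution of all mutants, with a hypothetical cutoff of 3 standard deviations; the function name `teardrop_flags` and the labels are likewise illustrative.

```python
from statistics import mean, pstdev

def teardrop_flags(population, focal, z_cutoff=3.0):
    """Flag parameters in which a focal mutant deviates from all mutants.

    population: {parameter: [average value of each mutant]}
    focal:      {parameter: average value of the focal mutant}
    Returns {parameter: (z-score, "high" | "low" | "normal")}.
    """
    flags = {}
    for param, values in population.items():
        mu = mean(values)
        sigma = pstdev(values)
        # A constant parameter carries no information; treat it as normal.
        z = 0.0 if sigma == 0 else (focal[param] - mu) / sigma
        if z >= z_cutoff:
            label = "high"
        elif z <= -z_cutoff:
            label = "low"
        else:
            label = "normal"
        flags[param] = (z, label)
    return flags
```

A z-score assumes a roughly symmetric distribution; for skewed parameters a rank- or percentile-based criterion may be more appropriate.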
Mutant classification in terms of morphological parameters
Another promising application of morphological parameters is the prediction of gene functions. For instance, suppose that one is interested in finding a group of genes involved in a particular biological process such as DNA repair or cell wall construction. Users can ask the system to look for a combination of multiple morphological parameters that discriminates disruptants of genes known to be relevant to the biological process of interest from the other disruptants (see Figure 3). These morphological parameters allow us to define distances between disruptants. If a disruptant is not known to be related to any particular biological process but lies close to disruptants that are relevant to the focal biological process, the disrupted gene is potentially involved in that process.
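The following sketch illustrates the general idea under stated assumptions; it is not the algorithm used by the system, which is not described here. Parameters are scored by the standardized difference between the group mean and the mean of all other mutants, the top-scoring parameters are kept, and the remaining mutants are ranked by Euclidean distance to the group centroid in the standardized space of those parameters. All names and the toy values (`m1` to `m6`) are hypothetical.

```python
import math
from statistics import mean, pstdev

def select_discriminating_params(profiles, group, k=2):
    """Pick the k parameters that best separate `group` from the rest.

    profiles: {mutant: {parameter: value}}; group: set of mutant names.
    """
    params = list(next(iter(profiles.values())))
    scores = {}
    for p in params:
        inside = [profiles[m][p] for m in profiles if m in group]
        outside = [profiles[m][p] for m in profiles if m not in group]
        spread = pstdev(inside + outside)
        scores[p] = 0.0 if spread == 0 else abs(mean(inside) - mean(outside)) / spread
    return sorted(params, key=scores.get, reverse=True)[:k]

def rank_candidates(profiles, group, selected):
    """Rank mutants outside `group` by distance to the group centroid."""
    stats = {}
    for p in selected:
        column = [profiles[m][p] for m in profiles]
        stats[p] = (mean(column), pstdev(column) or 1.0)

    def standardized(m):
        return [(profiles[m][p] - stats[p][0]) / stats[p][1] for p in selected]

    members = [standardized(m) for m in sorted(group)]
    centroid = [mean(col) for col in zip(*members)]
    dist = {m: math.dist(standardized(m), centroid)
            for m in profiles if m not in group}
    return sorted(dist, key=dist.get)

# Toy data: m2 and m3 play the role of disruptants with a known function.
profiles = {
    "m1": {"size": 10.0, "roundness": 1.0, "bud_angle": 30.0},
    "m2": {"size": 10.5, "roundness": 1.9, "bud_angle": 31.0},
    "m3": {"size": 9.8, "roundness": 2.0, "bud_angle": 29.0},
    "m4": {"size": 10.2, "roundness": 1.1, "bud_angle": 30.0},
    "m5": {"size": 10.1, "roundness": 1.8, "bud_angle": 32.0},
    "m6": {"size": 9.9, "roundness": 0.9, "bud_angle": 28.0},
}
known_group = {"m2", "m3"}
best = select_discriminating_params(profiles, known_group, k=1)
candidates = rank_candidates(profiles, known_group, best)
```

Mutants at the head of `candidates` are the ones closest to the known group and therefore the most plausible additional members. With ∼500 parameters and few annotated genes, scoring parameters one at a time risks overfitting, so any real use would need cross-validation or a correction for multiple testing.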
CUSTOMIZATION AND DATA AVAILABILITY
To let users customize the browsing interface according to their interests, a dialog-based parameter selection page helps users choose the parameters displayed in datasheets; the choices are memorized by the system. The system also allows users to download the selected parameter values for selected mutants in XML format or in tabular form. Users can also select particular mutants of interest so that they are always shown in Teardrop View and the 2D plot.
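A downloaded table can then be analyzed offline. The sketch below assumes, purely for illustration, a tab-delimited file whose first column identifies the mutant and whose remaining columns hold numeric parameter values; the actual column layout of the SCMD download is not described here, and the function name `load_parameter_table` is hypothetical.

```python
import csv
import io

def load_parameter_table(text):
    """Parse tab-delimited text into {mutant: {parameter: value}}.

    Assumed layout: a header row, the mutant identifier in the first
    column and one numeric parameter per remaining column.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    id_column = reader.fieldnames[0]
    table = {}
    for row in reader:
        name = row.pop(id_column)
        table[name] = {param: float(value) for param, value in row.items()}
    return table
```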
UPDATES AND FUTURE DIRECTIONS
The web server currently presents morphological parameter values for disruptants of non-essential genes; mutants of lethal genes will be processed and made available in the future.
Funding to pay the Open Access publication charges for this article was provided by Japan Science and Technology Corporation.
Conflict of interest statement. None declared.