SICOP: identifying significant co-interaction patterns



INTRODUCTION
High-throughput experiments that produce interaction data require computational methods for identifying the key interaction patterns in those data. If the interactions are observed between elements of two different types, the data can be modelled as a bipartite graph in which the elements are represented by nodes and their interactions by edges. More than one type of interaction, e.g. up- and downregulation, can be represented by the edges of such a graph.
Existing tools for the analysis of bipartite graphs, such as the R packages bipartite (Dormann et al., 2009) or networksis (Admiraal and Handcock, 2008), are mainly tailored to understanding principles in community ecology. Systems biology applications pose three important challenges that these tools cannot cope with: (i) large-scale experiments that are often prone to noise, (ii) mild interaction effects that are difficult to detect and (iii) distinct types of interactions that are observed simultaneously.
A recently suggested method deals with these problems by using a network analytic approach (Uhlmann et al., 2012; Malumbres, 2012). In this procedure, the number of common interaction partners of two elements is compared with its expected value in a randomized null model that maintains the number of interaction partners of all elements. The approach adopts and extends the general procedure for assessing the statistical significance of so-called network motifs (Milo et al., 2002). The underlying assumption is that two elements with a statistically significant number of common interaction partners share the same functional role, according to the so-called guilt-by-association principle (Quackenbush, 2003). The method sketched in Figure 1 is based on the theoretical insights formulated in Zweig and Kaufmann (2011) and uses a more realistic null model than other existing approaches. The open source software SICOP (significant co-interaction patterns) implements the algorithm as an easy-to-use client-side tool. It is able to detect significant mild co-interaction effects by taking into account up to two different types of interactions.
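The quantity at the centre of the method, the number of common interaction partners of two elements, can be illustrated with a minimal Python sketch. The function name `common_partner_counts` and the element labels are hypothetical and are not taken from SICOP itself; the sketch only assumes that interactions are given as pairs of one element of each type.

```python
from itertools import combinations


def common_partner_counts(interactions):
    """Count shared partners for every pair of 'left' elements.

    interactions: iterable of (left, right) pairs, e.g. (microRNA, protein).
    Returns a dict mapping each sorted pair of left elements to the number
    of right elements that both of them interact with.
    """
    partners = {}
    for left, right in interactions:
        partners.setdefault(left, set()).add(right)
    counts = {}
    for a, b in combinations(sorted(partners), 2):
        counts[(a, b)] = len(partners[a] & partners[b])
    return counts


# Hypothetical toy data: three elements of one type, three of the other.
edges = [("m1", "p1"), ("m1", "p2"), ("m2", "p1"), ("m2", "p2"), ("m3", "p3")]
counts = common_partner_counts(edges)
```

In the method described above, each of these observed counts would then be compared with the counts obtained in the randomized null model.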

SOFTWARE FEATURES
From interaction data between elements of two distinct types, SICOP computes the significance of the number of common interaction partners for all pairs of elements of one of the two types. The most important functionalities of SICOP are as follows.
Data import from diverse input file formats. The tool accepts a list of the observed interactions stored in a text file, a matrix containing the measured levels of all interactions in a CSV file (comma-separated values) or a graph representation of the network in a GML file (graph modelling language) as produced by graph editing programs such as yEd.
Simplex and duplex network support and precomputational edge filtering. Given the experimental data, SICOP first constructs a bipartite graph model (Fig. 1a and b). If the observed interactions are assigned weights, the user may add a threshold to filter them. If there are two types of interactions, the user can treat them as a single type (simplex network data) or use both of them (duplex network data).
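As an illustration of this preprocessing step, the following Python sketch turns a signed measurement matrix into typed bipartite edges. It assumes, following Figure 1a, that the interaction type is given by the sign of the measured value, and it assumes a single symmetric threshold on the absolute value; SICOP's actual filtering options may differ. The names `build_duplex_graph` and `to_simplex` are hypothetical.

```python
def build_duplex_graph(matrix, row_names, col_names, threshold):
    """Turn a signed measurement matrix into typed bipartite edges.

    Values >= threshold become 'up' edges, values <= -threshold become
    'down' edges, and everything in between is filtered out.
    """
    edges = []
    for r, row in zip(row_names, matrix):
        for c, value in zip(col_names, row):
            if value >= threshold:
                edges.append((r, c, "up"))
            elif value <= -threshold:
                edges.append((r, c, "down"))
    return edges


def to_simplex(edges):
    """Ignore the interaction type and keep each (row, column) edge once."""
    return sorted({(r, c) for r, c, _ in edges})


# Hypothetical measurements: 2 row elements x 3 column elements.
matrix = [[0.9, -0.2, -1.1],
          [0.1, 0.7, -0.6]]
duplex = build_duplex_graph(matrix, ["m1", "m2"], ["p1", "p2", "p3"], 0.5)
simplex = to_simplex(duplex)
```

Keeping the type label on each edge allows the same data to be analysed either as a duplex network or, after calling `to_simplex`, as a simplex network.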
© The Author 2013. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Statistical significance assessment of co-interactions. SICOP detects patterns in the bipartite graph that are non-random when compared with a null model, which consists of a set of random graphs generated through a series of permutations (Gionis et al., 2007). During permutation, the following key structural properties of the graph are maintained: the number of elements and, for each element, the number of interactions of each type (Fig. 1c). In duplex networks, it is possible to differentiate between common interaction partners with respect to all combinations of the two interaction types. Taking up- and downregulation as an example, the numbers of co-up-, co-down- and antagonistically regulated partners are evaluated for all pairs of elements. The statistical significance of the individual co-regulations is then quantified by a z-score or a P-value (Fig. 1d).
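The significance assessment can be sketched in Python as follows. This is a simplified illustration for a single interaction type, not SICOP's implementation: it uses a basic edge-swap randomization that preserves the number of partners of every element, and the function names, the number of swaps per sample (`10 * len(edges)`) and the number of samples are assumptions chosen for the example. For duplex data, the same swaps would be applied separately to the edges of each type.

```python
import random
import statistics


def swap_randomize(edges, n_swaps, rng):
    """Degree-preserving randomization of a simplex bipartite edge list."""
    edges = list(edges)
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.randrange(len(edges)), rng.randrange(len(edges))
        (a, x), (b, y) = edges[i], edges[j]
        # Skip swaps that would change nothing or create a duplicate edge.
        if a == b or x == y or (a, y) in present or (b, x) in present:
            continue
        present -= {(a, x), (b, y)}
        present |= {(a, y), (b, x)}
        edges[i], edges[j] = (a, y), (b, x)
    return edges


def shared(edges, u, v):
    """Number of partners that elements u and v have in common."""
    pu = {x for a, x in edges if a == u}
    pv = {x for a, x in edges if a == v}
    return len(pu & pv)


def significance(edges, u, v, n_samples=200, seed=0):
    """Return (observed count, z-score, empirical P-value) for the pair u, v."""
    rng = random.Random(seed)
    observed = shared(edges, u, v)
    null = []
    for _ in range(n_samples):
        sample = swap_randomize(edges, 10 * len(edges), rng)
        null.append(shared(sample, u, v))
    mean, sd = statistics.mean(null), statistics.pstdev(null)
    z = (observed - mean) / sd if sd > 0 else float("nan")
    # Fraction of random graphs with at least as many common partners.
    p = sum(1 for s in null if s >= observed) / n_samples
    return observed, z, p


# Hypothetical toy network in which m1 and m2 share all three partners.
edges = [("m1", "p1"), ("m1", "p2"), ("m1", "p3"),
         ("m2", "p1"), ("m2", "p2"), ("m2", "p3"),
         ("m3", "p4"), ("m3", "p5"), ("m4", "p5"), ("m4", "p6")]
observed, z, p = significance(edges, "m1", "m2")
```

Each swap exchanges the partners of two edges, so every element keeps its number of interactions while the pairing itself is randomized. The empirical P-value is the fraction of random graphs in which the pair shares at least as many partners as in the observed graph.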
Multiple data export formats. The same edge selection options are available when exporting the data as when importing it. Thus, the user may create and store multiple networks with different threshold values corresponding to different P-values or z-scores. The obtained co-regulation networks may be exported in any of the input file formats or alternatively as GraphML.
High configurability. Besides the key functionalities described earlier, SICOP allows more confident users to modify the parameter values. The default values are based on theoretical and empirical considerations and are automatically adjusted to the size of the input data. Increasing the preset values enhances accuracy but comes at the cost of additional computational time. We refer interested readers to the manual provided with SICOP.
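Deriving several networks from one set of scored pairs can be pictured with the short Python sketch below. The function `select_edges`, the scores and the cut-off values are hypothetical and serve only to illustrate thresholding on P-values; they do not reflect SICOP's export interface.

```python
def select_edges(scored_pairs, max_p_value):
    """Keep the pairs whose empirical P-value does not exceed the cut-off."""
    return {pair: p for pair, p in scored_pairs.items() if p <= max_p_value}


# Hypothetical P-values for three pairs of elements.
scores = {("m1", "m2"): 0.01, ("m1", "m3"): 0.20, ("m2", "m3"): 0.04}

# One co-interaction network per cut-off, from strict to lenient.
networks = {cutoff: select_edges(scores, cutoff) for cutoff in (0.01, 0.05)}
```

A stricter cut-off yields a sparser network, so storing several of them lets the user compare the patterns that survive at different significance levels.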

DISCUSSION
The algorithm was applied to data from high-throughput screening experiments that show the effect of various microRNAs on a set of proteins that are highly relevant in breast cancer (Uhlmann et al., 2012). MicroRNAs have recently been recognized to act both as tumour suppressors and as oncogenes. Uncovering their targeting patterns is difficult because they are known to have only a moderate effect on the protein expression level. The extensive null model approach behind SICOP enabled the identification of key microRNAs that were shown in subsequent experiments to inhibit cell-cycle progression and proliferation. This specific result confirms that the algorithm behind SICOP is a powerful tool for the detection of statistically significant protein co-regulations. However, as described earlier, SICOP can be applied to a wide range of datasets, for instance, transcription factors binding to DNA, gene-coding RNAs interacting with co-transcriptional non-coding RNAs, genes in relation to diseases or diseases and their symptoms. Our flexible tool offers an effortless way to explore such data to its full potential, considerably reducing the number of further experiments required.

Fig. 1. Illustration of the method behind SICOP: (a) Heatmap showing the results of a hypothetical high-throughput experiment. Two types of interactions are distinguished based on the sign of their strength. (b) Bipartite graph representation of the thresholded interaction data. The two types of interactions are shaded differently. (c) Null model consisting of randomized graphs created from the original bipartite graph by permuting edges of the same type. (d) Resulting co-interaction network that contains three types of edges: two uni-type co-interaction patterns (undirected edges) and one mixed co-interaction pattern (directed edge). The edge labels show empirical P-values computed as the fraction of randomized graphs in which the given co-interaction appeared at least as often as in the original graph.