## Abstract

Multilayer relationships among entities and information about entities must be accompanied by the means to analyse, visualize and obtain insights from such data. We present open-source software (

## Introduction

Although the study of networks is old, the analysis of complex systems has benefited particularly during the last two decades from the use of networks to model large systems of interacting agents [1]. Such efforts have yielded numerous insights in many areas of science and technology [2–16].

In the case of biological networks, connections among genes, proteins, neurons and other biological entities can indicate that they are part of the same biological pathway or exhibit similar biological functions. Network representations focus on connectivity, and they have now become a paradigmatic way to investigate the organization and functionality of cells [17–26], synaptic connectivity [27–36] and more. There are also myriad applications to other types of systems (e.g. in sociology, transportation, physics and more) [1, 37–41].

In parallel, a large variety of computational techniques have been developed to analyse (and visualize) networks and the information that they encode. In biology, for example, such methods have become important tools for attempting to understand and represent cell functionality. However, although the standard network paradigm has been very successful, it has a fundamental flaw: it forces the aggregation of multilayer information to construct network representations that include only a single type of connection between pairs of entities. This can lead to misleading results, and it is becoming increasingly apparent that a more complicated representation is necessary [41].

Recently, a novel mathematical framework to model and analyse multilayer relationships and their dynamics was developed [42, 43]. In this framework, one represents the underlying network topology and interaction weights as a *multilayer network*, in which entities can exhibit different relationships simultaneously and can exist on different ‘layers’. Multilayer networks can encode much richer information than what is possible using the individual layers separately (which is what is usually done). This, in turn, provides a suitable framework for versatile and sophisticated analyses that have already been used successfully to reveal multilayer community structure [42] and to measure important nodes and the correlations between them [43–46]. However, to meet the requirements of an operational toolbox to be applied to the analysis of complex systems, it is of paramount importance to also develop open-source software to visualize multilayer networks and to represent the results of analysing such networks in a meaningful way.

Multilayer networks have already yielded fascinating insights and are experiencing burgeoning popularity. For example, there have been numerous studies to attempt to understand how interdependencies [47, 48], other multilayer structures [44–46, 49–53], dynamics [54–58] and control [59] can improve understanding of complex interacting systems. See the recent review article [41] for extensive discussions and a thorough review of results.

The use of increasingly complicated network representations has yielded a new set of challenges: how should one visualize, analyse and interpret multilayer data? Although there has been progress in numerous applications, many of the key results have concentrated on data from examples like social and transportation networks [41]. Multilayer analysis has rarely been exploited in the investigation of biological networks, even though such a perspective is clearly relevant, and we believe that the lack of appropriate software has contributed to this situation. For example, in a recent study, the genetic and protein–protein interaction networks of *Saccharomyces cerevisiae* were investigated simultaneously [26] to uncover connection patterns. Costanzo *et al.* [26] also reported that genetic interactions have an overlap of 10–20% with protein–protein interaction pairs, which is significantly higher than the 3% overlap that they expected based on a random null model. This suggests that many positive and negative interactions occur between, rather than within, complexes and pathways [26] and thereby gives an important example of how exploiting multilayer information might improve understanding of biological structure and functionality.

The aforementioned overlap is an indication of correlation between a pair of networks, and the analysis of multilayer data would benefit greatly from techniques and diagnostics that are able to exploit multiplexity (i.e. multiple different ways to interact) in available information.

## Methods

The primary contributions of the present work are to address the computational challenge of analysis and visualization of multilayer information by providing a practical methodology, and accompanying software that we call

### Visualization

In multilayer networks, nodes can exist in several layers simultaneously, and entities that exist in multiple layers (such nodes have ‘replicas’ on other layers) are connected to each other via interlayer edges. One can visualize a multilayer network in

The

*ordinal* and

*categorical*. In ordinal multilayer networks, interlayer edges exist only between layers that are adjacent to each other with respect to some criterion (e.g. temporal ordering). In contrast, categorical multilayer networks include interlayer edges between replica nodes from every pair of layers. For the sake of simplicity, we illustrate
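The distinction between ordinal and categorical coupling can be sketched in a few lines of Python. This is only an illustration of the two definitions given above, not the software's implementation: the function name `interlayer_edges`, its arguments, and the toy data are all hypothetical.

```python
from itertools import combinations


def interlayer_edges(node_layers, layer_order, coupling="categorical"):
    """Return interlayer edges between replicas of the same node.

    node_layers: dict mapping node -> set of layers in which the node exists
    layer_order: list of layers in their ordering (e.g. temporal)
    coupling:    "ordinal" (adjacent layers only) or "categorical" (all pairs)

    All names here are illustrative assumptions, not part of any library.
    """
    if coupling == "ordinal":
        # Only layers that are adjacent with respect to layer_order.
        layer_pairs = list(zip(layer_order, layer_order[1:]))
    elif coupling == "categorical":
        # Every pair of layers.
        layer_pairs = list(combinations(layer_order, 2))
    else:
        raise ValueError("coupling must be 'ordinal' or 'categorical'")

    edges = []
    for node, layers in node_layers.items():
        for a, b in layer_pairs:
            # An interlayer edge needs a replica of the node in both layers.
            if a in layers and b in layers:
                edges.append(((node, a), (node, b)))
    return edges


# Toy example: node "v" has no replica in layer "t2".
node_layers = {"u": {"t1", "t2", "t3"}, "v": {"t1", "t3"}}
layer_order = ["t1", "t2", "t3"]
ordinal = interlayer_edges(node_layers, layer_order, "ordinal")
categorical = interlayer_edges(node_layers, layer_order, "categorical")
```

In this toy example, node `v` contributes no ordinal interlayer edge because its two replicas are not in adjacent layers, whereas categorical coupling connects them directly.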

For instance, let us examine the genetic-interaction and profile-correlation networks of a cell as different layers of a multilayer network. Such networks were aggregated into a single network in Ref. [26]. In Fig. 1(A), we show multilayer visualizations that we created using

*Caenorhabditis elegans*. In this example, each layer corresponds to a different type of synaptic connection [24].

In panels (A) and (C) of Fig. 1, we use a layout in which the positions of the nodes are the same in each layer. We determine the positions of nodes by combining two of the standard force-directed algorithms available in

One can also use

### Compression of layers and reducibility dendrograms

An important open question is the determination of how much information is necessary to accurately represent the structure of multilayer systems and whether it is possible to aggregate some layers without loss of information. It was shown recently that it is possible to compress the number of layers in multilayer networks in a way that minimizes information loss by using an information-theoretic approach [64]. The methodology of [64], which we implemented in

The compression procedure from [64] proceeds as follows. For each pair of layers in the original multilayer network,

One uses the distances between every pair of layers as the components of a matrix, and one can then perform hierarchical clustering [68] using any desired method to produce a dendrogram that indicates the relatedness of the information in the different layers. In

### Annular visualization of multilayer information

It is a challenging problem to represent, visualize and analyse the wealth of information encoded in the multilayer structure of networks in a compact way. Preserving more information by using multilayer networks rather than ordinary networks complicates the visualization and analysis even further. However, this complication is necessary, because otherwise one might end up with misleading or even incorrect results [41]. We developed the

To give a concrete example, many researchers are interested in ranking the relative importance of nodes (and other network structures), which traditionally is accomplished using various ‘centrality’ measures. Centralities have been calculated for single-layer networks for several decades [1, 37], and numerous notions of centrality are now also available for multilayer networks [41, 44]. It is therefore necessary to develop visualization tools that make it possible to compare such a wealth of diagnostics to each other in a compact, meaningful way. For example, it is often worthwhile to focus attention on one descriptor and compare the values obtained in each layer separately to the values obtained from the multilayer network and its aggregations. This is easy to do using the

We will now illustrate our annular visualization (see Figs. 4, 6, 8 and 10) using the example of multilayer centrality measures. Suppose that we have different arrays of information, where one should think of each array as having resulted from the calculation of some centrality diagnostic on a multilayer network. We visualize each array using a ring. The angle indicates node identity (regardless of the layer or layers in which it occurs). We bin the centrality values—e.g. either linearly or logarithmically—and we assign a colour to each bin to encode its value. Both the type of binning and the colour scheme are customizable in

One can also use the same principles when fixing some centrality descriptor and letting the rings correspond to the layers in a network, the multilayer network and an aggregated network (see Section 3). Such a plot might help to reveal, for instance, whether the large centrality of a node in a multilayer network is due primarily to its centrality value in a specific layer or if the aggregated network provides a reasonable proxy for such multilayer structure.
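The data preparation behind such an annular plot can be sketched as follows. This is a minimal illustration under our own assumptions: the helper names `bin_values` and `node_angles`, the number of bins, and the toy centrality values are hypothetical and are not taken from the software.

```python
import math


def bin_values(values, n_bins=5, scale="linear"):
    """Assign each value to a bin index in 0..n_bins-1 (one colour per bin)."""
    if scale == "log":
        if min(values) <= 0:
            raise ValueError("logarithmic binning requires positive values")
        values = [math.log(v) for v in values]
    elif scale != "linear":
        raise ValueError("scale must be 'linear' or 'log'")
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so they share a single bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value is clamped into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def node_angles(nodes):
    """Give each node one angle (in radians) that is shared by all rings."""
    step = 2 * math.pi / len(nodes)
    return {node: k * step for k, node in enumerate(nodes)}


# Toy example: two rings (two hypothetical descriptors) over the same nodes.
nodes = ["a", "b", "c", "d"]
rings = {
    "descriptor_1": bin_values([1, 2, 50, 1000], n_bins=3, scale="log"),
    "descriptor_2": bin_values([1, 2, 50, 1000], n_bins=3, scale="linear"),
}
angles = node_angles(nodes)
```

Because the angle depends only on node identity, a node occupies the same angular position in every ring, which is what allows rings to be compared by eye. The toy data also show why the choice of binning matters: heavy-tailed values collapse into the lowest bin under linear binning but spread across bins under logarithmic binning.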

## Analyses of empirical multilayer networks

To demonstrate the ability of

*Xenopus laevis* and show a network visualization in Fig. 3(C). We give results of computations using

One can examine the global organization of nodes into modules (i.e. ‘communities’) through an algorithmic calculation of community structure [70, 71]. For example, one can obtain dense communities in multilayer networks by optimizing a multilayer generalization of the modularity quality function [42]. To do this, one takes into account both intralayer and interlayer edges, and one seeks densely connected sets of nodes (i.e. communities) that are sparsely connected to each other when compared with some multilayer random-graph (null) model [41, 42, 60]. See Fig. 3(A) for a visualization of communities in *X. laevis* and Section 3.1 for other examples.
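For orientation, one widely used form of the multilayer modularity function from the literature on [42] is reproduced below. We give it as an illustration of the quantity being optimized, not as a statement of what any particular software computes. Here $$A_{ijs}$$ is the intralayer edge weight between nodes $$i$$ and $$j$$ on layer $$s$$, $$k_{is} = \sum_j A_{ijs}$$ is the strength of node $$i$$ on layer $$s$$, $$2m_s = \sum_i k_{is}$$, $$\gamma_s$$ is a resolution parameter, $$C_{jsr}$$ is the interlayer coupling between the replicas of node $$j$$ on layers $$s$$ and $$r$$, $$g_{is}$$ is the community assignment of node $$i$$ on layer $$s$$, and $$2\mu = \sum_{jr} \left( k_{jr} + \sum_s C_{jsr} \right)$$.

$$Q = \frac{1}{2\mu} \sum_{ijsr} \left[ \left( A_{ijs} - \gamma_s \frac{k_{is} k_{js}}{2 m_s} \right) \delta_{sr} + \delta_{ij} C_{jsr} \right] \delta(g_{is}, g_{jr})$$

The first term compares intralayer edges with a null model within each layer, and the second term rewards assigning replicas of the same node to the same community across layers.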

As we discussed previously, one can quantify the importance of a node by using various diagnostics to measure ‘centrality’. One calculates such a centrality (and a corresponding rank order) for each node by using multilayer generalizations of centrality measures [41, 43, 44]. The software

Researchers are often also interested in considering a ‘compressed version’ of multilayer data sets that preserve as much information as possible without altering the primary descriptors. For such scenarios, it is possible to use the compression procedure discussed in Section 2.2 to identify the layers of a multilayer network that are providing redundant information [64] (see Fig. 3(D)).

In Fig. 3(E), we show three correlation measures for multilayer networks: (left) mean edge overlap, (centre) degree–degree Pearson correlation coefficient and (right) degree–degree Spearman correlation coefficient. In this example, the degree–degree Pearson and Spearman correlation coefficients between layers quantify the tendency of nodes to be hubs in different layers simultaneously. The
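The three interlayer correlation measures can be sketched for a pair of layers as follows. This is a minimal illustration under stated assumptions: layers are undirected and unweighted edge lists, and edge overlap is taken to be the fraction of edges shared by both layers out of all edges present in either layer. That is one possible definition, and the software may define it differently. All function names and the toy data are hypothetical.

```python
from math import sqrt


def degrees(edges, nodes):
    """Degree of each node (in the order of `nodes`) in one undirected layer."""
    deg = {n: 0 for n in nodes}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return [deg[n] for n in nodes]


def pearson(x, y):
    """Pearson correlation coefficient (undefined if either input is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(x):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def edge_overlap(edges_a, edges_b):
    """Assumed definition: shared edges divided by edges present in either layer."""
    a = {frozenset(e) for e in edges_a}
    b = {frozenset(e) for e in edges_b}
    return len(a & b) / len(a | b)


# Toy example with two layers on the same four nodes.
nodes = ["a", "b", "c", "d"]
layer_1 = [("a", "b"), ("a", "c"), ("a", "d")]
layer_2 = [("a", "b"), ("b", "c"), ("a", "d")]
overlap = edge_overlap(layer_1, layer_2)
r_pearson = pearson(degrees(layer_1, nodes), degrees(layer_2, nodes))
r_spearman = spearman(degrees(layer_1, nodes), degrees(layer_2, nodes))
```

Computing these quantities for every pair of layers yields symmetric matrices, which is a natural way to display pairwise correlation measures of this kind.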

To summarize all of the information that one obtains from calculations like the ones above in a compact figure, we use an annular visualization (see Section 2.3) that makes it easier to spot patterns and to deduce qualitative information about multilayer data. In Fig. 4 (see the panel labelled ‘Multiplex’), we show an example for centrality diagnostics, which measure the importance of nodes in various ways. Each ring indicates a centrality measure, and the angle determines the identity of a node in a network, regardless of the layer(s) in which it exists. One can use the same principles when fixing some centrality descriptor and letting the rings correspond to the layers in a network, the multilayer network and an aggregated network (see the other panels in Fig. 4). For the case of layers, one calculates a centrality measure for each layer separately without accounting for multilayer structure. For instance, it is evident that rings 3 (‘DirInt’ layer) and 5 (‘PhAssoc’ layer) are negatively correlated in the case of strength centrality because nodes tend to have opposite colours, whereas rings 6 (aggregated network) and 7 (multiplex network) are positively correlated, as expected for strength centrality. Our annular representation makes it easy to see similarity (or dissimilarity) in rank orderings according to different diagnostics. For example, the ring patterns reveal that physical association and direct interaction are dominant and determine the multilayer strength in the depicted example. In other cases (see Section 3.1), the ranking by some centrality measure in the multilayer network is poorly correlated with the ranking in either an aggregated network or individual layers considered separately. This underscores the value of using a multilayer framework for the calculation of the most central proteins (and, more generally, for determining which entities in complex systems are most important).

### Analysis of other empirical multilayer networks

In this section, we present multilayer analyses of three additional biological systems to illustrate the power of

As we did for the case of *X. laevis*, we include two figures for each example. In the first set of figures (see Figs. 5, 7 and 9), we show the following information:

*Panel (A):* Multilayer community structure from modularity maximization [42]. The colour of each node encodes its community assignment in a multilayer-network visualization. For comparison, we also show the results (and corresponding visualization) of community detection on an aggregated network, which we obtain by summing the corresponding intralayer edge weights of all layers. (In other words, if $$A_{ijs}$$ gives the edge weight between nodes $$i$$ and $$j$$ on layer $$s$$, then we obtain an aggregated edge weight $$W_{ij}$$ between nodes $$i$$ and $$j$$ by summing over $$s$$.)

*Panel (B):* Multilayer PageRank centrality [44]. We again use a multilayer-network visualization. We label the top five nodes from a ranking according to multilayer PageRank centrality. For comparison, we also show the results of PageRank centrality calculations on the aforementioned aggregated network.

*Panel (C):* Edge-coloured multigraph visualization of the network. We colour edges according to the layer to which they belong. We colour the nodes according to their layer (or layers); if a node exists on multiple layers, then we distribute its corresponding colours evenly.

*Panel (D):* Compressibility analysis and corresponding reducibility dendrogram [64]. We show the distance matrix and the corresponding dendrogram, which we obtain using Ward hierarchical clustering.

*Panel (E):* Measures of correlation between layers: (left) mean edge overlap, (centre) degree–degree Pearson correlation coefficient and (right) degree–degree Spearman correlation coefficient.
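The aggregation used for comparison in panels (A) and (B), $$W_{ij} = \sum_s A_{ijs}$$, can be sketched as follows. This is a minimal illustration that assumes undirected layers stored as dictionaries of edge weights; the function name `aggregate`, the data layout, and the node names are hypothetical.

```python
def aggregate(layers):
    """Sum intralayer edge weights over layers: W_ij = sum over s of A_ijs.

    layers: dict mapping layer name -> dict {(i, j): weight}
    Edges are treated as undirected, so (i, j) and (j, i) are merged.
    """
    aggregated = {}
    for edges in layers.values():
        for (i, j), weight in edges.items():
            key = tuple(sorted((i, j)))  # canonical orientation of the pair
            aggregated[key] = aggregated.get(key, 0) + weight
    return aggregated


# Toy example with two layers and hypothetical node names.
layers = {
    "DirInt": {("p1", "p2"): 1.0, ("p2", "p3"): 2.0},
    "PhAssoc": {("p2", "p1"): 0.5},
}
W = aggregate(layers)
```

The aggregated network no longer records which layer contributed a given weight, which is why results computed on it can differ from results computed on the multilayer network.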

In the second set of figures (see Figs. 6, 8 and 10), we show our annular visualization for the centrality descriptors. We specify the order of the rings in the list of labels on the right of each plot. In each case, the top label refers to the innermost ring and the bottom label refers to the outermost ring.

In each panel titled ‘Multiplex’, we consider a multilayer network. Each ring corresponds to a different centrality descriptor.

In the other panels, we consider a specific centrality descriptor (which we specify in the title of the panel). Each ring encodes the values of that descriptor, which we calculate in each layer separately. We also include rings for the calculation of the corresponding centrality diagnostic in the multilayer network and in its aggregation to a single-layer weighted network.

## Conclusion

In the current era of ‘big data’, there is now an intense deluge of multilayer data. To avoid throwing away important information or obtaining misleading results, it is increasingly crucial to use methods that exploit multilayer structure. In this paper, we present new software and associated methodology that exploits the new paradigm of multilayer networks, and we illustrate how it can be used to analyse and visualize several examples. Our software,

## Funding

All authors were supported by the European Commission FET-Proactive project PLEXMATH (Grant No. 317614; the project website is http://www.plexmath.eu/). A.A. also acknowledges the financial support from the Generalitat de Catalunya 2009-SGR-838, the ICREA Academia, and the James S. McDonnell Foundation. M.A.P. acknowledges a grant (EP/J001759/1) from the EPSRC.

## Appendix: Technical details about muxViz

We developed

The

Using

The

## Acknowledgement

The authors thank Serafina Agnello for support with graphics.

## References
