## Abstract

In most natural and engineered systems, a set of entities interact with each other in complicated patterns that can encompass multiple types of relationships, change in time and include other types of complications. Such systems include multiple subsystems and layers of connectivity, and it is important to take such ‘multilayer’ features into account to try to improve our understanding of complex systems. Consequently, it is necessary to generalize ‘traditional’ network theory by developing (and validating) a framework and associated tools to study multilayer systems in a comprehensive fashion. The origins of such efforts date back several decades and arose in multiple disciplines, and now the study of multilayer networks has become one of the most important directions in network science. In this paper, we discuss the history of multilayer networks (and related concepts) and review the exploding body of work on such networks. To unify the disparate terminology in the large body of recent work, we discuss a general framework for multilayer networks, construct a dictionary of terminology to relate the numerous existing concepts to each other and provide a thorough discussion that compares, contrasts and translates between related notions such as multilayer networks, multiplex networks, interdependent networks, networks of networks and many others. We also survey and discuss existing data sets that can be represented as multilayer networks. We review attempts to generalize single-layer-network diagnostics to multilayer networks. We also discuss the rapidly expanding research on multilayer-network models and notions like community structure, connected components, tensor decompositions and various types of dynamical processes on multilayer networks. We conclude with a summary and an outlook.

## Introduction

Network theory is an important tool for describing and analysing complex systems throughout the social, biological, physical, information and engineering sciences [1–3]. Originally, almost all studies of networks employed an abstraction in which systems are represented as ordinary graphs [4]: the ‘nodes’ (or ‘vertices’) of the graphs represent some entity or agent, and a tie between a pair of nodes is represented using a single, static, unweighted ‘edge’ (or ‘link’). Self- and multi-edges were also typically ignored. Although this approach is naive in many respects, it has been extremely successful. For example, it has been used to illustrate that many real networks possess a heavy-tailed degree distribution [5,6], exhibit the small-world property [7,8], contain nodes that play central roles [2,1] and/or have modular structures [9–11].

As research on complex systems has matured, it has become increasingly essential to move beyond simple graphs and investigate more complicated but more realistic frameworks. For example, edges often exhibit heterogeneous features: they can be directed [12,1,2], have different strengths (i.e. ‘weights’) [2,13,14], exist only between nodes that belong to different sets (e.g. bipartite networks) [1,2,15] or be active only at certain times [16,17]. Most recently, there have been increasingly intense efforts to investigate networks with multiple types of connections (see Section 2.5) and so-called ‘networks of networks’^{1} [19] (see Section 2.4). Such systems were examined decades ago in disciplines like sociology and engineering, but the explosion of attempts to develop frameworks to study multilayer complex systems and to generalize a large body of familiar tools from network science is a recent phenomenon.^{2}

In social networks, one can categorize edges based on the nature of the relationships (i.e. ties) or actions that they represent [2,21,22]. Reducing a social system to a network in which actors are connected in a pairwise fashion by only a single type of relationship is often an extremely crude approximation of reality. As a result, sociologists recognized decades ago that it is crucial to study social systems by constructing multiple social networks using different types of ties among the same set of individuals [2,23].^{3} For example, consider the sociograms^{4} that were drawn in the 1930s to represent social networks in a bank-wiring room [26]. These sociograms depicted relations between 14 individuals via 6 different types of social interactions (see Fig. 6(b)). In the sociology literature, networks in which each edge is categorized by its type are called ‘multiplex networks’ [27,28] or ‘multirelational networks’ [2]. (Such networks are also said to possess ‘multi-stranded’ relationships [29].) Social networks also often include several types of nodes (e.g. males and females) or hierarchical structures (e.g. individuals are part of organizations), which have been studied using ‘multilevel networks’ (see Section 2.8). The notion of a ‘network of networks’ also dates at least as far back as 1973 [30]. The tools that have been developed to investigate multilayer social networks include exponential random graph models (ERGMs) [31,32], meta-networks and meta-matrices [33,34], and methods for identifying social roles using blockmodelling and relational algebras [35–41].

In the computer-science and computational linear-algebra communities, tensor-decomposition methods [42,43] and multiway data analysis [44] have been used to study various types of multilayer networks (see Sections 4.2.4 and 4.5.2). These types of methods are based on representing multilayer networks as adjacency tensors of ‘rank’^{5} higher than 2 (i.e. of ‘order’ higher than 2) and then applying machinery that has been developed for tensor decompositions. Perhaps the most widespread methods that use this approach are generalizations of the singular value decomposition (SVD) [45], and these and other tools have been extremely successful in many applications [43]. For example, tensor-decomposition and multiway-data-analysis methods can be used to extract communities (i.e. sets of nodes that are connected densely to each other) [42] or to rank nodes [46,47] in multilayer networks. A clear benefit of a tensor representation is that one can directly apply methods from the tensor-analysis literature to multilayer networks—e.g. by using dynamic tensor analysis [48] to study multiplex networks that change in time.

Networked systems that cannot be represented as traditional graphs have also been studied from a data-mining perspective. For example, heterogeneous (information) networks were developed as a general framework to take into account multiple types of nodes and edges [49–51]. Similarly, one can use meta-matrices to conduct a dynamic network analysis [52] that incorporates temporal and spatial information, node attributes and types, and other types of data about social networks in the same framework. Meta-matrices have been employed in the context of ‘organizational theory’, as organizations, people, resources and other types of entities are all interconnected [33,34].

Interconnected systems have been examined in the engineering literature as a source of cascading failures [53–55]. Analogous to the notion of ‘systemic risk’ in financial systems, increasing connectivity—including the interconnectedness of different systems in an infrastructure—has the potential to increase the occurrence of large-scale events. In the last several years, these ideas have been formalized using interacting networks (and interdependent networks) [56,57]. For example, it has been shown (especially using percolation processes) that interconnected systems can react to random failures in a manner that is different from ‘monoplex’ (i.e. single-layer) networks. For some types of cascading-failure processes, an interdependent system can exhibit a ‘first-order’ (i.e. discontinuous) phase transition instead of the ‘second-order’ (i.e. continuous) phase transitions that are typical for monoplex systems [57,58]. We will discuss this issue in more detail in Section 4.6.2.

In the last couple of years, it has suddenly become very fashionable to study networks with multiple layers (or multiple types of edges) and networks of networks.^{6} Unfortunately, the sudden and immense explosion of papers on multilayer networks has produced an equally immense explosion of disparate terminology, and the lack of a consensus (or even generally accepted) set of terminology and mathematical framework for studying multilayer networks is extremely problematic.^{7} Additionally, research on generalizing monoplex-network concepts such as degree, transitivity, centrality and diffusion is only in its infancy. We also expect that it will be necessary to define many concepts that are intrinsic to multilayer networks.

In this paper, we present a general definition of multilayer networks that can be used to represent most types of complex systems that consist of multiple networks or include disparate and/or multiple interactions between entities. We review the existing literature and find a natural mapping from each type of network to this multilayer-network representation, and we classify the numerous existing notions of multilayer networks based on the types of constraints that they impose on this multilayer-network representation (see Table 1).

Name | Aligned | Disj. | Eq. Size | Diag. | Lcoup. | Cat. | $$|L|$$ | $$d$$ | Example refs. |
---|---|---|---|---|---|---|---|---|---|

Multilayer network | ✓ | ✓ | ✓ | Any | 1 | [68] | |||

✓$$\dagger $$ | ✓$$\dagger $$ | Any | 1 | [67] | |||||

Multiplex network | ✓$$\dagger $$ | ✓$$\dagger $$ | ✓ | Any | 1 | [69,67] | |||

✓ | ✓ | ✓ | ✓ | ✓ | Any | 1 | [70–76] | ||

✓ | ✓ | ✓ | ✓ | ✓ | 2 | 1 | [77–79] | ||

✓ | ✓ | ✓ | Any | 1 | [80] | ||||

✓ | ✓ | ✓ | ✓ | Any | 1 | [81–83] | |||

Multivariate network | ✓ | ✓ | ✓ | ✓ | ✓ | Any | 1 | [31] | |

Multinetwork | ✓ | ✓ | ✓ | ✓ | ✓ | Any | 1 | [84] | |

✓ | ✓ | ✓ | ✓ | ✓ | Any | 2 | [85] | ||

Multirelational network | ✓ | ✓ | ✓ | ✓ | ✓ | Any | 1 | [2,50,86,87] | |

Multirelational data | ✓ | ✓ | ✓ | ✓ | ✓ | Any | 1 | [88,89] | |

Multilayered network | ✓ | ✓ | ✓ | ✓ | ✓ | Any | 1 | [83,90–92] | |

Multidimensional network | ✓ | ✓ | ✓ | ✓ | ✓ | Any | 1 | [93–99] | |

✓ | ✓ | ✓ | ✓ | ✓ | Any | 3 | [100] | ||

Multislice network | ✓$$\dagger $$ | ✓$$\dagger $$ | ✓ | Any | 1 | [66,101–103] | |||

Multiplex of interdependent networks | ✓ | ✓ | ✓ | ✓ | ✓ | Any | 1 | [104] | |

Hypernetwork | ✓ | ✓ | ✓ | ✓ | ✓ | Any | 1 | [105,106] | |

Overlay network | ✓ | ✓ | ✓ | ✓ | ✓ | 2 | 1 | [107,108] | |

Composite network | ✓ | ✓ | ✓ | ✓ | ✓ | 2 | 1 | [109] | |

Multilevel network** | ✓ | Any | 1 | [110,111] | |||||

✓ | ✓ | ✓ | Any | 1 | [80,112] | ||||

Multiweighted graph | ✓ | ✓ | ✓ | ✓ | ✓ | Any | 1 | [113] | |

Heterogeneous network | ✓ | 2 | 1 | [49,50] | |||||

Multitype network | ✓ | Any | 1 | [114,115,65] | |||||

Interconnected networks | ✓ | ✓ | 2 | 1 | [116,117] | ||||

✓ | 2 | 1 | [118,119] | ||||||

Interdependent networks* | ✓ | ✓ | 2 | 1 | [57] | ||||

* | ✓ | 2 | 1 | [120] | |||||

✓ | 2 | 1 | [121] | ||||||

✓ | 2 | 1 | [122,123] | ||||||

✓ | ✓ | ✓ | ✓ | ✓ | Any | 1 | [124] | ||

Partially interdependent networks* | ✓ | 2 | 1 | [125] | |||||

Network of networks* | ✓ | Any | 1 | [126] | |||||

Coupled networks | ✓ | ✓ | ✓ | Any | 1 | [127] | |||

Interconnecting networks | ✓ | ✓ | ✓ | 2 | 1 | [128] | |||

Interacting networks | ✓ | Any | 1 | [56,129] | |||||

✓ | 2 | 1 | [122] | ||||||

Heterogeneous information network** | Any | 2 | [51,130–132] | ||||||

✓ | Any | 1 | [133] | ||||||

Meta-matrix, meta-network | Any | 2 | [34,134,135] |

The framework that we advocate is able to handle networks with multiple modes of multiplexity (e.g. networks that are both multiplex and temporal [85,67]), although it is important to note that the vast majority of the scholarly literature has been concerned thus far with networks that possess only a single ‘dimension’ (i.e. what we call an ‘aspect’) of layers. As discussed in Ref. [67], a multilayer network with a single aspect *already* yields the inherent ‘new physics’ of multiplexity, though writing down a framework that explicitly enumerates more aspects is convenient for expository purposes (in particular, to connect with terminology that has already been introduced into the literature) and to draw additional connections with applications. In this review, we concentrate primarily on multiplex networks that are represented as edge-coloured multigraphs or sequences of networks, but we also give some attention to structures such as interdependent networks and networks of networks. One can also view a temporal network as a type of multilayer network, but we almost always leave them out of our discussion because they are reviewed elsewhere [16]. We also briefly discuss connections between multilayer networks and network structures such as hypergraphs, multipartite networks, and networks that are both node-coloured and edge-coloured.

The remainder of this paper is organized as follows. In Section 2, we present a general formulation for multilayer networks and map all of the reviewed types of multilayer networks to this general framework. We are thereby able to translate between the many existing notions for studying multilayer networks. In Section 3, we review existing data sets that have been examined using multilayer-network frameworks and discuss other types of data that would be useful to study from such a perspective. In Section 4, we examine models of multilayer networks, methods and diagnostics that have been introduced (or generalized from single-layer networks) to analyse and measure the properties of multilayer networks, and dynamical processes on multilayer networks. We conclude in Section 5. We also include a glossary of important terms in the Appendix.

## Multilayer networks

We start by presenting the most general notion of a multilayer-network structure that we will use in this article and by defining various constraints for that structure. We then show how this structure can be represented as an adjacency tensor [67] and how one can reduce the rank (i.e. order) of such a tensor by constraining the space of possible multilayer networks or by ‘flattening’ the tensor. Taken to its extreme, such a flattening process yields ‘supra-adjacency matrices’ (i.e. ‘super-adjacency matrices’) [136,69,137], which have the advantage over tensors of being able to represent missing nodes in a convenient way. (When implementing methods for computation, most people are also much more familiar with working with matrices than with tensors.) We then discuss numerous multilayer-network structures—multiplex networks, networks of networks, etc.—that have been formulated in the literature, and we show how they can be represented using our formulation of multilayer networks. Most of the ways of representing these network structures as general multilayer networks only cover a subset of all of the possible multilayer networks (see Fig. 1), and these subsets can be characterized by the constraints that they satisfy (i.e. by the properties of these constructions). In Table 1, we summarize the properties of various multilayer-network structures from the literature when they are represented using the general multilayer-network structure. Finally, we discuss the relationship between multilayer networks and hypergraphs, temporal networks and certain other types of networks.

### General form

A graph (i.e. a single-layer network) is a tuple $$G=(V,E)$$, where $$V$$ is the set of nodes and $$E \subseteq V\times V$$ is the set of edges that connect pairs of nodes [4]. If there is an edge between a pair of nodes, then those nodes are *adjacent* to each other. This edge is *incident* to each of the two nodes, and two edges that are incident to the same node are also said to be ‘incident’ to each other.

To represent systems that consist of networks at multiple levels or with multiple types of edges (or with other similar features), we consider structures that have *layers* in addition to nodes and edges. Our goal is to start from the most general structure of this kind and to recover existing notions of multilayer (and multiplex, etc.) networks by introducing relevant limitations and constraints. In our most general multilayer-network framework, we allow each node to belong to any subset of the layers, and we are able to consider edges that encompass pairwise connections between all possible combinations of nodes and layers. (One can further generalize this framework to consider hyperedges that connect more than two nodes.) That is, a node $$u$$ in layer $$\boldsymbol {\alpha }$$ can be connected to any node $$v$$ in any layer $$\boldsymbol {\beta }$$. Moreover, we want to consider ‘multidimensional’ layer structures in order to include every type of multilayer network construction that we have encountered in the literature. For example, one ‘dimension’, which we henceforth call an *aspect* (but for which the term ‘feature’ might also be reasonable), of a layer might be the type of an edge and another aspect might be the time at which an edge is present. The above use of the word ‘dimension’ follows its ordinary English meaning of aspect or feature, but the standard use of the moniker ‘dimension’ as jargon in mathematics and physics compels us to use different terminology. Additionally, in the social-networks literature, one might discuss different ‘dimensions’ of interactions between people (friendship, family, etc.), so that a dimension would then correspond to a layer in a multilayer network. We wish to avoid this terminology clash as well.

We will now give a precise definition of a multilayer network based on our above description.^{8} A multilayer network has a set of nodes $$V$$ just like a normal network (i.e. a graph). In addition, we need to have layers. However, because we want to be able to include multiple aspects in a multilayer network, we cannot restrict ourselves to having a single set of layers. For example, in a network in which the first aspect is interaction type and the second one is time, we need one set of layers for interaction types and a second set of layers for time (e.g. for time stamps). To avoid confusion, we use the term ‘elementary layer’ (see Fig. 2) for an element of *one* of these sets and the term ‘layer’ to refer to a combination of elementary layers from all aspects. In the previous example, an interaction type and a time stamp are both examples of an elementary layer, and a combination of an interaction type and a time stamp constitutes a layer. A multilayer network can have any number $$d$$ of aspects, and we need to define a sequence $$\textbf {L}=\{ L_a \}_{a=1}^d$$ of sets of elementary layers such that there is one set of elementary layers^{9} $$L_a$$ for each aspect $$a$$.

Using the sequence of sets of elementary layers, we can construct a set of layers in a multilayer network by assembling a set of all of the combinations of elementary layers using a Cartesian product $$L_1 \times \dots \times L_d$$. We want to allow nodes to be absent in some of the layers. That is, for each choice of a node and layer, we need to indicate whether the node is present in that layer. To do so, we first construct a set $$V \times L_1 \times \dots \times L_d$$ of all of these combinations and then define a subset $$V_M \subseteq V \times L_1 \times \dots \times L_d$$ that contains only the node-layer combinations in which a node is present in the corresponding layer.^{10} We will often use the term *node-layer tuple* (or simply *node-layer*) to indicate a node that exists on a specific layer. Thus, the node-layer $$(u,\alpha _1,\dots ,\alpha _d) $$ represents node $$u$$ on layer $$(\alpha _1,\dots ,\alpha _d)$$.
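The construction of the layer set and of $$V_M$$ can be sketched in a few lines of Python. This is only an illustration under assumed toy data: the node names, the two aspects (interaction type and time stamp) and the rule that removes node `"w"` at time stamp 2 are all hypothetical and do not come from any data set discussed in this review.

```python
from itertools import product

# Hypothetical toy data with d = 2 aspects.
V = {"u", "v", "w"}                       # node set V
L = [{"friend", "work"}, {1, 2}]          # sequence L = (L_1, L_2) of elementary-layer sets

# Set of layers: the Cartesian product L_1 x ... x L_d.
layers = set(product(*L))

# All candidate node-layer tuples: V x L_1 x ... x L_d.
all_node_layers = {(u, *alpha) for u in V for alpha in layers}

# V_M keeps only the tuples in which the node is present in the layer.
# Assumption for illustration: node "w" is absent at time stamp 2.
V_M = {nl for nl in all_node_layers if not (nl[0] == "w" and nl[2] == 2)}

print(len(layers), len(all_node_layers), len(V_M))
```

Storing each node-layer as a flat tuple $$(u,\alpha _1,\dots ,\alpha _d)$$ mirrors the notation in the text; other encodings (e.g. a pair of a node and a layer tuple) would work equally well.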

In a multilayer network, we need to define connections between pairs of node-layer tuples. As with monoplex networks, we will use the term *adjacency* to describe a direct connection via an edge between a pair of node-layers and the term *incidence* to describe the connection between a node-layer and an edge. Two edges that are incident to the same node-layer are also ‘incident’ to each other. We want to allow all of the possible types of edges that can occur between any pair of node-layers—including ones in which a node is adjacent to a copy of itself in some other layer as well as ones in which a node is adjacent to some other node from another layer. In normal networks (i.e. graphs), the adjacencies are defined by an edge set $$E \subseteq V \times V$$, in which the first element in each edge is the starting node and the second element is the ending node. In multilayer networks, we also need to specify the starting and ending layers for each edge. We thus define an edge set $$E_M$$ of a multilayer network as a set of pairs of possible combinations of nodes and elementary layers. That is, $$E_M \subseteq V_M \times V_M$$.

Using the components that we set up above, we define a *multilayer network* as a quadruplet $$M=(V_M,E_M,V,\textbf {L})$$. See Fig. 2(a) for an illustrative example. Note that if the number of aspects is zero (i.e. if $$d=0$$), then the multilayer network $$M$$ reduces to a *monoplex* (i.e. single-layer) network. In that case, $$V_M = V$$, so the set $$V_M$$ becomes redundant. (By convention, the product term in the set $$V \times L_1 \times \dots \times L_d$$ does not exist if $$d = 0$$.)
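As a minimal sketch, the quadruplet can be held in a small Python container that checks the two subset conditions $$V_M \subseteq V \times L_1 \times \dots \times L_d$$ and $$E_M \subseteq V_M \times V_M$$. The class name `MultilayerNetwork` and the example data are assumptions made for illustration; they are not taken from any existing library.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class MultilayerNetwork:
    """Hypothetical container for the quadruplet M = (V_M, E_M, V, L)."""
    V_M: frozenset   # node-layer tuples (u, alpha_1, ..., alpha_d)
    E_M: frozenset   # ordered pairs of node-layer tuples
    V: frozenset     # nodes
    L: tuple         # one frozenset of elementary layers per aspect

    def __post_init__(self):
        # V_M must be a subset of V x L_1 x ... x L_d.
        full = {(u, *alpha) for u in self.V for alpha in product(*self.L)}
        if not self.V_M <= full:
            raise ValueError("V_M contains an unknown node or layer")
        # E_M must be a subset of V_M x V_M.
        if any(a not in self.V_M or b not in self.V_M for a, b in self.E_M):
            raise ValueError("E_M refers to a node-layer outside V_M")


# One aspect (d = 1) with elementary layers "x" and "y".
M = MultilayerNetwork(
    V_M=frozenset({("u", "x"), ("v", "x"), ("u", "y")}),
    E_M=frozenset({(("u", "x"), ("v", "x")), (("u", "x"), ("u", "y"))}),
    V=frozenset({"u", "v"}),
    L=(frozenset({"x", "y"}),),
)

# Zero aspects (d = 0): the product over no sets yields a single empty layer,
# so each node-layer is just (u,) and the structure is an ordinary graph.
mono = MultilayerNetwork(
    V_M=frozenset({("u",), ("v",)}),
    E_M=frozenset({(("u",), ("v",))}),
    V=frozenset({"u", "v"}),
    L=(),
)
```

In the $$d=0$$ case the node-layers carry no layer information, which matches the remark that $$V_M$$ becomes redundant.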

The first two elements in a multilayer network $$M$$ yield a graph $$G_M=(V_M,E_M)$$, so one can interpret a multilayer network as a graph whose nodes are labelled in a certain way (see Fig. 2(b)). This observation makes it easy to generalize some of the basic concepts from monoplex networks to multilayer networks. For example, we define a *weighted* multilayer network $$M$$ by defining weights for edges in the underlying graph $$G_M$$ (i.e. by mapping each edge of a network to a real number using a function $$w: E_M \rightarrow \mathbb {R}$$), and we say that a multilayer network $$M$$ is undirected (or directed) if the underlying graph $$G_M$$ is undirected (or directed). We also employ the usual convention of disallowing self-edges in the multilayer network by disallowing self-edges in the underlying graph, but this can of course be relaxed. Expressing these concepts directly in terms of the multilayer-network formalism is trivial but a bit cumbersome when allowing an arbitrary number of aspects. To alleviate this problem, we denote the array of elementary layers using a bold typeface: $$(u,\boldsymbol {\alpha }) \equiv (u,\alpha _1,\dots ,\alpha _d)$$. With this notation, we write, for example, that a multilayer network is undirected if $$((u,\boldsymbol {\alpha }),(v,\boldsymbol {\beta })) \in E_M \implies ((v,\boldsymbol {\beta }),(u,\boldsymbol {\alpha })) \in E_M$$; and we disallow self-edges by requiring that $$((u,\boldsymbol {\alpha }),(u,\boldsymbol {\alpha })) \notin E_M$$.
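Because these properties are defined on the underlying graph $$G_M$$, they can be checked directly on the edge set. The sketch below assumes that $$E_M$$ is stored as a Python set of ordered pairs of node-layer tuples and that weights are stored in a dictionary; the function names and the data are hypothetical.

```python
def is_undirected(E_M):
    """True if every edge ((u, a), (v, b)) has its reverse in E_M."""
    return all((b, a) in E_M for a, b in E_M)


def has_self_edges(E_M):
    """True if some node-layer is adjacent to itself."""
    return any(a == b for a, b in E_M)


# Hypothetical example with one aspect and layers "x" and "y".
E_M = {
    (("u", "x"), ("v", "x")),
    (("v", "x"), ("u", "x")),
    (("u", "x"), ("u", "y")),   # reverse edge is missing
}
directed_result = is_undirected(E_M)

E_M_sym = E_M | {(("u", "y"), ("u", "x"))}
undirected_result = is_undirected(E_M_sym)

# A weight function w: E_M -> R, here as a dict with unit weights (an assumption).
w = {edge: 1.0 for edge in E_M_sym}
```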

It is both typical and convenient to use different semantics for edges that cross layers than for edges that stay within a single layer. It is thus often useful to partition the set of edges into *intra-layer edges* $$E_A=\{ ((u,\boldsymbol {\alpha }),(v,\boldsymbol {\beta })) \in E_M | \boldsymbol {\alpha }=\boldsymbol {\beta } \}$$ and *inter-layer edges* $$E_{C}=E_M \setminus E_A$$ (see, e.g. Refs. [82,137,140], and Fig. 2 for an example). We also define *coupling edges* $$E_{\skew {3}\tilde {C}} \subseteq E_{C}$$ as edges for which the two nodes represent the same entity in different layers: $$E_{\skew {3}\tilde {C}}=\{((u,\boldsymbol {\alpha }),(v,\boldsymbol {\beta })) \in E_C | u=v\}$$. We thereby define an *intra-layer graph* $$G_A=(V_M,E_A)$$, an *inter-layer graph* $$G_{C}=(V_M,E_C)$$ and a *coupling graph* $$G_{\skew {3}\tilde {C}}=(V_M,E_{\skew {3}\tilde {C}})$$.^{11}
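The three edge sets follow directly from their definitions. The following sketch assumes that each node-layer is a tuple whose first entry is the node and whose remaining entries are the elementary layers; the function name `split_edges` and the example edges are hypothetical.

```python
def split_edges(E_M):
    """Return (E_A, E_C, E_C_tilde) for a set of node-layer pairs."""
    # Intra-layer edges: both endpoints lie in the same layer.
    E_A = {(a, b) for a, b in E_M if a[1:] == b[1:]}
    # Inter-layer edges: everything else.
    E_C = E_M - E_A
    # Coupling edges: inter-layer edges that join a node to itself.
    E_C_tilde = {(a, b) for a, b in E_C if a[0] == b[0]}
    return E_A, E_C, E_C_tilde


E_M = {
    (("u", "x"), ("v", "x")),   # intra-layer
    (("u", "x"), ("u", "y")),   # inter-layer and coupling
    (("u", "x"), ("v", "y")),   # inter-layer, not coupling
}
E_A, E_C, E_C_tilde = split_edges(E_M)
```

By construction, `E_A` and `E_C` partition `E_M`, and `E_C_tilde` is a subset of `E_C`.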

As we discuss later, we can obtain existing notions of multilayer networks (and similar objects) from the literature by applying various constraints to our general framework. We now give names to these constraints. We say that a multilayer network is *node-aligned*^{12} (or ‘fully interconnected’) if all of the layers contain all nodes: $$V_M = V \times L_1 \times \dots \times L_d$$. A multilayer network is *layer-disjoint* if each node exists in at most one layer: $$(u,\boldsymbol {\alpha }), (u,\boldsymbol {\beta }) \in V_M \implies \boldsymbol {\alpha }=\boldsymbol {\beta }$$. We say that couplings are *diagonal* if all of the inter-layer edges are between nodes and their counterparts in other layers: $$E_{\skew {3}\tilde {C}}=E_C$$. We say that a diagonal multilayer network is *layer-coupled* if the coupling edges and their weights are independent of the nodes [83]: if $$((u,\boldsymbol {\alpha }),(u,\boldsymbol {\beta })) \in E_C$$ and $$(v,\boldsymbol {\alpha }),(v,\boldsymbol {\beta }) \in V_M$$, then $$((v,\boldsymbol {\alpha }),(v,\boldsymbol {\beta })) \in E_C$$ and $$w(((u,\boldsymbol {\alpha }),(u,\boldsymbol {\beta })))=w(((v,\boldsymbol {\alpha }),(v,\boldsymbol {\beta })))$$ for all $$u,v,\boldsymbol {\alpha },\boldsymbol {\beta }$$. That is, for any two layers, the coupling is the same for all nodes (so it depends only on the layers). The couplings are *categorical* if each node is adjacent to all of its counterparts in the other layers: $$(u,\boldsymbol {\alpha }),(u,\boldsymbol {\beta }) \in V_M \implies ((u,\boldsymbol {\alpha }),(u,\boldsymbol {\beta })) \in E_M$$.

We say that couplings are categorical with respect to a single aspect if each node is adjacent to all of its counterparts in layers that only differ in that aspect. When a network is node-aligned, we denote the number of entities (which are represented by nodes) by $$n=|V|$$. Note that the number of nodes in each layer can be equal to each other (i.e. the layers have the same size) even if a network is not node-aligned. In this case, the node identities in the different layers are different. When $$d=1$$, there is a single additional aspect beyond an ordinary monoplex network, so we denote the set of layers by $$L=L_1$$ and the number of layers by $$b=|L|$$. Note that categorically-coupled multilayer networks are a subset of layer-coupled networks, which are in turn a subset of diagonal networks. We summarize the constraints that we have just discussed in the Glossary (see Appendix), and we illustrate these constraints with an example multilayer network in Fig. 3(a).
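Several of these constraints can be tested mechanically. The sketch below is one possible reading of the definitions, under two stated assumptions: an undirected network stores each edge in both directions, and the categorical condition is applied only to distinct layers (so that it does not demand self-edges). The function names and example data are hypothetical.

```python
from itertools import product


def is_node_aligned(V_M, V, L):
    """All layers contain all nodes: V_M = V x L_1 x ... x L_d."""
    return V_M == {(u, *alpha) for u in V for alpha in product(*L)}


def is_layer_disjoint(V_M):
    """Each node exists in at most one layer."""
    nodes = [nl[0] for nl in V_M]
    return len(nodes) == len(set(nodes))


def is_diagonal(E_M):
    """Every inter-layer edge joins a node to its own counterpart."""
    return all(a[0] == b[0] for a, b in E_M if a[1:] != b[1:])


def is_categorical(V_M, E_M):
    """Each node-layer is adjacent to all counterparts in other layers."""
    return all((a, b) in E_M
               for a in V_M for b in V_M
               if a[0] == b[0] and a != b)


# Hypothetical two-node, two-layer example (one aspect).
V, L = {"u", "v"}, [{"x", "y"}]
V_M = {(u, *alpha) for u in V for alpha in product(*L)}
E_M = set()
for a, b in [(("u", "x"), ("v", "x")),    # intra-layer edge
             (("u", "x"), ("u", "y")),    # coupling edge for u
             (("v", "x"), ("v", "y"))]:   # coupling edge for v
    E_M |= {(a, b), (b, a)}               # store both directions

checks = (is_node_aligned(V_M, V, L), is_layer_disjoint(V_M),
          is_diagonal(E_M), is_categorical(V_M, E_M))
```

In this example the network is node-aligned, diagonal and categorical but not layer-disjoint, because both nodes appear in both layers.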

### Tensor representations

It is often convenient to represent ordinary graphs (i.e. monoplex networks) using adjacency matrices. For a node-aligned multilayer network, we represent the analogous structure using a rank-$$2(d+1)$$ [i.e. order-$$2(d+1)$$] adjacency tensor^{13} [67] $$\mathcal {A} \in \{ 0,1 \} ^{|V| \times |V| \times |L_1| \times |L_1| \times \dots \times |L_d| \times |L_d| }$$. We use the simplified notation^{14} $$\mathcal {A}_{uv\boldsymbol {\alpha }\boldsymbol {\beta }}=\mathcal {A}_{uv\alpha _1\beta _1 \dots \alpha _d\beta _d}$$, where the tensor element $$\mathcal {A}_{uv\boldsymbol {\alpha }\boldsymbol {\beta }}$$ has a value of $$1$$ if and only if $$((u,\boldsymbol {\alpha }),(v,\boldsymbol {\beta })) \in E_M$$; otherwise, $$\mathcal {A}_{uv\boldsymbol {\alpha }\boldsymbol {\beta }}$$ has a value of $$0$$. As a convention, we group indices according to layer type (i.e. according to aspect). Analogous to weighted adjacency matrices in monoplex networks, we can define a weighted adjacency tensor $$\mathcal {W}$$ in which the value of each element corresponds to the weight of an edge (and to $$0$$ when there is no edge).
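For a single aspect ($$d=1$$), the adjacency tensor has order 4 and can be built explicitly. The sketch below uses nested Python lists indexed as `A[u][v][alpha][beta]` to stay dependency-free; an array library would be the more usual choice in practice. The function name, the sorted index order and the example network are assumptions made for illustration.

```python
def adjacency_tensor(V, L1, E_M):
    """Order-4 adjacency tensor of a node-aligned network with d = 1."""
    nodes, layers = sorted(V), sorted(L1)
    n_idx = {u: i for i, u in enumerate(nodes)}
    l_idx = {a: i for i, a in enumerate(layers)}
    n, b = len(nodes), len(layers)
    # Shape |V| x |V| x |L_1| x |L_1|, initialised to 0.
    A = [[[[0] * b for _ in range(b)] for _ in range(n)] for _ in range(n)]
    for (u, alpha), (v, beta) in E_M:
        A[n_idx[u]][n_idx[v]][l_idx[alpha]][l_idx[beta]] = 1
    return A, nodes, layers


# Hypothetical undirected example, stored with both edge directions.
E_M = set()
for a, b in [(("u", "x"), ("v", "x")),
             (("u", "x"), ("u", "y")),
             (("v", "x"), ("v", "y"))]:
    E_M |= {(a, b), (b, a)}

A, nodes, layers = adjacency_tensor({"u", "v"}, {"x", "y"}, E_M)
```

A weighted tensor $$\mathcal {W}$$ could be obtained in the same way by writing the edge weight instead of `1`.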

Technically, the above tensor representation is only appropriate for a multilayer network that is node-aligned. However, as discussed in Ref. [67], many of the tensor-based methods for network analysis can still be employed by adding extraneous nodes (which we will call *empty nodes*) that are not adjacent to any other nodes. In Table 1, we mark ‘multislice’ [66], ‘multilayer’ [67] and some types of multiplex networks with the symbol $$\dagger $$ to illustrate this important point: although these frameworks technically require a network to be node-aligned, they have been used explicitly (by ‘padding’ them with empty nodes) to consider networks in which it is *not* the case that all nodes are shared between all layers.^{15} Such ‘padding’ yields a structure that is node-aligned from a mathematical perspective, but one has to be very careful in practice when studying the resulting tensors. From a technical perspective, appending empty nodes makes it possible to use these adjacency tensors as inputs in other frameworks—e.g. this has been very successful for studies of community structure in the type of multilayer network known as ‘multislice networks’ [66,101] (and sometimes also called ‘multilayer networks’ in subsequent papers [103])—but it can lead to highly misleading results when computing network diagnostics, such as mean degree or clustering coefficients, unless one accounts for the presence of empty nodes in an appropriate way. One must be cautious. In theory, one could also represent layer-disjoint networks using adjacency tensors by renaming (or enumerating) the nodes so that their labels start from 1 in each layer and padding using empty nodes if the layers are not of equal size. We discuss this type of renaming in Section 2.4 and illustrate it in Fig. 5.
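A small numerical illustration of this caveat, using hypothetical data: node `"c"` is absent from layer `"y"`, and padding adds the empty node-layer `("c", "y")`. The mean degree over node-layers changes even though no edge has been added or removed.

```python
def mean_degree(node_layers, undirected_edges):
    """Mean degree over the given node-layers (each edge listed once)."""
    degree = {nl: 0 for nl in node_layers}
    for a, b in undirected_edges:
        degree[a] += 1
        degree[b] += 1
    return sum(degree.values()) / len(degree)


# Node-layers that are actually present (V_M).
V_M = {("a", "x"), ("b", "x"), ("c", "x"), ("a", "y"), ("b", "y")}
edges = [(("a", "x"), ("b", "x")),
         (("b", "x"), ("c", "x")),
         (("a", "y"), ("b", "y"))]

# Padding with an empty node makes the network node-aligned.
V_M_padded = V_M | {("c", "y")}

true_mean = mean_degree(V_M, edges)            # 6 / 5
padded_mean = mean_degree(V_M_padded, edges)   # 6 / 6
```

Here the padded value understates the mean degree of the nodes that are actually present, which is the kind of distortion that has to be corrected for when empty nodes are used.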

#### Constraints.

We now formulate some of the multilayer network constraints that we defined in the previous section using the above tensorial framework. We assume that the multilayer network is node-aligned. Constraints force some of the tensor elements to be $$0$$ and make it possible to represent constrained multilayer networks using tensors with a lower rank than would otherwise be necessary. If a multilayer network has diagonal couplings—i.e. if inter-layer edges are only allowed between two representations of the same node—then the adjacency-tensor element $$\mathcal {A}_{uv \boldsymbol {\alpha }\boldsymbol {\beta }}=0$$ if the two node indices differ from each other (i.e. if $$u \neq v$$) and the layers are not the same (i.e. if $$\boldsymbol {\alpha } \neq \boldsymbol {\beta }$$). With this restriction, one can express a multilayer network as a combination of an intra-layer adjacency tensor with elements $$\mathcal {A}_{uv \boldsymbol {\alpha }}=\mathcal {A}_{uv \boldsymbol {\alpha }\boldsymbol {\alpha }}$$ and a coupling tensor with elements $$\mathcal {C}_{u \boldsymbol {\alpha }\boldsymbol {\beta }}=\mathcal {A}_{uu \boldsymbol {\alpha }\boldsymbol {\beta }}$$. This is equivalent to the ‘multislice’ formulation in Ref. [66] and its sequels.
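The reduction for diagonal couplings can be sketched with a sparse representation in which the adjacency tensor is a dictionary that maps index tuples $$(u,v,\alpha ,\beta )$$ to nonzero entries (here for $$d=1$$). The function name `split_diagonal` and the data are hypothetical, and only the off-diagonal layer pairs ($$\alpha \neq \beta $$) of the coupling tensor are stored, because self-edges are disallowed.

```python
def split_diagonal(A):
    """Split a sparse order-4 tensor into intra-layer and coupling parts."""
    # Diagonal couplings: no nonzero entry with u != v and alpha != beta.
    if any(u != v and a != b for (u, v, a, b) in A):
        raise ValueError("couplings are not diagonal")
    # Intra-layer tensor: A_{uv alpha} = A_{uv alpha alpha}.
    intra = {(u, v, a): x for (u, v, a, b), x in A.items() if a == b}
    # Coupling tensor: C_{u alpha beta} = A_{uu alpha beta}.
    coupling = {(u, a, b): x for (u, v, a, b), x in A.items() if a != b}
    return intra, coupling


# Hypothetical nonzero entries of an undirected two-node, two-layer network.
A = {
    ("u", "v", "x", "x"): 1, ("v", "u", "x", "x"): 1,
    ("u", "u", "x", "y"): 1, ("u", "u", "y", "x"): 1,
    ("v", "v", "x", "y"): 1, ("v", "v", "y", "x"): 1,
}
intra, coupling = split_diagonal(A)
```

The two smaller tensors together carry the same information as the original order-4 tensor, which is the point of the reduction.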

If one disallows inter-aspect couplings in a diagonal multilayer network, then the adjacency tensor also has elements equal to $$0$$ when the layer indices differ in more than one aspect; that is, $$\mathcal {A}_{uv \boldsymbol {\alpha }\boldsymbol {\beta }}=0$$ if either $$u \neq v$$ and $$\boldsymbol {\alpha } \neq \boldsymbol {\beta }$$, or if $$u = v$$ but $$\boldsymbol {\alpha }$$ and $$\boldsymbol {\beta }$$ differ in more than one element. (Recall that an element of $$\boldsymbol {\alpha }$$ corresponds to one aspect.) One can now represent the multilayer network using an intra-layer adjacency tensor with elements $$\mathcal {A}_{uv \boldsymbol {\alpha }}$$ and a sequence of ‘reduced’ coupling tensors (one for each aspect $$a$$) with elements $$C_{u\boldsymbol {\alpha }\beta }^a=\mathcal {A}_{uu\alpha _1 \alpha _1 \ldots \alpha _{a-1}\alpha _{a-1}\alpha _a\beta \alpha _{a+1}\alpha _{a+1} \ldots \alpha _d \alpha _d}$$. If $$d=1$$, then there is no difference between the reduced coupling tensor and the original coupling tensor (i.e. $$\mathcal {C}_{u \alpha _1\beta _1}=C_{u\alpha _1\beta _1}^1$$).

If there are no inter-aspect couplings and the coupling strengths only depend on the two elementary layers for each aspect, then the coupling tensor reduces even further to $$\hat {C}_{u\alpha \beta }^a=\mathcal {A}_{uu\gamma _1 \gamma _1 \ldots \gamma _{a-1}\gamma _{a-1}\alpha \beta \gamma _{a+1}\gamma _{a+1} \ldots \gamma _d \gamma _d}$$ for any $$\boldsymbol {\gamma } = (\gamma _1, \ldots , \gamma _d)$$. In layer-coupled multilayer networks (which are also necessarily diagonal, by definition), the couplings are independent of the node: $$\mathcal {A}_{uu \boldsymbol {\alpha }\boldsymbol {\beta }}=\mathcal {A}_{vv \boldsymbol {\alpha }\boldsymbol {\beta }}$$ for all $$u,v,\boldsymbol {\alpha },\boldsymbol {\beta }$$. This makes it possible to drop the node indices in the coupling tensors and write $$\mathcal {C}_{\boldsymbol {\alpha }\boldsymbol {\beta }}=\mathcal {C}_{u \boldsymbol {\alpha }\boldsymbol {\beta }}$$, $$C_{\boldsymbol {\alpha }\beta }^a=C_{u\boldsymbol {\alpha }\beta }^a$$ and $$\hat {C}_{\alpha \beta }^a=\hat {C}_{u\alpha \beta }^a$$ for any $$u$$. If such a multilayer network has no couplings (respectively, categorical couplings), then all of the coupling-tensor elements have the value $$\mathcal {C}_{\boldsymbol {\alpha }\boldsymbol {\beta }}=0$$ (respectively, $$\mathcal {C}_{\boldsymbol {\alpha }\boldsymbol {\beta }}=1$$). The network is then completely defined by the intra-layer adjacency tensor, which has elements $$\mathcal {A}_{uv \boldsymbol {\alpha }}$$. If $$d=1$$, this yields a rank-3 tensor, which has been used previously to represent this type of multilayer network [88,89,143,144].

#### Tensor flattening.

One can reduce the number of aspects of an adjacency tensor by combining aspects $$i$$ and $$j$$ into a new aspect $$h$$. That is, we map a node-aligned multilayer network $$M=(V_M,E_M,V,\textbf {L})$$ with $$d$$ aspects into a new multilayer network $$M^\prime =(V_{M^\prime },E_{M^\prime },V,\textbf {L}^\prime)$$ with $$d -1$$ aspects, such that the total number of elements in the corresponding tensors is retained: $$L_h^\prime =L_i \times L_j$$, so $$|L_h^\prime |=|L_i| |L_j|$$. There is a bijective mapping between the elements of the old and new tensors, which is why we have noted that the essential ‘new physics’ in multilayer networks already occurs when there is a single aspect. The above mapping process is often called *flattening* and is sometimes also known as ‘unfolding’ or ‘matricization’. Without loss of generality, suppose that one flattens aspects $$d-1$$ and $$d$$ of an adjacency tensor $$\mathcal {A}$$ to obtain a new tensor $$\mathcal {A}^\prime $$. Again, without loss of generality, we denote the layers using integers starting from $$1$$. With this labelling convention, the corresponding mapping is $$\mathcal {A}_{uv\alpha _1\beta _1 \ldots \alpha _d \beta _d}=\mathcal {A}^\prime _{uv \alpha _1 \beta _1 \ldots ((\alpha _{d-1}-1)|L_d|+\alpha _d)((\beta _{d-1}-1)|L_d|+\beta _d)}$$. To reduce the number of aspects further, one can again apply such a mapping (and it can be desirable to continue this process until one obtains a tensor with $$d = 1$$ aspects).
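The index arithmetic in the flattening map is easy to check numerically. The sketch below implements the map $$(\alpha _{d-1},\alpha _d) \mapsto (\alpha _{d-1}-1)|L_d|+\alpha _d$$ with layers labelled by integers that start from $$1$$, together with its inverse; the function names and the layer sizes are illustrative assumptions.

```python
def flatten_index(alpha_prev, alpha_last, size_last):
    """Map 1-based layer indices (alpha_{d-1}, alpha_d) to a single 1-based index."""
    return (alpha_prev - 1) * size_last + alpha_last


def unflatten_index(index, size_last):
    """Inverse of flatten_index: recover (alpha_{d-1}, alpha_d)."""
    alpha_prev, alpha_last = divmod(index - 1, size_last)
    return alpha_prev + 1, alpha_last + 1


size_prev, size_last = 3, 4  # |L_{d-1}| and |L_d| (toy sizes)
pairs = [(a, b) for a in range(1, size_prev + 1) for b in range(1, size_last + 1)]
flat = [flatten_index(a, b, size_last) for a, b in pairs]

# Every index in 1..|L_{d-1}||L_d| is hit exactly once, so the map is a bijection.
assert sorted(flat) == list(range(1, size_prev * size_last + 1))
```

The same function is applied to both the $$\alpha $$ and the $$\beta $$ indices of a tensor element, so flattening only relabels elements and does not change their values.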

Flattening tensors can be useful conceptually, and it is also often very convenient when writing software implementations of algorithms [43,145]. The representation of multilayer networks using supra-adjacencies (see Refs. [136,81,137,146] and Section 2.3) is an extreme case of tensor flattening in which one constructs a network with $$d = 0$$ aspects by combining all of the layer aspects and node indices to obtain additional node indices. See Fig. 4(b) for an example of a supra-adjacency matrix.

### Supra-adjacency representation

The adjacency-matrix representation for monoplex networks is powerful because one can exploit the numerous tools, methods and theoretical results that have been developed for matrices. To get access to these tools for investigations of multilayer networks, one can represent such networks using supra-adjacency matrices [136,81,137,146]. As illustrated in Fig. 2, the adjacency matrix of a ‘supra-graph’ corresponds to the supra-adjacency matrix of a multilayer network. (Other ‘supra-matrices’ are defined analogously.) A supra-adjacency representation has already yielded interesting insights for processes such as diffusion [136,81], epidemic spreading [119,147] and synchronizability [81]. In such studies, it is often helpful to investigate the properties of so-called ‘supra-Laplacian matrices’. Supra-adjacency matrices are also convenient for describing walks on multilayer networks [137]. An additional advantage of supra-adjacency matrices over adjacency tensors is that they provide a natural way to represent multilayer networks that are not node-aligned without having to append empty nodes. However, this boon comes with a cost: one must flatten a multilayer network to obtain a supra-adjacency matrix and one thereby loses some of the information about the aspects. Partitioning a network's edge set into intra-layer edges, inter-layer edges and coupling edges makes it possible to retain some of this information. Supra-adjacency matrices, intra-layer supra-adjacency matrices, inter-layer supra-adjacency matrices and coupling supra-adjacency matrices are the adjacency matrices that correspond, respectively, to the graphs $$G_M$$, $$G_A$$, $$G_C$$ and $$G_{\skew {3}\tilde {C}}$$ that we defined in Section 2.1. Alternatively, for node-aligned multilayer networks, one can start from the tensor $$\mathcal {A}$$ and then use the flattening process discussed in Section 2.2.2 to transform it into a matrix.
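As a minimal sketch of this construction, the code below builds a supra-adjacency matrix directly from a list of node-layers for a small network that is not node-aligned, so no empty nodes are needed. It also separates the intra-layer and inter-layer parts. The toy data, the ordering of node-layers and the helper `supra_matrix` are our own assumptions.

```python
# Node-layers of a network that is not node-aligned: node "c" exists only in layer 1.
node_layers = [("a", 1), ("b", 1), ("c", 1), ("a", 2), ("b", 2)]
edges = [
    (("a", 1), ("b", 1)),  # intra-layer
    (("b", 1), ("c", 1)),  # intra-layer
    (("a", 2), ("b", 2)),  # intra-layer
    (("a", 1), ("a", 2)),  # coupling (same node, different layers)
    (("b", 1), ("b", 2)),  # coupling
]
index = {nl: i for i, nl in enumerate(node_layers)}


def supra_matrix(edge_list, keep=lambda edge: True):
    """Symmetric 0/1 matrix over node-layers for the edges selected by keep."""
    n = len(node_layers)
    matrix = [[0] * n for _ in range(n)]
    for s, t in edge_list:
        if keep((s, t)):
            matrix[index[s]][index[t]] = 1
            matrix[index[t]][index[s]] = 1
    return matrix


supra = supra_matrix(edges)
intra = supra_matrix(edges, keep=lambda e: e[0][1] == e[1][1])  # same layer
inter = supra_matrix(edges, keep=lambda e: e[0][1] != e[1][1])  # different layers
```

With this ordering (all node-layers of layer 1 first), the intra-layer matrix is block diagonal and the supra-adjacency matrix is the sum of the intra-layer and inter-layer matrices.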

One can derive a supra-Laplacian matrix from a supra-adjacency matrix in a manner that is analogous to the way that one derives a Laplacian matrix from the adjacency matrix of a monoplex graph. For example, the combinatorial supra-Laplacian matrix is $$\textbf {L}_{\textbf {M}} = \textbf {D}_{\textbf {M}} - \textbf {A}_{\textbf {M}}$$, where $$\textbf {D}_{\textbf {M}}$$ is the diagonal supra-matrix that has node-layer strengths (i.e. weighted degrees) along the diagonal and $$\textbf {A}_{\textbf {M}}$$ denotes the supra-adjacency matrix that corresponds to the graph $$G_M$$. Hence, each diagonal entry of the supra-Laplacian $$\textbf {L}_{\textbf {M}}$$ consists of the sum of the corresponding row in the supra-adjacency matrix $$\textbf {A}_{\textbf {M}}$$, and each non-diagonal element of $$\textbf {L}_{\textbf {M}}$$ consists of the corresponding element of $$\textbf {A}_{\textbf {M}}$$ multiplied by $$-1$$. The eigenvalues and eigenvectors of this supra-Laplacian are important indicators of several structural features of the corresponding network, and they also give crucial insights into dynamical processes that evolve on top of it [148,136,141] (see Section 4.6.3). The second smallest eigenvalue and the eigenvector associated to it, which are sometimes called (respectively) the ‘algebraic connectivity’ and ‘Fiedler vector’ of the corresponding network, are very important diagnostics for the structure of a network. For example, the algebraic connectivity of a multilayer network with categorical couplings^{16} has two distinct regimes when examined as a function of the relative strengths of the inter-layer edges and the intra-layer edges [148]. Additionally, there is a discontinuous (i.e. first-order) phase transition — a so-called ‘structural transition’—between the two regimes. In one regime, the algebraic connectivity is independent of the intra-layer adjacency structure, so it is determined by the inter-layer edges. 
In the other, the algebraic connectivity of the multilayer network is bounded above by a constant multiplied by the algebraic connectivity of the unweighted superposition (see Section 4.1) of the layers. Combinatorial supra-Laplacian matrices have also been used to study a diffusion process on multiplex networks [136]. (See Section 4.6.3 for more on diffusion processes on multiplex networks.) Radicchi [141] studied the spectrum of the normalized supra-Laplacian matrix $${\hat {\textbf {L}}_{\textbf {M}}} = \textbf {I} - \textbf {D}_{\textbf {M}}^{-1/2} \textbf {A}_{\textbf {M}} \textbf {D}_{\textbf {M}}^{-1/2}$$ (where **I** is an identity matrix) on two-layer interconnected networks that were generated using a generalized configuration model that includes correlations between intra-layer degrees and inter-layer degrees (see Section 4.4). Similar to Ref. [148], Radicchi varied the relative strengths of the inter-layer edges and the intra-layer edges, and he observed qualitatively different behaviour for the eigen-spectrum of the normalized supra-Laplacian for different values of the relative strengths.
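The combinatorial supra-Laplacian is straightforward to compute from a supra-adjacency matrix. The sketch below does this for a toy two-layer network with two nodes per layer, unit intra-layer weights and a hypothetical inter-layer weight `p`; the helper names are assumptions. For this particular toy example, the vector that takes opposite signs on the two layers is an eigenvector with eigenvalue $$2p$$, which illustrates a situation in which a small eigenvalue is set by the inter-layer edges alone.

```python
def supra_laplacian(adjacency):
    """Combinatorial supra-Laplacian D - A of a weighted supra-adjacency matrix."""
    n = len(adjacency)
    laplacian = [[-adjacency[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        # Diagonal entry: node-layer strength (row sum) minus any self-edge weight.
        laplacian[i][i] = sum(adjacency[i]) - adjacency[i][i]
    return laplacian


def matvec(matrix, vector):
    """Ordinary matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


p = 0.5  # hypothetical inter-layer coupling weight
# Node-layer order: (1, x), (2, x), (1, y), (2, y).
A = [
    [0, 1, p, 0],
    [1, 0, 0, p],
    [p, 0, 0, 1],
    [0, p, 1, 0],
]
L = supra_laplacian(A)
v = [1, 1, -1, -1]  # constant on each layer, opposite signs across layers
Lv = matvec(L, v)   # equals 2 * p * v for this toy example
```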

All calculations that one does using supra-adjacency matrices can also be done using adjacency tensors by defining a tensor multiplication that mimics the supra-adjacency multiplication. To see this, we define a *flattening function* $$f: \mathbb {R}^{|V| \times |V| \times |L_1| \times |L_1| \times \cdots \times |L_d| \times |L_d|} \rightarrow \mathbb {R}^{\left (|V| \prod _{a=1}^d |L_a|\right ) \times \left (|V| \prod _{a=1}^d |L_a|\right )}$$ that maps weighted adjacency tensors to weighted supra-adjacency matrices. The function $$f$$ is bijective, its inverse $$f^{-1}$$ is well defined [$$\mathcal {A}=f^{-1}(f(\mathcal {A}))$$] and both $$f$$ and $$f^{-1}$$ are linear. We can now define a tensor multiplication $$\times _f$$ for the class of tensors that are spanned by the adjacency tensors of the multilayer networks: $$\mathcal {A} \times _f \mathcal {B} = f^{-1}(f(\mathcal {A}) \cdot f(\mathcal {B}))$$, where $$\cdot $$ is ordinary matrix multiplication. One can trivially (and usefully) extend the flattening function $$f$$ and the tensor multiplication $$\times _f$$ to include vectors in the vector space in which the supra-adjacency matrices operate. This makes it possible, for example, to calculate the number of walks of length $$N$$ that start from node-layer $$(u, \boldsymbol {\alpha })$$ and end at node-layer $$(v, \boldsymbol {\beta })$$ in the multilayer network using the formula $$(\underbrace {\mathcal {A} \times _f \cdots \times _f \mathcal {A}}_{N})_{uv\boldsymbol {\alpha }\boldsymbol {\beta }}$$.
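The following sketch illustrates the product $$\times _f$$ for $$d=1$$ by counting walks of length 2 in a toy network with two nodes and two layers. It assumes a sparse dictionary tensor with keys `(u, v, alpha, beta)` and one particular ordering of node-layers for the flattening; other orderings work equally well as long as `flatten` and `unflatten` use the same one. All names are illustrative.

```python
from itertools import product

nodes, layers = [0, 1], [0, 1]
node_layers = list(product(layers, nodes))  # (alpha, u) pairs in row/column order
position = {nl: i for i, nl in enumerate(node_layers)}


def flatten(tensor):
    """f: sparse tensor {(u, v, alpha, beta): w} -> dense supra-adjacency matrix."""
    n = len(node_layers)
    matrix = [[0] * n for _ in range(n)]
    for (u, v, a, b), w in tensor.items():
        matrix[position[(a, u)]][position[(b, v)]] = w
    return matrix


def unflatten(matrix):
    """f^{-1}: dense matrix -> sparse tensor (zero entries are omitted)."""
    return {(u, v, a, b): matrix[position[(a, u)]][position[(b, v)]]
            for (a, u), (b, v) in product(node_layers, repeat=2)
            if matrix[position[(a, u)]][position[(b, v)]] != 0}


def matmul(x, y):
    """Ordinary matrix multiplication for lists of lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def tensor_power(tensor, n):
    """A x_f ... x_f A with n factors, computed as f^{-1}(f(A)^n)."""
    base = flatten(tensor)
    result = base
    for _ in range(n - 1):
        result = matmul(result, base)
    return unflatten(result)


# Undirected edge 0-1 in each layer, plus diagonal couplings for both nodes.
A = {}
for a in layers:
    A[(0, 1, a, a)] = A[(1, 0, a, a)] = 1
for u in nodes:
    A[(u, u, 0, 1)] = A[(u, u, 1, 0)] = 1

walks2 = tensor_power(A, 2)  # walks2[(u, v, alpha, beta)] counts walks of length 2
```

The supra-graph of this example is a cycle of four node-layers, so there are two walks of length 2 from each node-layer to itself and to the opposite node-layer, and none to a neighbouring node-layer.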

### Node-coloured networks, interconnected networks, interdependent networks and networks of networks

Discussions of node-coloured networks and similar structures have concentrated mostly on spreading processes, cascading failures and network models (see Section 4). In this section, we present the basic structural notions for these types of multilayer networks.

In *interdependent networks*, nodes in two or more monoplex networks are adjacent to each other via edges that are called *dependency edges*. For example, one can construe an electrical grid and a computer network as a pair of interdependent networks, as the proper function of a router in the computer network can depend on a power station and vice versa [57]. Similarly, *interconnected networks*, *interacting networks* and *networks of networks* are sets of networks in which some of the nodes from the various networks are adjacent to each other, but the edges that connect different networks need not indicate dependency relations [118,119,56,129,126]. If the connections in interdependent networks and similar structures are limited in a certain way, then there is a relationship between them and multiplex networks (see the discussion at the end of Section 2.5). In *multitype networks* and *heterogeneous networks* [49,50], all of the nodes are labelled with some ‘type’ and they can be adjacent to nodes that are labelled with either the same or a different type. For example, the nodes in social multitype networks might be labelled with demographic characteristics such as sex, age and ethnic group [64,115]. In Ref. [133], *heterogeneous information networks* were defined as graphs in which each node has a distinct type.^{17} In the types of multilayer networks that we have discussed in this paragraph, one can also think of each layer as a module (i.e. a community). This can be convenient for purposes like defining random-graph ensembles [149].

The multilayer networks that we have discussed above are equivalent to *node-coloured networks* [115,65], although these various frameworks were formulated with different ideas in mind. (Note that we are using the word ‘colour’ in a very general sense; in particular, two nodes of the same colour *are* allowed to be adjacent. This type of ‘colour’ is really a label, which is a usage along the lines of role-assignment problems [150].) Node-coloured networks are graphs in which each node has exactly one colour: $$G_c=(V_c,E_c,C,\chi)$$, where $$V_c$$ and $$E_c$$ are the nodes and edges, $$C$$ is the set of possible ‘colours’ (where each colour is a possible categorical label for the nodes), and $$\chi : V_c \to C$$ is a function that indicates the colour of each node. For multitype networks and heterogeneous networks, the mapping to node-coloured networks is obvious, as each type is now called a ‘colour’. For interdependent networks and networks of networks (and related frameworks), one needs to map the networks into a flattened graph and then assign colours to nodes according to the subnetwork to which each node belongs.

One can represent node-coloured graphs using our multilayer-network framework with $$d=1$$ by considering each layer as a colour. That is, we let $$V=V_c$$, $$L=C$$, $$V_M=\{ (u,c) \in V \times L | \chi (u)=c \} $$ and $$E_M=\{ ((u,c_1),(v,c_2)) \in V_M \times V_M | (u,v) \in E_c \} $$. See Fig. 5 for an example of such a mapping. Because each node has only a single colour, this multilayer network is layer-disjoint, but it does not have any further restrictions (i.e. constraints). Alternatively, if it is not important to preserve the node names, one can choose to rename the nodes such that the nodes in each layer start from 1. To do this, we define a function $$\Upsilon : V_c \rightarrow \{ 1, \dots ,n_c \}$$ that names each node using an integer between 1 and the maximum number of nodes $$n_c$$ with the same colour in such a way that two distinct nodes of the same colour do not share the same name [i.e. $$\Upsilon (u)=\Upsilon (v) \implies \chi (u) \neq \chi (v)$$ for $$u \neq v$$]. It then follows that $$V=\{ 1, \dots ,n_c \} $$, $$L=C$$, $$V_M=\{ (\Upsilon (u),c) \in V \times L | \chi (u)=c, u \in V_c \} $$ and $$E_M=\{ ((\Upsilon (u),c_1),(\Upsilon (v),c_2)) \in V_M \times V_M | (u,v) \in E_c \} $$. This mapping is illustrated in Fig. 5(c). For node-coloured networks that are constrained in a certain way, it is possible to do such a mapping in a manner that guarantees that the resulting multilayer network has diagonal couplings. (This is useful, for example, for studies of interdependent networks [57,124,126,151–153].) See the discussion at the end of Section 2.5.
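Both mappings can be written out in a few lines. The sketch below uses a toy node-coloured graph (two routers and one power station, which is our own hypothetical example) and builds $$V_M$$ and $$E_M$$ with the original node names and with one possible renaming function $$\Upsilon $$ that numbers the nodes of each colour in order of appearance.

```python
# Node-coloured graph G_c = (V_c, E_c, C, chi); toy data.
V_c = ["r1", "r2", "p1"]  # two routers and one power station
E_c = [("r1", "r2"), ("r1", "p1")]
chi = {"r1": "computer", "r2": "computer", "p1": "power"}

# Mapping that keeps the node names: V_M = {(u, chi(u))}.
V_M = {(u, chi[u]) for u in V_c}
E_M = {((u, chi[u]), (v, chi[v])) for u, v in E_c}

# Mapping that renames the nodes 1, 2, ... within each colour (one possible Upsilon).
counters, upsilon = {}, {}
for u in V_c:
    counters[chi[u]] = counters.get(chi[u], 0) + 1
    upsilon[u] = counters[chi[u]]

V_M_renamed = {(upsilon[u], chi[u]) for u in V_c}
E_M_renamed = {((upsilon[u], chi[u]), (upsilon[v], chi[v])) for u, v in E_c}
```

After the renaming, router `r1` and power station `p1` share the name 1, so the inter-layer edge between them becomes a coupling between two node-layers with the same node index.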

### Multiplex networks and multirelational networks

One can define a *multiplex network*, a *multirelational network* and similar types of multilayer networks as a sequence of graphs [2,83,72,73,75]: $$\{ G_{\alpha } \}_{\alpha =1}^b = \{(V_{\alpha },E_{\alpha })\}_{\alpha =1}^b$$, where $$E_{\alpha } \subseteq V_{\alpha } \times V_{\alpha }$$ is the set of edges and $$\alpha $$ indexes the graphs. Usually, the node sets are the same across the different layers (i.e. $$V_{\alpha }=V_{\beta }$$ for all $$\alpha ,\beta $$), or they at least share some nodes (i.e. $$\bigcap _{\alpha =1}^b V_{\alpha } \neq \emptyset $$). Alternatively, one can define multiplex networks as *edge-coloured multigraphs*, which are networks with multiple types of edges. One defines an edge-coloured multigraph as a triple $$G_e=(V,E,C)$$, where $$V$$ is the node set, $$C$$ is the colour set (which is used for labelling the type of edge) and $$E \subseteq V \times V \times C$$ is the edge set. We use a general definition of a ‘colour’ as a label, so edges that are incident to the same node are allowed to have the same colour. In this definition of edge-coloured multigraphs, a pair of nodes cannot be adjacent to each other via multiple edges of the same colour. One can use edge-coloured multigraphs to represent a sequence of graphs in which all of the node sets are the same (i.e. $$V_\alpha =V_\beta =V$$ for all $$\alpha ,\beta $$) by associating each graph with a unique colour [154,99].

One can map structures that amount to sequences of graphs to our multilayer-network framework by mapping each graph in the sequence to a single intra-layer network. For an edge-coloured multigraph, each edge colour corresponds to a layer in a multilayer network. See Fig. 3 for an example of a multilayer-network representation and its associated edge-coloured multigraph. This type of mapping gives only the intra-layer edges. However, in the literature, it is usually assumed implicitly that nodes are somehow coupled to their counterparts in different layers. Our multilayer framework can incorporate explicit adjacencies of a node with itself across multiple layers, and we use such inter-layer edges to represent the coupling structure of layers (similar to Ref. [101]). Thus far, it has been very common to assume that nodes are adjacent in an identical manner to each of their counterparts in the different layers, which corresponds to categorical inter-layer couplings in the general multilayer-network framework. The other well-known type of coupling is *ordinal* coupling, in which the layers are ordered and nodes are adjacent only to their counterparts in consecutive (‘adjacent’) layers. Ordinal coupling arises, for example, when temporal networks are mapped into a multilayer-network framework. This was discussed several years ago in Ref. [101] (also see the discussion in Ref. [67]), and a similar idea was used very recently in Ref. [155]. Although categorical and ordinal couplings are the most common situations, they are not the only ones. For example, for $$d = 1$$ aspect, one can also represent situations with more general inter-layer couplings either by using fourth-order tensors [67] or by using both block-diagonal and off-block-diagonal third-order tensors [101,103] (see Section 2.2). By employing higher-order tensors, similar representations are possible for any number of aspects.
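The two most common coupling schemes can be sketched as follows for an edge-coloured multigraph $$G_e=(V,E,C)$$ in which every node is present in every layer. The toy data and the function names are assumptions, and the ordering of the colour list matters only for the ordinal case.

```python
from itertools import combinations

V = ["a", "b", "c"]
C = ["friend", "family", "work"]  # the order is used only for ordinal coupling
E = {("a", "b", "friend"), ("b", "c", "family"), ("a", "c", "work")}

# Each edge colour becomes a layer, and each coloured edge becomes an intra-layer edge.
intra_edges = {((u, c), (v, c)) for u, v, c in E}


def categorical_couplings(nodes, layers):
    """Each node is adjacent to all of its counterparts in the other layers."""
    return {((u, c1), (u, c2)) for u in nodes for c1, c2 in combinations(layers, 2)}


def ordinal_couplings(nodes, ordered_layers):
    """Each node is adjacent only to its counterparts in consecutive layers."""
    return {((u, c1), (u, c2)) for u in nodes
            for c1, c2 in zip(ordered_layers, ordered_layers[1:])}


categorical = categorical_couplings(V, C)  # 3 nodes x 3 layer pairs
ordinal = ordinal_couplings(V, C)          # 3 nodes x 2 consecutive pairs
```

Each undirected coupling edge is stored once here; a directed or weighted variant would need an explicit direction or a weight for each pair.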

In the rest of this paper, we will use the term ‘multiplex network’ in a manner that is similar to Ref. [67]: we consider all diagonally coupled multilayer networks in which each layer shares at least one node with some other layer in the network to be multiplex networks. That is, we include multilayer networks that are not node-aligned in our definition of multiplex networks, but we leave out layer-disjoint networks. In several respects, our use of the term ‘multiplex’ is thus more general than its typical usage in the literature thus far (see Table 1). For example, we allow temporal networks (which almost always have ordinal couplings) to be construed as multiplex networks. Importantly, we also do not require all nodes to exist on every layer in a multiplex network. Additionally, this definition of multiplex networks also includes multilayer networks with more than one aspect, because it is natural to represent certain multiplex structures using multilayer networks with more than one type (i.e. aspect) of layer. For example, in *cognitive social structures* [21], one aspect can be used for the set of layers that contain each person's perception of a social network and the other can be used to represent different types of relationships in these social networks. See Fig. 4 for an illustration. Furthermore, as discussed in Section 2.7.1, temporal networks with multiple types of edges can be considered as multiplex networks with two aspects.

An archetypical example of a multiplex network is a social network in which the different layers (i.e. different edge labels) represent different types of social relationships [2,156–158]. For example, one can place friendship ties, family ties and coworker ties in different layers. Other examples include gene co-expression networks, protein interaction networks and transportation networks. In a gene co-expression network, each layer can represent a different tissue type or environment [144]. In a protein-protein interaction network, each layer can include interactions from one of the many possible experimental protocols. There are many types of multiplex transportation networks. For example, one can construct a multiplex air-transportation network whose individual layers contain routes from a single airline [159,68], a shipping network with different types of vessels in different layers [160], or a ground-transportation network in which each layer includes edges from a single mode of transportation. In other types of multiplex transportation networks, such as a city's metropolitan system, each layer corresponds to a different ‘line’ (e.g. the Tube in London has the Circle Line, the District Line and many others) [161,137], though this example includes very different sets of nodes in different layers. See Section 3 for more discussion and examples of real-world multiplex networks.

In theory, one can also map the structures that are represented naturally as multiplex networks to the node-coloured networks that we discussed in Section 2.4. For example, consider a transportation network in which cities are adjacent via railroad lines, airplane routes and shipping routes. For simplicity, let us assume that each city has a single airport, a single train station and a single port. When representing this situation using a node-coloured network, the colours correspond to the three modes of transportation, and the nodes represent the airports, train stations and ports (which are adjacent to each other if they are in the same city). In the multiplex-network representation of this same network, the nodes are the cities, and the three layers represent the three transportation modes. In our multilayer-network representation, these two ways of representing the data are almost but not quite equivalent. The difference is that the multiplex-network representation has node sets that are the same across the layers, whereas the node-coloured-network representation has disjoint node sets. In this type of node-coloured network, any path^{18} whose edges are between nodes of different colours (i.e. inter-layer edges, which correspond to intra-city edges in the above example) cannot contain more than one node of any given colour. That is, in the above example, it is not possible to start from an airport and go to another airport by following only intra-city edges. In undirected multiplex networks with only two layers, this condition is equivalent to enforcing both ‘uniqueness’ and ‘no-feedback’ conditions [126] for networks of networks (i.e. interdependent networks). Several authors have exploited this connection between the two representations to map dynamics on networks of networks to dynamics on multiplex networks [124,151–153] (see Section 4.6.2). Note that some interdependent networks include nodes without any inter-layer edges (these are sometimes called ‘partially interdependent networks’ to distinguish them from ‘fully interdependent networks’; see, e.g. Ref. [151]), and one can use a similar argument to map them to multiplex networks that are not node-aligned.

### Hypergraphs

In a *hypergraph* $$H=(\mathcal {V},\mathcal {E})$$, each *hyperedge* in the hyperedge set $$\mathcal {E}$$ can connect any (positive) number of nodes (rather than being restricted to connecting exactly two nodes, as in a graph without self-edges), so $$\mathcal {E} \subset \mathcal {P}^\prime (\mathcal {V}) \subset \mathcal {P}(\mathcal {V})$$, where $$\mathcal {P}(\mathcal {V})$$ denotes the power set (i.e. set of subsets) of $$\mathcal {V}$$ and $$\mathcal {P}^\prime (\mathcal {V})$$ denotes the set of those subsets whose cardinality is larger than $$1$$. (As with graphs, we are excluding self-edges, which correspond to subsets of cardinality $$1$$. One can, of course, include self-edges if one wants.) One constructs a weighted hypergraph by associating a non-negative number to each hyperedge, and one constructs a directed hypergraph by associating each hyperedge with a sequence of nodes rather than a set of nodes. Any undirected hypergraph can be mapped to a categorically-coupled multilayer network: each hyperedge corresponds to a single layer in which all of the nodes in the hyperedge form a clique [112]. Additionally, note that there exists a mapping between hypergraphs (with the possibility for multiedges) and bipartite graphs (see Section 2.8): one set of nodes in the bipartite graph represents the nodes of the hypergraph, and the other set represents the edges.
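Both mappings in this paragraph can be sketched for a small undirected hypergraph. The code below creates one layer per hyperedge with a clique on the nodes of that hyperedge (the categorical couplings between layers are not shown), and it also builds the bipartite representation. The toy hyperedges and the labels `("e", layer)` for the hyperedge side of the bipartite graph are illustrative assumptions.

```python
from itertools import combinations

# Toy undirected hypergraph with two hyperedges.
hyperedges = [frozenset({"a", "b", "c"}), frozenset({"c", "d"})]

# One layer per hyperedge; the nodes of the hyperedge form a clique in that layer.
intra_edges = {
    ((u, layer), (v, layer))
    for layer, hyperedge in enumerate(hyperedges)
    for u, v in combinations(sorted(hyperedge), 2)
}

# Bipartite representation: hypergraph nodes on one side, hyperedges on the other.
bipartite_edges = {
    (u, ("e", layer))
    for layer, hyperedge in enumerate(hyperedges)
    for u in hyperedge
}
```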

A *$$k$$-uniform hypergraph* is a hypergraph in which the cardinality of each hyperedge is exactly $$k$$, so each hyperedge represents a connection among exactly $$k$$ nodes. This is relevant for folksonomies [162,163] and other applications (e.g. finding interologs in protein interactions) [164]. One can represent weighted, directed, $$k$$-uniform hypergraphs using adjacency tensors. Let the element $$\mathcal {W}_{u_1 \ldots u_k}^H$$ be the weight of any existing hyperedge $$(u_1, \dots , u_k) \in \mathcal {E}$$, and assign a value of $$0$$ to any hyperedge that is not in the hyperedge set $$\mathcal {E}$$. This connection is useful, for example, for investigating spectral theory in adjacency tensors of hypergraphs [165–167].

Multiplex networks have been studied by mapping them into directed $$3$$-uniform hypergraphs [164]. A node-aligned multiplex network with node set $$V$$ and layer set $$L$$ is mapped to a 3-uniform hypergraph $$H$$ such that the node set in the hypergraph is $$\mathcal {V}=V \cup L$$ and $$(u,v,\alpha) \in \mathcal {E}$$ if and only if there is an edge between node-layers $$(u,\alpha)$$ and $$(v,\alpha)$$. (That is, there needs to be an intra-layer edge between nodes $$u$$ and $$v$$ in layer $$\alpha $$.) One can construct similar maps for multiplex networks with an arbitrary number of aspects. It is also possible to define a mapping in the reverse direction (i.e. from $$k$$-uniform hypergraphs to node-aligned multiplex networks). Given $$H$$, one defines a multiplex network with $$V=\mathcal {V}$$ and $$L_a=\mathcal {V}$$ for all $$a \in \{ 1 ,\dots , k \}$$.

### Ordinal couplings and temporal networks

One can represent a temporal network as a set of events or an ordered sequence of graphs [16,17].^{19} This is also the case for temporal networks that are constructed from similarities in coupled time series [103], which can arise either from experimental data or from the output of a model. In the case of event sets, suppose that events are triples $$e=(u,v,t)$$, where $$u,v \in V$$ are nodes and $$t \in T$$ is a time stamp of an event. An event-based temporal network is equivalent to an edge-coloured multigraph in which the set of colours is the set of possible time stamps (i.e. $$C=T$$) and the edges $$E$$ are the events in the network. The only difference between this structure and the edge-coloured multigraphs that we defined in Section 2.5 is that the set of ‘colours’ is ordered instead of categorical.

When using our general multilayer-network framework, one can be explicit about the fact that events or intra-layer graphs are ordered [66]. In this case, two identical nodes from different layers are adjacent via an inter-layer edge only if the layers are next to each other in the sequence. Furthermore, time's arrow can be incorporated into the network structure by using directed edges between corresponding nodes in different layers. One can also allow a generalized ordinal coupling that includes a horizon $$h$$ by considering not only neighbouring layers but all layers that are within $$h$$ steps. In the context of temporal networks, one can view this construction as a time horizon.
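A generalized ordinal coupling with a horizon $$h$$ can be sketched as follows for an event-based temporal network. We assume that every node is present in every time layer and that time stamps are sortable; the function name, its parameters and the toy events are hypothetical. With `directed=True`, the coupling edges point forward in time.

```python
def ordinal_couplings(nodes, times, horizon=1, directed=True):
    """Couple each node to its copies that are at most `horizon` layers later."""
    ordered = sorted(times)
    edges = set()
    for i, t in enumerate(ordered):
        for later in ordered[i + 1 : i + 1 + horizon]:
            for u in nodes:
                edges.add(((u, t), (u, later)))      # forward in time
                if not directed:
                    edges.add(((u, later), (u, t)))  # also backward
    return edges


events = [("a", "b", 1), ("b", "c", 2), ("a", "c", 3)]  # triples (u, v, t)
nodes = {u for event in events for u in event[:2]}
times = {t for _, _, t in events}

intra = {((u, t), (v, t)) for u, v, t in events}
neighbouring = ordinal_couplings(nodes, times, horizon=1)
with_horizon = ordinal_couplings(nodes, times, horizon=2)
```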

The multilayer-network framework that we discussed in Section 2 allows non-diagonal couplings. A recent study of epidemic dynamics on temporal networks utilized a multilayer framework with non-diagonal couplings [168]. Additionally, one can use the edges that are associated with such couplings to represent delays in events. For example, consider an airline network in which a flight from airport $$u$$ leaves at time $$t$$ and arrives at airport $$v$$ at time $$t+t_d$$. In an event-based temporal-network representation, this flight is a quadruplet $$(u,v,t,t_d)$$, where the additional element $$t_d$$ represents a delay. In our multilayer-network framework, one can represent the associated edge as an inter-layer edge $$((u,t),(v,t+t_d))$$.
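The airline example translates directly into code. In the sketch below, each flight is a quadruplet $$(u,v,t,t_d)$$ with integer time stamps; the airports, departure times and durations are made-up toy values, and `is_diagonal` is an illustrative helper.

```python
# Flights as quadruplets (u, v, t, t_d); toy data with integer time stamps.
flights = [("HEL", "LHR", 8, 3), ("LHR", "JFK", 12, 8)]

# Each flight becomes an inter-layer edge from layer t to layer t + t_d.
inter_layer_edges = [((u, t), (v, t + t_d)) for u, v, t, t_d in flights]


def is_diagonal(edge):
    """A coupling is diagonal if both endpoints are copies of the same node."""
    (u, _), (v, _) = edge
    return u == v


non_diagonal = [edge for edge in inter_layer_edges if not is_diagonal(edge)]
```

Every flight between two different airports yields a non-diagonal inter-layer edge, which is why this representation needs the general framework rather than a diagonally coupled one.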

Our general multilayer-network framework also allows continuous time, whereas the tensor and the supra-adjacency matrix representations assume that there are a finite number of layers. (See Ref. [169] for a discussion about using continuous versus discrete time for studying temporal networks.) Moreover, in some contexts, it might be desirable to distinguish an event that has a long duration from several consecutive events. For example, consider a mobile-phone calling network in which an event represents a call between two people, and edges in a layer correspond to all of the calls that take place within 1 min. Clearly, two consecutive calls that each last 1 min constitute a different situation from a single call that lasts 2 min, but one can represent each of these cases by using two edges in two consecutive layers. If there is a need to distinguish between these two cases, then one should add an intervening ‘connecting’ layer between each pair of consecutive time points. In such a representation, the two consecutive calls would not include an edge in the connecting layer, whereas the continuous call would include such an edge.

#### Networks with both ordinal and categorical aspects.

It is also possible to use our multilayer-network framework to represent multiplex networks that both have multiple types of edges and are temporal. In this case, $$d=2$$, and one of the aspects is categorical and the other is ordinal. When using a tensor representation, this leads to sixth-order tensors (or fourth-order tensors if one chooses to use a lower-order representation; see Section 2.2.1).

Examples of networks with both ordinal and categorical aspects include transportation networks with multiple modes of travel (e.g. flights, trains etc.) and different departure times. Additionally, social networks and communication networks are typically both multiplex and temporal. (See Fig. 3 for an example of such a social network.) There are currently very few studies that construct networks that explicitly incorporate both time stamps and edge types. One example is the international trade networks in Refs. [84,85]. In this example, each layer corresponds to 1 year (out of 12 years) in one category (out of 97 categories). Other studies have examined data sets with both temporal and multiplex features (e.g. the shipping data set in Ref. [160] as well as numerous social networks, such as the ones that were studied in Refs. [156,157,170]), but such investigations typically have not assembled their data into a multilayer network that is both multiplex and temporal. An old data set that is technically both multiplex and temporal is discussed in Ref. [171], but it has only two time points and two types of edges.

Some recent papers have explicitly incorporated both time-dependence and multiplexity. For example, Ref. [172] examined the dynamics of a stochastic actor-oriented model in a multiplex network in which a bipartite network (with actors adjacent to groups) coevolves with a unipartite multiplex network (which encapsulates interactions between the actors). Additionally, Oselio *et al.* [173] used stochastic blockmodelling to develop a method for inference in time-dependent multiplex networks.

### Other types of networks and graphs

We now briefly discuss $$k$$-partite graphs, networks with both coloured nodes and coloured edges, multilevel networks and some other network structures.

Define a *$$k$$-partite network* as a tuple $$G_k=(\mathcal {V}_k,E_k)$$, where $$\mathcal {V}_k=\{ V_i \}_{i=1}^k$$ is a collection of $$k$$ pairwise disjoint sets of nodes (i.e. $$V_i \cap V_j = \emptyset $$ if $$i \neq j$$), such that each set $$V_i$$ represents nodes of a certain type; and $$E_k \subset \bigcup _{i=1}^k V_i \times \bigcup _{i=1}^k V_i $$ is the set of edges, where edges are not allowed between nodes of the same type (i.e. $$u,v \in V_i \implies (u,v) \notin E_k$$ for any $$i$$). Clearly, a $$k$$-partite graph is a special case of the node-coloured graphs that we discussed in Section 2.4. Each node type corresponds to a colour, and the colouring is a proper node-colouring, so two nodes of the same colour cannot be incident to the same edge. ‘Bipartite’ (i.e. $$2$$-partite) networks have received considerable attention [15,174,1,2], but tripartite and more general $$k$$-partite (aka, ‘multi-mode’ or ‘multipartite’) networks are also interesting, even though they have not been studied with close to the same intensity. However, they have been used to investigate various social systems [175–177]. Additionally, some scholars have examined multiplex bipartite networks and their unipartite projections to multiplex networks [76,178]. In such a projected network, there are $$\binom {b+1}{2}$$ layers—one for each combination $$(\alpha ,\beta)$$ of layers of the bipartite network (including combinations in which the same layer is included twice, for which $$\alpha =\beta $$). There is an edge between a pair of nodes in layer $$(\alpha ,\beta)$$ if one node is adjacent to a third node in the bipartite network via an edge of type $$\alpha $$ and the other node is adjacent to that third node via an edge of type $$\beta $$. Allard *et al.* [179] examined bond percolation (see Section 4.6.1) on networks that are constructed via unipartite projections from node-coloured bipartite graphs that are generated using a (generalized) configuration model. 
The projected networks include colours for both nodes and edges. Allard *et al.* also considered the projection of node-coloured bipartite graphs to various other types of multilayer networks.
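The unipartite projection of a multiplex bipartite network described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `project_multiplex_bipartite`, the triple-based edge format and the example edge types are hypothetical, and $$b$$ denotes the number of edge types (layers) of the bipartite network.

```python
from itertools import combinations, combinations_with_replacement
from math import comb


def project_multiplex_bipartite(edges, edge_types):
    """Project a multiplex bipartite network onto one of its node sets.

    edges: iterable of (node, third_node, edge_type) triples (hypothetical
           format), where `node` belongs to the node set that is kept and
           `third_node` belongs to the node set that is projected out.
    edge_types: list of the b edge types (layers) of the bipartite network.
    Returns a dict that maps each unordered pair (alpha, beta) of edge types,
    including alpha == beta, to a set of undirected edges between kept nodes.
    """
    position = {etype: i for i, etype in enumerate(edge_types)}
    # One projected layer per combination (alpha, beta), repetition allowed.
    layers = {pair: set() for pair in combinations_with_replacement(edge_types, 2)}

    # Group the kept nodes by the third node to which they are adjacent.
    attached = {}
    for node, third, etype in edges:
        attached.setdefault(third, []).append((node, etype))

    for incident in attached.values():
        for (u, alpha), (v, beta) in combinations(incident, 2):
            if u == v:
                continue  # a node does not form a pair with itself
            key = (alpha, beta) if position[alpha] <= position[beta] else (beta, alpha)
            layers[key].add(frozenset((u, v)))
    return layers


edge_types = ["like", "dislike"]
edges = [("u", "x", "like"), ("v", "x", "like"), ("w", "x", "dislike")]
projected = project_multiplex_bipartite(edges, edge_types)
expected_layer_count = comb(len(edge_types) + 1, 2)  # binom(b + 1, 2) with b = 2
```

With $$b=2$$ edge types, the sketch produces $$\binom {3}{2}=3$$ projected layers, one of which is empty in this small example.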

We discussed multilayer networks with multiple types of edges in Section 2.5, and we discussed multilayer networks with multiple types of nodes in Section 2.4. It is also possible for networks to include both of these features. A natural way to map such a network to our general multilayer-network framework is to consider node colours and edge colours as separate aspects. Such multilayer networks have been examined under the monickers of *heterogeneous information networks* [130–132,180] and *coupled-cell networks with multiple arrows* [181,182]. Other authors have studied similar structures called *meta-networks* (and associated *meta-matrices*) [34,134,135]. Additionally, see the research on topics such as ‘semantic graphs’ [183] and ‘attributed relational graphs’ [184,185]. Some of the aforementioned multilayer networks possess an additional constraint: if two edges have the same colour, then the pair of nodes that are incident to one of these edges must share the same colour combination as the pair of nodes that are incident to the other edge [130,132,51,182]. (For example, consider two red edges. They can both be incident to one blue node and one green node, but it is not permissible for one edge to be incident to two blue nodes but the other to be incident to one blue node and one green node.)
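The additional constraint on edge colours and node colours can be checked mechanically. The following is a minimal sketch that assumes undirected edges, so that a colour combination is an unordered pair; the function name `satisfies_colour_constraint` and the input format are hypothetical.

```python
def satisfies_colour_constraint(node_colour, edges):
    """Check that edges of the same colour join the same node-colour combination.

    node_colour: dict that maps each node to its colour.
    edges: iterable of (u, v, edge_colour) triples, treated as undirected
           (an assumption of this sketch).
    """
    combination_of = {}
    for u, v, edge_colour in edges:
        # Unordered combination of the colours of the two incident nodes.
        combination = tuple(sorted((node_colour[u], node_colour[v])))
        # Remember the first combination seen for each edge colour and
        # compare every later edge of that colour against it.
        if combination_of.setdefault(edge_colour, combination) != combination:
            return False
    return True


node_colour = {1: "blue", 2: "green", 3: "blue", 4: "green"}
permitted = satisfies_colour_constraint(node_colour, [(1, 2, "red"), (3, 4, "red")])
forbidden = satisfies_colour_constraint(node_colour, [(1, 3, "red"), (1, 2, "red")])
```

The two calls mirror the example with two red edges: the first pair of edges joins a blue node and a green node in both cases, whereas the second pair mixes a blue-blue edge with a blue-green edge.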

*Multilevel networks* are based on the application of ideas from ‘multilevel analysis’ [186] to networks [187,188]. One can use the general framework for multilevel networks that is described in Ref. [110] to represent networks in which nodes can have any finite number of types (i.e. ‘levels’) and in which there can be edges between nodes of the same type or between nodes of ‘adjacent’ types. Reference [110] included a mapping of all previous multilevel networks (see, e.g., Refs. [189,187,111,190]) to a two-level hierarchical network that is a special case of their general multilevel-network framework. They used the term ‘macro-level network’ for one level, ‘micro-level network’ for the other level, and ‘meso-level network’ for a network that consists of micro-level nodes, macro-level nodes and exclusively inter-level edges. For example, a social network of researchers (micro-level) and a resource-exchange network between laboratories (macro-level) to which the researchers belong constitutes a multilevel network with two levels [111,110]. In many cases, the structure of multilevel networks is restricted even further—for example, by allowing only a single inter-layer edge from each micro-level node (see the discussion in Ref. [110] and references therein). Multilevel networks fit our multilayer-network framework naturally, as each level is a layer. The resulting structure amounts to a node-coloured network in which only inter-layer edges between ‘adjacent’ colours (i.e. consecutive layers) are allowed. Clearly, two-level networks are equivalent to node-coloured networks with two colours.

A type of multilayer network of particular relevance for telecommunication networks (such as the internet) is a ‘hierarchical multilayer network’, in which the bottom layer constitutes a ‘physical’ network and the remaining layers are ‘virtual layers’ that operate on top of the physical layer [60,61,191–193]. Such networks are typically similar to interdependent networks in that (1) a node in a given layer is dependent on a node in the layer below and (2) a node in a given layer cannot be adjacent to another node in its own layer unless there is a path via the corresponding nodes in the layer below.

## Empirical multilayer networks

Developing representations and models for multilayer networks is useful for increasing the understanding of the structure and function of multilayer systems, and it can lead to discoveries of new phenomena that cannot be explained using a monoplex-network framework (see Section 4). However, in order to understand how real-world multilayer networks behave and are organized, it is also necessary—indeed, it is crucial—to collect and study empirical data for which such frameworks are appropriate. It is also helpful to develop new visualization tools, data structures^{20} and computational methods [138,139,194] for multilayer networks. In Fig. 6, we show visualizations of two different multilayer networks. In Fig. 6(a), we show the air-transportation network from Ref. [159]; in Fig. 6(b), we show the bank-wiring room network of Ref. [26] (see Section 1).

By design, networks with multiple layers are able to encapsulate a much more detailed description of a system than monoplex networks. This, in turn, yields significant new data-collection challenges, which need to be surmounted for ideas developed for multilayer networks to be genuinely useful for applications. For example, although it is clear that using a single type of edge between a pair of people does not always provide a suitable level of abstraction to study social networks [2], data on empirical social networks is still primarily available in a format that is more suitable for monoplex networks than for multilayer ones. Small multiplex social-network data sets, which were predominantly collected manually either via questionnaires or by carefully observing a group of people, have been available for decades. (See Ref. [195] for several examples of such small data sets, including the well-studied Sampson monastery network [196].) More recently, large-scale multiplex social network data sets have been acquired automatically. See Ref. [197] for some example data sets (from air-transportation [159] and Twitter [198,199]). Methods for large-scale data collection include combining data from several social-networking sites [200], observing several different types of interactions between players in a massive multiplayer online game (MMOG) [157], and using mobile-phone billing information to construct networks with calls and text messages considered as separate layers [201]. Most of the large-scale multiplex-network data sets are node-aligned (i.e. all of the nodes are present in all of the layers), but there is also, for example, a large data set that covers relationships via multiple social-networking sites that yields layers that are not node-aligned [202].

Thus far, most empirical studies of multilayer networks have used data that can fit the multiplex-network framework. In Table 2, we give a sample of data sets that have been used in the literature. Importantly, a small number of empirical studies have used other types of multilayer networks: four of the investigated examples of empirical interacting and interdependent networks are interacting power stations and internet servers [55,57], coupled power grids [122], coupled climate networks [129] and interconnected transportation networks such as an airport and a railroad network [212] or an airport and port network [213]. There have also been empirical studies on hierarchical multilevel networks (see Section 2.8), such as a social network between cancer researchers, their affiliations and connections between the affiliations [111]. Nevertheless, the vast majority of studies on interconnected and interdependent networks have been theoretical.

Name | Nodes | Layers | References |
---|---|---|---|
*Social networks with multiple types of ties* | | | |
Baboon social network | Individuals (12) | Interaction types (3) | [97] |
MMOG social network | Players (300000) | Interaction types (6) | [157,203] |
Aarhus computer-science department | Employees (61) | Social activities (5) | [204] |
Friendfeed network | Users (7629) | Social-networking services (3) | [200] |
Online forum social network | Users (1899) | Forum, instant messages (2) | [205] |
Top Noordin terrorist network | Terrorists (78) | Relation type (4) | [75] |
Florentine families | Families (16) | Business, marriage ties (2) | [37,137] |
Facebook (‘Tastes, Ties and Time’) | Users (1640) | Friends, in same group, in same picture (3) | [156] |
*Coauthorship networks* | | | |
DBLP coauthorship (3) | Authors (6771–558800) | Conferences (6–2536) | [95,99,94] |
DBLP coauthorship | Authors (10305) | Publication categories (617) | [89] |
*Unipartite projections of bipartite graphs* | | | |
Youtube users | Users (15088) | Shared activities (5) | [96] |
Extrarandom.pl | Users (4404) | Shared activities (11) | [92] |
Netflix | Movies (13581) | Rating category pairs (3) | [76,178] |
*Temporal networks considered as multilayer networks* | | | |
Enron e-mail | Users (184) | Months (44) | [206] |
World trade | Countries (18) | Years (10) | [206] |
US Senate | Senators (1884) | Congresses (110) | [101,66] |
IMDB | Actors (28042) | Years (10) | [207] |
DBLP Coauthorship | Authors (582179) | Years (55) | [95] |
*Layers based on keywords of text on edges* | | | |
Enron e-mail | Users (500) | Keywords (500) | [208] |
IBM social network | Users (3679) | Keywords (1000) | [208] |
Webpages, anchor text (3) | Web pages (560–$$10^5$$) | Anchor terms (533–39255) | [46,88,47] |
*Layers based on different aspects of node similarities* | | | |
SIAM journals | Articles (5022) | Similarity types (5) | [42] |
ArXiv articles | Articles (30000) | Similarity types (4) | [113] |
Corporate Governance network | Companies (273) | Similarity types (3) | [209] |
*Transportation networks* | | | |
Air-transportation nets (3) | Airports (308–3108) | Airlines (15–530) | [159,68,137] |
London underground | Stations (314) | Lines (14) | [161,137] |
Global cargo ship network | Ports (951) | Ship types (3) | [160] |
*Other types and mixed types of networks* | | | |
Web search queries (2) | Words (131268, 184760) | Rank of clicked result (5, 6) | [93,95,154,94] |
Flickr (2) | Users (1000, 1186895) | Contacts, shared activities (11, 4) | [90,98,93,95] |
DBLP citations (2) | Authors (6848, 10305) | Publication categories (617) | [89,88] |
DBLP proceedings | Authors | Conferences (70) | [50] |
DBLP author network | Authors (424455) | Coauthorship, co-citations (3) | [91] |
Gene co-expression | Genes | Experimental conditions (130) | [144] |
Cognitive social structures (2) | People (21, 48) | People, perceptions (21, 47) | [21,86,137] |
Global terrorist network | Terrorist groups (2509) | Target country (124) | [210] |
International trade network (2) | Countries (15, 162) | Years (3, 12), commodities (15, 97) | [85,84,211] |

We categorize the data sets in Table 2 to facilitate exposition and to illustrate different approaches for generating multiplex data. (Obviously, our categorization is not definitive.) For example, one can map a network whose edges are labelled with text to a multiplex network by letting each layer correspond to a keyword that appears in its edge. Additionally, one can construe a system with nodes that belong to multiple different bipartite networks as a multiplex network by letting each layer be a unipartite projection of one of these bipartite networks. As discussed in Section 2.7, most temporal networks can also be examined using a multiplex framework. In Table 2, however, we include only temporal-network examples in which the authors of the articles that used the data explicitly employed a multilayer-network construction.

Most of the multiplex data that has been collected thus far has concentrated on intra-layer networks and has completely disregarded inter-layer connection strengths. Consider, for example, the problem of information diffusion in multiple social-networking websites, which together constitute a multiplex network in which each layer corresponds to a single site. In this type of network, it is typically straightforward to collect data on intra-layer edges, but describing diffusion of information in such systems requires somehow quantifying (ideally, in a direct way from real data) the relative rates at which information moves across layers versus within layers [77]. For example, one can try to estimate transition probabilities between different layers by considering the relative amounts of time that are spent on activities in each layer. Similarly, in a transportation network that includes multiple modes of transportation (e.g. airline networks, railroad networks, and a single city's transportation system), the cost of changing modes needs to be quantified. Such data are often measured—e.g. Transport for London possesses excellent data on the amount of time that it takes to change lines in their metropolitan transportation system [214]—and it is important to collect or acquire such data to construct empirical estimates for inter-layer edge weights.

The identification of inter-layer edges is central to the problem of ‘graph reconciliation’ [215]. For example, given multiple social networks (e.g. Facebook and LinkedIn), one might seek to determine which diagonal inter-layer edges exist. Each such edge connects an individual in one social network with the same individual in another social network, so the graph-reconciliation problem is deeply connected to the problem of de-anonymizing networks [216]. Infamously, such de-anonymization allowed the identification of sensitive information about users who had accounts with both Netflix and the Internet Movie Database (IMDB) [217].

When constructing multilayer networks, it is essential to be creative about determining what constitutes the layers. For example, a multilayer interbank network can include different types of credit relations in different layers [218], a social network can include different types of social relations in different layers [219,220], a Twitter network can include different hashtags in different layers [198,197], a communication network can include interactions that use different languages in different layers, linguistic networks can also include inter-lingual relationships [221], a multilayer brain network can contain different layers for structural and functional networks (or different layers for brain activity that is measured using different modalities, different layers for different experimental subjects and so on), and a structural neuronal network includes connections from both electrical synapses (which are formed at gap junctions) and chemical synapses [222]. Different layers in a multilayer network could also represent how much time people spend doing different activities (with transition probabilities between different activities to yield weights for inter-layer edges), different populations in a metapopulation model for biological epidemics [223,224], the different swinging platforms (where several oscillating metronomes are placed on each platform) in the experiments of Ref. [225] and so on. For some applications, there might be some ‘obvious’ way to identify layers in the construction of a multilayer network, but other times it might not be so obvious. We expect a multilayer-network framework to yield useful insights in both types of situations. Moreover, it is also useful to develop methods to try—given some data—to identify different layers [226–228].

## Models, methods and dynamics

There has been considerable interest in generalizing concepts from monoplex networks to multilayer networks, and it is very important to do so. Ideas that have been examined include (but are not limited to) node degrees and neighbourhoods, connected components, walks, clustering coefficients and transitivity, centrality measures, community structure, and network models. Multilayer structures can have important effects on dynamical systems on networks, and examples such as percolation and spreading processes have been studied in detail to illustrate some of these effects. An increasingly large body of other dynamical processes have also been studied on multilayer networks.

Thus far, almost all of the scientific literature on multilayer networks has concerned networks with only a single aspect. Accordingly, in this section, we assume that a multilayer network has only a single aspect unless we explicitly indicate otherwise. We also typically consider multiplex networks to be node-aligned unless we comment explicitly on this feature.

### Network aggregation: from multiplex networks to monoplex networks

A ‘traditional’ way to examine systems with multiplexity is to construct a monoplex network by aggregating data from the different layers of a multiplex network and then study the resulting monoplex network [67].^{21} This makes it possible to use standard network techniques, and it is sometimes a desirable way to help alleviate issues with ‘noisy’ data. One procedure to construct an aggregated network (which is also known as a superposition network [136], overlapping network [75], or overlay network [67]) is to define edge weights between two nodes in the resulting monoplex network as a linear combination of the weights between those same nodes from each of the layers. This yields a weighted adjacency matrix $$\textbf {W}$$ whose components are $${W}_{uv}=\sum _{\alpha =1}^b m_{\alpha } \mathcal {W}_{uv\alpha }$$, where $$\mathcal {W}$$ is the third-order weighted adjacency tensor of the multiplex network and $$\textbf {m} \in \mathbb {R}^b$$ is a vector that weights the importances of different layers [113]. If information about the relative importances of the layers is not available, then it is typical to set each of the weights to $$1$$ by taking $$\textbf {m} = (1,\dots ,1)$$. If desired, one can also normalize after the aggregation process. For the choice $$\textbf {m} = (1,\dots ,1)$$ and an unweighted multiplex network, the weights in the aggregated network equal the number of different types of edges between pairs of nodes and the weighted network corresponds to a multigraph in which the edge layers are discarded.^{22} It is sometimes possible to indirectly infer the relative importances of layers; for example, one can exploit known community structure of a multilayer network [50,113]. Aggregation of temporal networks [16,17] has analogous issues to (and is often a special case of) multilayer-network aggregation, and one needs to consider event-time statistics in that case [230].
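The linear-combination aggregation $${W}_{uv}=\sum _{\alpha =1}^b m_{\alpha } \mathcal {W}_{uv\alpha }$$ can be sketched in a few lines of Python. This is a minimal illustration for a node-aligned multiplex network; the function name `aggregate` and the choice to store the tensor as a list of per-layer adjacency matrices (indexed as `layer_matrices[alpha][u][v]`) are assumptions made for this sketch.

```python
def aggregate(layer_matrices, m=None):
    """Aggregate a node-aligned multiplex network into a weighted monoplex network.

    layer_matrices: list of b square adjacency matrices (lists of lists), so that
                    layer_matrices[alpha][u][v] plays the role of W_{u v alpha}.
    m: list of b layer weights; defaults to (1, ..., 1).
    Returns the aggregated weighted adjacency matrix W as a list of lists.
    """
    b = len(layer_matrices)
    n = len(layer_matrices[0])
    if m is None:
        m = [1] * b
    return [
        [sum(m[alpha] * layer_matrices[alpha][u][v] for alpha in range(b)) for v in range(n)]
        for u in range(n)
    ]


# Two unweighted, undirected layers on three nodes.
layer_0 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # edges 0-1 and 1-2
layer_1 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]  # edge 0-1
W_counts = aggregate([layer_0, layer_1])  # m = (1, 1)
W_weighted = aggregate([layer_0, layer_1], m=[1, 0.5])
```

For $$\textbf {m} = (1,\dots ,1)$$ and unweighted layers, each entry of `W_counts` is the number of layers in which the corresponding pair of nodes is adjacent.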

Aggregations of multiplex networks into monoplex networks are typically performed manually rather than through an explicit mathematical computation. For example, the weighted version of the well-known Zachary Karate Club network [231] was constructed by considering eight different contexts for which there exist relationships between pairs of nodes, although precise representations (e.g. in the form of adjacency matrices) for these eight networks were not reported. In some cases, it is desirable to disregard the weights of a monoplex network that one obtains by aggregating multiplex networks and to instead examine an unweighted monoplex network [159,83,68,75]. (References [112,83] used the term ‘projected network’ to describe this situation, though it is preferable to use words like ‘projection’ in a more mathematical sense, such as in a one-mode projection of a bipartite network that is obtained by multiplying an adjacency matrix with its transpose [1].) In this case, $$\textbf {m}$$ is a binary vector and the projection results in an unweighted adjacency matrix: $$m_{\alpha }=1$$ if layer $$\alpha $$ is included in the aggregation (i.e. ‘projection’) and $$m_{\alpha }=0$$ if it is not. In this approach, there are $$2^b-1$$ nontrivial ways ($$1$$ for each combination of layers) of performing an aggregation, and the aggregation process can be construed as taking a union of the edges of a selected set of layers. A complementary approach was taken by Ref. [232], who instead considered intersections of intra-layer edge sets with different combinations of layers. Additionally, see the definition of ‘multi-edge’ in Section 4.2.1.
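To make the count of $$2^b-1$$ nontrivial binary aggregations concrete, the following minimal sketch enumerates all nonempty subsets of layers and forms both the union and the intersection of their intra-layer edge sets. The function names and the representation of each layer as a set of undirected edges are assumptions of the sketch, not constructions from the cited references.

```python
from itertools import combinations


def nonempty_layer_subsets(layers):
    """Yield every nonempty combination of layer names (2^b - 1 in total)."""
    names = sorted(layers)
    for size in range(1, len(names) + 1):
        yield from combinations(names, size)


def union_aggregation(layers, subset):
    """Unweighted aggregation: edges present in at least one selected layer."""
    return set().union(*(layers[name] for name in subset))


def intersection_aggregation(layers, subset):
    """Complementary construction: edges present in every selected layer."""
    return set.intersection(*(layers[name] for name in subset))


def edge(u, v):
    return frozenset((u, v))


layers = {
    "A": {edge(0, 1), edge(1, 2)},
    "B": {edge(0, 1)},
    "C": {edge(0, 2)},
}
subsets = list(nonempty_layer_subsets(layers))
union_AB = union_aggregation(layers, ("A", "B"))
intersection_AB = intersection_aggregation(layers, ("A", "B"))
```

With $$b=3$$ layers, the enumeration yields $$2^3-1=7$$ subsets. Because the number of subsets grows exponentially with $$b$$, exhaustive enumeration of this kind is only practical for a small number of layers.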

An aggregation process can discard a lot of valuable information about an inherently multiplex system. For example, in the international trade network, the aggregated network that uses the total amount of trade between countries as edge weights cannot capture the richness of the structures in the multiplex network, for which each layer represents trading of different categories of products [84,85]. Another interesting example is an air-transportation multiplex network, for which each layer contains edges that represent connections between airports that arise from the flights of a single airline. In this example, the layers that correspond to low-cost airlines have qualitatively different properties from those that correspond to major airlines [159]. It was also demonstrated recently that social networks and transportation networks include multiple types of transitivity, which cannot be inferred by exclusively studying a weighted network obtained from aggregation [137]. As these examples illustrate, aggregation can distort the properties of multiplex networks. In addition, in multilayer networks with nontrivial inter-layer coupling, the information related to transitions between layers (i.e. the inter-layer edges) disappears as a result of aggregation processes because self-edges cannot account for such transitions.^{23} Recent research has investigated how to aggregate multilayer networks in a way that attempts to minimize information loss [234].

Despite the loss of information inherent in an aggregation process, there can be interesting relationships between inherently multiplex diagnostics and corresponding diagnostics calculated for aggregated networks. For example, the eigenvalues of the supra-adjacency matrix and combinatorial supra-Laplacian matrix of a node-aligned multiplex network (see Section 2.3) interlace with the eigenvalues of the weighted adjacency matrix and combinatorial Laplacian matrix^{24} of the aggregated network that one constructs by counting the number of edges between pairs of nodes and dividing by the number of layers [146]. (Additionally, see Ref. [235] from the control-theory literature.) Similar results also hold for multiplex networks that are not node-aligned, but there are several ways of doing the averaging. Furthermore, a multiplex clustering coefficient (see Section 4.2.3) with certain parameter values reduces to what one would obtain by using a weighted clustering coefficient on a weighted monoplex network [137].

### Diagnostics for multilayer networks

We now examine some of the diagnostics that have been developed for multilayer networks. All of them are defined for multilayer networks with only a single aspect (i.e. $$d=1$$), and most of them have been defined in the context of multiplex networks.

#### Node degree and neighbourhood.

In monoplex networks that are undirected and unweighted, a node's ‘degree’ (i.e. degree centrality) gives the number of nodes that are adjacent to it (i.e. the number of its immediate neighbours). Equivalently, a node's degree is the number of edges that are incident to it, and a focal node's ‘neighbourhood’ is the set of nodes that are reached by following those incident edges. One can generalize the notion of degree for directed networks to obtain in-degree and out-degree. These indicate, respectively, the number of incoming and outgoing edges. The weighted degree, or strength, of a node in a weighted network is given by the sum of the weights of all edges that are incident to that node [1,13]. There are several ways to generalize the notions of degree and neighbourhood for multiplex networks [112,75], although one of course obtains the usual definitions if one considers only a single intra-layer network at a time.

The simplest way to generalize the concepts of degree and neighbourhood for multiplex networks is to use network aggregation. One can then define the degree as the number of edges of any type that are incident to a node and the neighbourhood as the set of nodes that can be reached from a focal node by following any of those edges. Alternatively, one can threshold an aggregated network such that two nodes are considered to be adjacent if and only if the number of edges that connect them in a multiplex network is larger than some threshold value. This approach was taken by Refs. [90,236,92], whose definition of the neighbourhood of a node in a directed multiplex network is based on counting the number of different types of edges and taking into account the directions of the edges. References [90,236,92] also defined several versions of degree centrality that use different normalization factors.
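As a minimal sketch of the aggregation-based definitions, the following snippet computes the degree of a focal node as the number of incident edges of any type and a thresholded neighbourhood. It assumes undirected layers without self-edges, so it does not reproduce the directed definitions of Refs. [90,236,92]; the function name and data format are hypothetical.

```python
from collections import Counter


def aggregated_degree_and_neighbourhood(layers, u, threshold=0):
    """Aggregation-based degree and (thresholded) neighbourhood of node u.

    layers: dict that maps each layer name to a set of undirected edges, each
            stored as a frozenset of two distinct nodes (assumption of this sketch).
    threshold: a node v is a neighbour of u only if more than `threshold`
               edges (one per layer at most) connect u and v.
    """
    edge_counts = Counter()
    for edges in layers.values():
        for e in edges:
            if u in e:
                (v,) = e - {u}  # the other endpoint of the edge
                edge_counts[v] += 1
    degree = sum(edge_counts.values())  # number of incident edges of any type
    neighbourhood = {v for v, count in edge_counts.items() if count > threshold}
    return degree, neighbourhood


layers = {
    "A": {frozenset((0, 1)), frozenset((0, 2))},
    "B": {frozenset((0, 1))},
}
degree_0, neighbourhood_0 = aggregated_degree_and_neighbourhood(layers, 0)
_, strong_neighbourhood_0 = aggregated_degree_and_neighbourhood(layers, 0, threshold=1)
```

With `threshold=0`, the neighbourhood contains every node that is reachable by an edge of any type; larger thresholds retain only the nodes that are connected to the focal node in several layers.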

Similar to aggregating multiplex networks themselves, for which one can use any combination of layers in an aggregation process, it is possible to define degree and neighbourhood in terms of a focal node and any subset of the layers. This makes it possible to define an aggregated multiplex degree or neighbourhood of a node in $$2^{b}$$ different ways. This approach was taken in Refs. [154,93,95], who defined the *neighbours* $$\Gamma (u,D)$$ of a node $$u$$ given a subset $$D \subseteq L$$ of the layers as the set of nodes that can be reached by following any edge that starts from node $$u$$ in any of the layers in $$D$$. Additionally, the *neighbours-XOR* of node $$u$$ is $$\Gamma _{\mathrm {XOR}}(u,D)=\Gamma (u,D) \setminus \Gamma (u,L \setminus D)$$, which gives the set of nodes that one can reach by following any edge that is incident to node $$u$$ in any of the layers in $$D$$ but which are unreachable if one starts in any layer that is not in the set $$D$$. These definitions give a starting point for developing other measures to quantify, for example, the level of redundancy in the layers of a multiplex network.

Bianconi defined a notion of a ‘multidegree’ starting from the concept of a ‘multi-edge’ [73]. One defines a *multi-edge* using the binary vector $$\textbf {m} \in \{ 0,1 \}^{b}$$ (which plays a similar role as $$\textbf {m} \in \mathbb {R}^{b}$$ above), which gives the set of *node pairs* $$(u,v)$$ (i.e. the set of multi-edges) for which a pair of nodes $$u$$ and $$v$$ are adjacent (when $${m}_{\alpha }=1$$) and not adjacent (when $${m}_{\alpha }=0$$) on layer $$\alpha $$. The *multidegree* $$k_{u}^{\textbf {m}}$$ of node $$u$$ is then the number of multi-edges that node $$u$$ has with vector $$\textbf {m}$$, and the *multistrength* $$s_{u,\alpha }^{\textbf {m}}$$ is the sum of the weights of those edges in the intra-layer network of layer $$\alpha $$ [237].

The number of different vectors $$\textbf {m}$$ (and, equivalently, the number of possible subsets $$D \subseteq L$$) grows exponentially with the number of layers. If the number of layers is large, such growth can cause both computational problems and difficulties with interpretation of results. To help alleviate these problems, it is useful to define the concept of *overlap multiplicity* $$\nu (\textbf {m})$$ [237], which is given by the number of $$1$$ entries in the vector $$\textbf {m}$$. One can then average multiplex quantities defined for $$\textbf {m}$$ over all $$\textbf {m}$$ for which $$\nu (\textbf {m})=\nu $$ (e.g. for degree, one would define $$k_{u}(\nu)=\binom {b}{\nu }^{-1}\sum _{\nu (\textbf {m})=\nu } k_{u}^{\textbf {m}}$$). For a given quantity, this entails taking $$b +1$$ different averages.
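
The definitions of $$\Gamma (u,D)$$, $$\Gamma _{\mathrm {XOR}}(u,D)$$, the multidegree and its average over vectors with a fixed overlap multiplicity can be sketched as follows. The edge-set representation, the function names and the `order` argument (which fixes the order of the layers in the vector $$\textbf {m}$$) are assumptions of this sketch, and the network is taken to be undirected and unweighted.

```python
from itertools import combinations
from math import comb


def neighbours(layers, u, D):
    """Gamma(u, D): nodes reachable from u by one edge in any layer in D."""
    result = set()
    for layer in D:
        for a, b in layers[layer]:
            if a == u:
                result.add(b)
            elif b == u:
                result.add(a)
    return result


def neighbours_xor(layers, u, D):
    """Gamma_XOR(u, D): neighbours via layers in D that are not neighbours via any other layer."""
    rest = set(layers) - set(D)
    return neighbours(layers, u, D) - neighbours(layers, u, rest)


def multidegree(layers, order, u, m):
    """k_u^m: number of nodes whose adjacency pattern with u across the layers equals m."""
    nodes = {x for edges in layers.values() for e in edges for x in e}
    count = 0
    for v in nodes - {u}:
        pattern = tuple(int(v in neighbours(layers, u, [layer])) for layer in order)
        if pattern == tuple(m):
            count += 1
    return count


def mean_multidegree(layers, order, u, nu):
    """k_u(nu): average of k_u^m over all binary vectors m with exactly nu ones."""
    b = len(order)
    total = 0
    for ones in combinations(range(b), nu):
        m = [1 if i in ones else 0 for i in range(b)]
        total += multidegree(layers, order, u, m)
    return total / comb(b, nu)


layers = {"A": {(1, 2), (1, 3)}, "B": {(1, 2), (1, 4)}}
order = ["A", "B"]
```

In this toy example, node 2 is a neighbour of node 1 in both layers, so it is excluded from `neighbours_xor(layers, 1, ["A"])`, which contains only node 3.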

#### Walks, paths and distances.

Walks and paths—and their lengths—are important concepts in both graph theory and network science. The ability to define generalizations of such concepts for multilayer networks yields natural extensions of many other measures for monoplex networks, including graph distance, connected components, betweenness centralities, random walks, communicability [238] and clustering coefficients. One can then use some of these concepts, such as betweenness centralities and random walks, to define additional tools, such as methods for community detection or centrality measures. (See Ref. [239] for an example of such an approach.) Furthermore, one can use the notion of a shortest path (i.e. a ‘geodesic path’) to develop new measures that are intrinsic to multiplex networks. One example of this is *interdependence*, which is defined as the ratio of the number of shortest paths that traverse more than one layer to the total number of shortest paths [240,72].

When defining a *walk* on a multilayer network, one should be able to answer at least two basic questions:

1. Is changing layers considered to be a step in the multilayer network? In other words, are different copies of the same node in different layers considered to be a set of distinct objects such that it ‘costs’ something to change layers [137]?
2. Is there a difference between taking intra-layer steps in different layers?

In many of the existing notions of multilayer-network walks, these decisions have been implicit, and they have depended on the types of systems that were considered. For example, the costs of inter-layer steps are often very small in social networks (though this is not always the case), but they might be substantial in transportation networks [137]. Thus far, it has been rare to quantify inter-layer costs in studies of empirical data (see Section 3), and whether one should explicitly consider the costs of inter-layer steps depends on their values relative to other scales in a system. Clearly, it is also desirable to try to find ways to estimate such values from data even if one cannot obtain them directly.

If the answer to the first of the above questions is ‘yes’, then a step and a walk are each defined as occurring between a pair of node-layer tuples. This approach has been used to generalize concepts like random walks [69] (see Section 4.6.3), clustering coefficients [137] (see Section 4.2.3), several centrality measures [239] (see Section 4.2.4) and communicability [140]. When using this approach, it is often natural to generalize concepts from monoplex networks by simply replacing nodes with node-layer tuples. For example, one can calculate centrality measures and other node diagnostics for each node-layer tuple separately and then obtain a single value for a node via some method of aggregation (e.g. by summing the values). Additionally, if there are costs for changing layers, then one needs to ask what values they take. In some types of networks, such as transportation networks, it is possible to estimate costs of inter-layer steps using empirical data in a straightforward manner (see Section 3). In other situations, it can be difficult to determine reasonable values for the costs of changing layers.
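
To illustrate walks between node-layer tuples with an explicit cost for changing layers, the following sketch runs Dijkstra's algorithm on the set of node-layer tuples. It assumes a node-aligned multiplex network in which every node can switch to any other layer at cost `switch_cost`, every intra-layer step has length 1, and node and layer labels are mutually comparable (e.g. all strings). These choices, and the function name, are illustrative assumptions rather than a definition taken from the cited references.

```python
import heapq


def multilayer_distance(layers, source, target, switch_cost=1.0):
    """Shortest distance between two nodes when walks move between node-layer tuples.

    Each intra-layer step costs 1 and each change of layer costs `switch_cost`.
    The walk may start and end in any layer.
    """
    # Build an undirected adjacency list for each layer.
    adj = {layer: {} for layer in layers}
    for layer, edges in layers.items():
        for u, v in edges:
            adj[layer].setdefault(u, set()).add(v)
            adj[layer].setdefault(v, set()).add(u)

    # Start from every copy of the source node.
    heap = [(0.0, source, layer) for layer in sorted(layers)]
    best = {}
    while heap:
        dist, node, layer = heapq.heappop(heap)
        if (node, layer) in best:
            continue
        best[(node, layer)] = dist
        if node == target:
            return dist
        # Intra-layer steps.
        for nxt in adj[layer].get(node, ()):
            if (nxt, layer) not in best:
                heapq.heappush(heap, (dist + 1.0, nxt, layer))
        # Inter-layer steps between copies of the same node.
        for other in layers:
            if other != layer and (node, other) not in best:
                heapq.heappush(heap, (dist + switch_cost, node, other))
    return float("inf")


layers = {"bus": {("a", "b")}, "train": {("b", "c")}}
```

Setting `switch_cost=0` recovers the case in which changing layers is not counted as a step, whereas a large `switch_cost` penalizes paths that use several layers.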

*Labelled walks* [241] (i.e. *compound relations* [35]) are walks in a multiplex network in which each walk is associated with a sequence of layer labels. In such a situation, one can define a *walk length* that takes intra-layer steps into account in at least three different ways: each intra-layer step is of equal length (i.e. it does not depend on the layer), step lengths in different layers are comparable but can be weighted differently in different layers, or step lengths in different layers are incomparable. Reference [242] considered the last alternative and defined the length of a walk as a vector that counts how many steps are taken in each of the layers. They defined a path to be *Pareto efficient* if there are no paths that are better in at least one of the vector elements and are at least as good in all of the other elements. The *Pareto distance* between two nodes is then given by a set of distance vectors (rather than a scalar value) corresponding to Pareto-efficient paths.
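
The notion of a Pareto distance can be made concrete with a small label-correcting search. The sketch below keeps, for each node, the set of non-dominated step-count vectors found so far; it is a brute-force illustration that is suitable only for small networks, and it is not the algorithm of Ref. [242]. The function names and the `order` argument (which fixes the position of each layer in the vector) are assumptions.

```python
from collections import deque


def dominates(x, y):
    """True if vector x is at least as good as y everywhere and strictly better somewhere."""
    return all(a <= b for a, b in zip(x, y)) and any(a < b for a, b in zip(x, y))


def pareto_distance(layers, order, source, target):
    """Set of Pareto-efficient step-count vectors over paths from source to target."""
    adj = {}
    for i, layer in enumerate(order):
        for u, v in layers[layer]:
            adj.setdefault(u, []).append((v, i))
            adj.setdefault(v, []).append((u, i))

    zero = tuple(0 for _ in order)
    fronts = {source: {zero}}  # non-dominated vectors found so far for each node
    queue = deque([(source, zero)])
    while queue:
        node, vec = queue.popleft()
        if vec not in fronts.get(node, ()):
            continue  # this label was dominated after it was queued
        for nxt, i in adj.get(node, ()):
            # Take one more step in layer i.
            new = vec[:i] + (vec[i] + 1,) + vec[i + 1:]
            front = fronts.setdefault(nxt, set())
            if new in front or any(dominates(old, new) for old in front):
                continue
            front.difference_update({old for old in front if dominates(new, old)})
            front.add(new)
            queue.append((nxt, new))
    return fronts.get(target, set())


layers = {"A": {(1, 2), (2, 3)}, "B": {(1, 3)}}
order = ["A", "B"]
```

In this example, node 3 can be reached from node 1 either with two steps in layer A or with one step in layer B. Neither vector dominates the other, so both belong to the Pareto distance.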

As discussed in Section 4.1, a straightforward way of generalizing any network diagnostic is to first aggregate a multiplex network to obtain a monoplex network and then apply tools that have already been developed for weighted networks. This approach was taken by Bródka *et al.* [91], who formulated several ways of aggregating a network and only kept edges whose aggregated weight was above some threshold. They then defined an aggregated distance for each edge to calculate shortest paths in an aggregated network.

There are natural ways to define walks, paths and path lengths in other types of multilayer networks besides multiplex networks. For example, Sahneh *et al.* [119] decomposed all walks in an interconnected network (i.e. a node-coloured network) into classes $$(l_1,\dots ,l_N)$$: there are first $$l_1$$ intra-layer steps, then one inter-layer step, then $$l_2$$ intra-layer steps and so on. They then defined the length of a walk to be the sum of the total number of intra- and inter-layer steps. Similarly, Sun *et al.* [131] defined ‘metapaths’ (and one can similarly define ‘metawalks’) of multilayer networks that have labels on both nodes and edges as sequences of node labels and edge labels. (See Section 2.8 for a discussion of the ‘heterogeneous information networks’ that were examined by publications such as [131].) An ‘instance’ of a metawalk is then a walk in a network that respects that labelling.

#### Clustering coefficients, transitivity and triangles.

A local clustering coefficient [7,1] measures transitivity in a monoplex network. One way to define it is by using the fraction of existing adjacencies versus all possible adjacencies in (i.e. the *density* of) the neighbourhood of a node. As discussed in Section 4.2.1, the concept of neighbourhood and the existence of pairwise connections (i.e. pairwise adjacencies) both become ambiguous in multilayer networks, as there are multiple ways to define these ideas in such settings. A second way to define a clustering coefficient is by using the ratio of closed triples (i.e. triangles) to connected triples. However, triangles cannot be defined in a unique way in multilayer networks. Indeed, as discussed in Ref. [105], there are many possible triples of nodes that contain three nodes and two layers. A third alternative is to define a monoplex clustering coefficient starting from the idea of three-cycles (i.e. closed paths of length 3). There are again multiple ways to define walks and paths in multiplex networks (see Section 4.2.2), so using this perspective also requires care. Consequently, attempting to define a multilayer clustering coefficient runs into even more significant complications than trying to define weighted [243] or directed [244] clustering coefficients.

Despite the clear difficulties in defining clustering coefficients for multilayer networks (especially if one also considers directions and/or weights), there have been several attempts [90,112,92,97,137,75,129,213,245] to define a notion of multilayer clustering coefficients (and, more generally, to develop notions of transitivity for multilayer networks). Most of these definitions are for multiplex networks. The definitions in Refs. [90,92] are not based on the standard definition of the local clustering coefficient [7] for unweighted monoplex networks, and the definition in Ref. [97] is not normalized between $$0$$ and $$1$$. Criado *et al.* [112] developed generalizations using the density-based perspective of a monoplex local clustering coefficient, and they defined a pair of local clustering coefficients starting from two alternative ways of defining a neighbourhood of a node in a multiplex network. Cozzo *et al.* [137] formulated several definitions for a multiplex clustering coefficient based on different ways of defining a walk and a three-cycle in a multiplex network. They also compared the properties of their new notions and several existing notions. As discussed in Ref. [137], multiplexity induces novel notions of transitivity—because, for example, one can close a triangle either on the layer in which a length-3 path starts or on a different layer—and different notions of transitivity might be more appropriate in different situations (e.g. in social versus transportation networks). Notions such as ‘structural holes’ [246] that one can relate to local clustering coefficients [247,1] also need to be generalized to account for multilayer settings.

There have been a few attempts to define clustering coefficients for node-coloured graphs (and equivalent structures). Parshani *et al.* [213] defined an ‘inter-clustering coefficient’ for interdependent networks in which each node has an inter-layer degree of $$1$$. Reference [245] defined a ‘cross-clustering coefficient’ for arbitrary two-layer interdependent networks (i.e. for node-coloured networks in which all edges are allowed).

#### Centrality measures.

In the study of networks, a ‘centrality’ measure attempts to measure the importance of a node, an edge or some other subgraph [2,1]. It is desirable to calculate centralities in multilayer networks [248], and several monoplex-network centrality measures have now been generalized to multiplex networks. For example, one can generalize PageRank (i.e. PageRank centrality) [249], which is based on the stationary distribution of a random walker on a network (with ‘teleportation’ [250], if the network is not strongly connected) by defining a random-walk process on a multilayer network. PageRank has been generalized for undirected multiplex networks so that a random walker can either take steps inside a layer or change layers, and both nodes and layers receive a rank [89]. Halu *et al.* [205] defined an alternative multiplex version of PageRank: random walks on one network layer are biased so that they take into account the normal PageRank values of some other layer. PageRank has also been generalized to node-coloured networks with two distinct layers by considering a random walk in which intra- and inter-layer steps have different probabilities [49].

Another popular centrality measure that has been generalized for multiplex networks is hyperlink-induced topic search (HITS) [251], which produces both hub and authority scores for the nodes. The HITS algorithm can be expressed as a singular value decomposition (SVD) of the adjacency matrix of a network. This approach was utilized by Kolda *et al.* [47,46], who developed a modified version of the HITS algorithm based on the tensor-decomposition method PARAFAC (which is a generalization of the matrix SVD). Li *et al.* [88] considered a random walker in directed multiplex networks that yields hub and authority scores for nodes via an algorithm that is similar to HITS. Myers *et al.* [252] have also developed a multilayer generalization of hub and authority centrality.

Defining a random-walk process on a multilayer network yields several other centrality measures in addition to PageRank and HITS. De Domenico *et al.* [239] developed generalizations of PageRank, random-walk occupation centrality, random-walk betweenness centrality and random-walk closeness centrality based on a random-walk process for a general single-aspect multilayer network. In the same paper, they also used the tensorial framework for multilayer networks developed in Ref. [67] to define generalizations of eigenvector centrality, Katz centrality and HITS centrality. Their methods yield a centrality value for each node-layer tuple, and one can then calculate aggregate centralities for each node by summing (or aggregating in some other way) the node-layer centralities from each layer.
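
The following sketch illustrates the general idea of computing a random-walk centrality for node-layer tuples and then aggregating it over layers. It is not the definition used in any of the cited papers: it simply assumes an undirected, node-aligned multiplex network in which copies of the same node are coupled with weight `omega`, a random walker that follows links with probability proportional to their weights, and uniform teleportation. All names and parameter values are hypothetical.

```python
def multilayer_pagerank(layers, nodes, omega=1.0, damping=0.85, iterations=100):
    """PageRank-style scores for node-layer tuples and their sums over layers."""
    order = sorted(layers)
    states = [(v, a) for a in order for v in nodes]

    # Weighted out-links of each node-layer tuple.
    out = {s: {} for s in states}
    for a in order:
        for u, v in layers[a]:
            out[(u, a)][(v, a)] = out[(u, a)].get((v, a), 0.0) + 1.0
            out[(v, a)][(u, a)] = out[(v, a)].get((u, a), 0.0) + 1.0
    for v in nodes:
        for a in order:
            for b in order:
                if a != b:
                    out[(v, a)][(v, b)] = omega

    rank = {s: 1.0 / len(states) for s in states}
    for _ in range(iterations):
        new = {s: (1.0 - damping) / len(states) for s in states}
        for s, links in out.items():
            total = sum(links.values())
            if total == 0:
                # Dangling tuple: spread its rank uniformly.
                for t in states:
                    new[t] += damping * rank[s] / len(states)
            else:
                for t, w in links.items():
                    new[t] += damping * rank[s] * w / total
        rank = new

    # Aggregate node-layer scores into one value per node by summing over layers.
    node_rank = {v: sum(rank[(v, a)] for a in order) for v in nodes}
    return rank, node_rank


nodes = ["a", "b", "c"]
layers = {
    "x": {("a", "b"), ("b", "c")},
    "y": {("a", "b"), ("b", "c")},
}
rank, node_rank = multilayer_pagerank(layers, nodes)
```

Summing over layers is only one possible aggregation; other choices (e.g. taking a maximum or a weighted sum) lead to different node rankings.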

As we discussed in Section 4.1, multiplex-network projections and aggregations make it possible to apply any monoplex methods to the resulting monoplex network. Solá *et al.* [83] took this approach and defined several generalizations of eigenvector centrality. Some of their generalizations yield centrality values for node-layer tuples, whereas others only give aggregate centrality values for nodes. Solá *et al.* defined yet another eigenvector-centrality measure by constructing an $$n b \times n b$$ matrix based on the intra-layer adjacency matrices and then computing its eigenvectors. (Recall that $$n$$ is the number of nodes in each layer and that $$b$$ is the number of layers.)

Other multiplex centrality measures that have been developed include multiplex versions of geodesic node betweenness centrality [204,239] and geodesic closeness centrality [239], which can each be calculated for any definition of shortest path (see Section 4.2.2), and the centrality measure defined by Coscia *et al.* [99], which ranks individuals in social networks based on how close they are to other people with some weighted ‘skill’ set. Aguirre *et al.* [253] studied how different inter-layer connection strategies in node-coloured networks (i.e. networks of networks) affect the ratios of mean centralities of node-layer tuples from two different layers. Very recently, communicability was also generalized for multiplex networks [140].

#### Inter-layer diagnostics.

Thus far, we have almost exclusively discussed generalizations of monoplex-network quantities for multilayer networks. Indeed, most methods and diagnostics that have been developed thus far for multilayer networks are direct generalizations of monoplex quantities. However, we expect that the new ‘degrees of freedom’ (via inter-layer relationships) that result from the introduction of multiple layers will yield interesting methods and diagnostics for multilayer networks that do not have counterparts in monoplex networks. Although many papers have already illustrated various fascinating insights that arise from the ‘new physics’ of multilayer networks, it is also crucial to develop fundamentally new tools and techniques that take advantage of the structure of such networks.

One way to develop new quantities for multiplex networks is to compare the intra-layer networks of two or more layers. For example, one can compare two layers to each other by counting the number of edges that they share—this quantity was called *global overlap* in Ref. [73], and a similar quantity called the *global inter-clustering coefficient* was defined in Ref. [213]—or by calculating the correlation between (weighted) adjacency-matrix elements of a pair of layers [84]. One can calculate the *degree of multiplexity* for a multiplex network by counting the number of node pairs that have multiple edge types between them divided by the total number of adjacent node pairs [254]. (One can analogously calculate a node's degree of multiplexity by considering all pairs of a node and its neighbours.) One can also calculate correlations (e.g. via assortativity coefficients) between node diagnostics such as degree or local clustering coefficient [84,72,213,239,255,256]. We discuss models of multiplex networks with built-in inter-layer correlations in Section 4.3.
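
Two of the layer-comparison quantities described above can be computed directly from the edge sets of the layers. The sketch below assumes an undirected, unweighted multiplex network that is stored as a dictionary from layer names to edge lists; the function names are illustrative.

```python
def edge_set(edges):
    """Undirected edges as a set of unordered node pairs."""
    return {frozenset(e) for e in edges}


def global_overlap(layers, a, b):
    """Number of edges that layers a and b share."""
    return len(edge_set(layers[a]) & edge_set(layers[b]))


def degree_of_multiplexity(layers):
    """Fraction of adjacent node pairs that are connected by more than one edge type."""
    counts = {}
    for edges in layers.values():
        for pair in edge_set(edges):
            counts[pair] = counts.get(pair, 0) + 1
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c > 1) / len(counts)


layers = {
    "friend": [("a", "b"), ("b", "c")],
    "work": [("b", "a"), ("c", "d")],
}
```

Here the two layers share only the edge between `a` and `b`, so the global overlap is 1 and one of the three adjacent node pairs is multiplex.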

Apart from examinations of correlations of network structures between layers, such as the ones that we just discussed, there has only been a little bit of work on genuinely multilayer diagnostics. One example of an inherently multilayer diagnostic is the *interdependence* [240,72], which is the ratio of shortest paths in which two or more layers are used to the total number of shortest paths (see Section 4.2.2). Moreover, most existing multilayer diagnostics have been designed for node-aligned multiplex networks (i.e. networks in which all nodes exist on every layer), though there are a few exceptions. For example, Ref. [146] defined the *multiplexity degree* of a node as the number of layers in which the node exists, and Ref. [202] studied online social networks that are not node-aligned by comparing nodes with a multiplexity degree of $$1$$ to so-called ‘bridge’ nodes, which have a multiplexity degree larger than $$1$$.

If one construes different communities in a network as belonging to different layers of a layer-disjoint multilayer network (e.g. interconnected networks, node-coloured networks, etc.) [149], then one can interpret quantities like assortativity and modularity as inter-layer diagnostics.

### Models of multiplex networks

A straightforward way to construct generative ensembles of synthetic multiplex networks is to consider any monoplex network model, such as an Erdős–Rényi (ER) random-graph ensemble [257] or the configuration model [258,259,260], and use it to construct intra-layer networks that are independent of each other. One can then incorporate dependencies between layers by, for example, considering nodes that have joint degree distributions [107,78,79] or by directly adding an arbitrary number of shared edges across layers [107,108]. One can also add inter-layer correlations by starting from a multiplex network in which intra-layer networks are generated independently of each other and then changing the node identities in one of the layers (i.e. by relabelling the nodes in that layer) in order to introduce inter-layer correlations [239]. Reference [256] also developed generative models for multiplex networks with correlations between layers.
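
A minimal sketch of this construction generates independent ER layers and then relabels the nodes of one layer to introduce a positive inter-layer degree correlation. The specific relabelling rule used here (matching nodes by their degree rank in the two layers) is an assumption chosen for illustration and is not necessarily the procedure of Ref. [239]; all function names are hypothetical.

```python
import random
from itertools import combinations


def er_multiplex(n, probs, seed=None):
    """Independent ER layers on nodes 0..n-1; probs maps layer name to edge probability."""
    rng = random.Random(seed)
    return {layer: {pair for pair in combinations(range(n), 2) if rng.random() < p}
            for layer, p in probs.items()}


def degrees(edges, n):
    """Degree of each node 0..n-1 in one layer."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def relabel_layer(edges, permutation):
    """Relabel the nodes of one layer; permutation[i] is the new label of node i."""
    return {tuple(sorted((permutation[u], permutation[v]))) for u, v in edges}


def align_by_degree(layers, a, b, n):
    """Relabel layer b so that its degree ranking matches the degree ranking of layer a."""
    da, db = degrees(layers[a], n), degrees(layers[b], n)
    rank_a = sorted(range(n), key=lambda v: da[v])
    rank_b = sorted(range(n), key=lambda v: db[v])
    permutation = [0] * n
    for old, new in zip(rank_b, rank_a):
        permutation[old] = new
    result = dict(layers)
    result[b] = relabel_layer(layers[b], permutation)
    return result


n = 30
layers = er_multiplex(n, {"a": 0.2, "b": 0.2}, seed=1)
aligned = align_by_degree(layers, "a", "b", n)
```

The relabelling leaves the structure of each intra-layer network unchanged (up to node labels) and changes only how the layers are matched to each other.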

Some exponential random graph models (ERGMs) [261–263] can handle multilayer networks such as multilevel networks [110] (see Section 2.8) and multiplex networks [31,264,32]. For example, one can write the probability of any multiplex network $$G_M$$ to occur in an ERGM using the exponential form $$P(G_M)=\exp \{{\boldsymbol \theta } \cdot \textbf {f}(G_M)\}/Z({\boldsymbol \theta })$$, where $$\textbf {f}(G_M)$$ is a vector of network diagnostics (e.g. the number of triangles that contain different types of edges), $${\boldsymbol \theta }$$ is a vector that represents the model parameters and $$Z({\boldsymbol \theta })$$ is a normalization function. One application of ERGMs to multiplex networks is in a recent study of the influence and reputation of interest groups [265].
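
For very small networks, one can evaluate the exponential form above exactly by enumerating every possible network. The sketch below does this for a two-layer multiplex network on $$n$$ nodes. The choice of diagnostics in $$\textbf {f}(G_M)$$ (the number of edges in each layer and the number of edges that are present in both layers) is an assumption made to keep the example short; it is feasible only for tiny $$n$$ because the number of networks grows as $$2^{n(n-1)}$$.

```python
from itertools import combinations, product
from math import exp


def statistics(layer1, layer2):
    """f(G_M): edges in layer 1, edges in layer 2 and edges present in both layers."""
    return (len(layer1), len(layer2), len(layer1 & layer2))


def ergm_distribution(n, theta):
    """Exact ERGM probabilities of all two-layer multiplex networks on n nodes."""
    pairs = list(combinations(range(n), 2))
    networks = []
    for bits in product([0, 1], repeat=2 * len(pairs)):
        layer1 = frozenset(p for p, x in zip(pairs, bits[:len(pairs)]) if x)
        layer2 = frozenset(p for p, x in zip(pairs, bits[len(pairs):]) if x)
        networks.append((layer1, layer2))
    # Unnormalized weights exp(theta . f(G_M)) and the normalization Z(theta).
    weights = [exp(sum(t * f for t, f in zip(theta, statistics(*g)))) for g in networks]
    Z = sum(weights)
    return {g: w / Z for g, w in zip(networks, weights)}


dist = ergm_distribution(3, (0.0, 0.0, 2.0))
```

With a positive parameter for the overlap diagnostic, networks whose layers share edges are more probable than networks with the same numbers of edges but no shared edges.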

Bianconi [73,237] introduced microcanonical and canonical network ensembles [266,267] for multiplex networks. Microcanonical ensembles contain only networks that satisfy some strict set of constraints, whereas canonical network ensembles are based on maximizing Shannon entropy conditional on satisfying constraints only on average [268]. Deriving a canonical network ensemble thereby leads to exponential random graphs in single-layer networks [266]. Such ensembles were used recently to model interacting (i.e. node-coloured) spatial networks and multiplex spatial networks. (A *spatial network* is a network that is embedded in some space [269].) The latter were reported to have higher values of edge overlap (see Section 4.2.5) than multiplex networks whose component layers are independent of each other (i.e. for which the intra-layer edges are independent across layers) [212]. In their examples, the authors of Ref. [212] embedded all nodes into the same space (independent of the layer).

It is also useful to define other types of generative models for multiplex networks. For example, one can generalize monoplex-network attachment mechanisms such as preferential attachment [270,5] to multiplex networks. Criado *et al.* [112] defined a set of models in which a multiplex network grows by adding new layers such that only a random subset of nodes participates in each layer. References [255,72,200] examined multiplex-network models in which networks grow by addition of new nodes and new edges are created via preferential attachment. In these models, the adjacencies between nodes are determined by applying a preferential-attachment rule such that the probability that a new node in a layer forms an intra-layer adjacency to any other node is proportional to a function of the intra-layer degrees of that node in all of the layers. (Such a function is sometimes called an ‘attachment kernel’.) Both Nicosia *et al.* [72] and Kim and Goh [255] studied two-layer multiplex networks and attachment rules in which the attachment kernels were affine.^{25} Additionally, Nicosia *et al.* [72] allowed a node to be created at different times in different layers. Subsequently, Nicosia *et al.* considered the growth of multiplex networks via preferential attachment with nonlinear attachment kernels [271]. In the model introduced in Magnani *et al.* [200], each intra-layer network is updated via an affine preferential attachment rule [272] that only considers intra-layer node degrees from that layer or by copying edges from some other layer. All of these studies [255,72,200] found that their models can lead to positive correlations in intra-layer degrees across layers. Obviously, one can also construct multilayer versions of any other attachment mechanism.
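
The following sketch shows one simple way to grow a multiplex network with an affine attachment kernel. It is a simplified illustration rather than the exact model of Refs. [255,72,200]: each new node arrives in all layers simultaneously and, in layer $$\alpha $$, attaches to `m` distinct existing nodes chosen with probability proportional to `offset` plus a weighted sum of the target's intra-layer degrees. The matrix `coeffs`, the parameter `offset` and the initial clique are assumptions of this sketch, and the weights are assumed to be positive.

```python
import random


def grow_multiplex(n, m, coeffs, offset=1.0, seed=None):
    """Grow a multiplex network by preferential attachment with an affine kernel.

    coeffs[a][c] is the weight that the degree in layer c receives when a new
    node chooses its neighbours in layer a.
    """
    rng = random.Random(seed)
    b = len(coeffs)
    edges = [set() for _ in range(b)]
    degree = [[0] * n for _ in range(b)]

    # Start from a clique of m + 1 nodes in every layer.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            for a in range(b):
                edges[a].add((u, v))
                degree[a][u] += 1
                degree[a][v] += 1

    for new in range(m + 1, n):
        # Choose targets in every layer using the degrees before the new node arrives.
        chosen_per_layer = []
        for a in range(b):
            weights = [offset + sum(coeffs[a][c] * degree[c][v] for c in range(b))
                       for v in range(new)]
            targets = set()
            while len(targets) < m:
                targets.add(rng.choices(range(new), weights=weights)[0])
            chosen_per_layer.append(targets)
        for a, targets in enumerate(chosen_per_layer):
            for v in targets:
                edges[a].add((v, new))
                degree[a][v] += 1
                degree[a][new] += 1
    return edges, degree


edges, degree = grow_multiplex(20, 2, [[1.0, 0.5], [0.5, 1.0]], seed=3)
```

Setting the off-diagonal entries of `coeffs` to zero decouples the layers, whereas positive off-diagonal entries let a node's degree in one layer increase its attractiveness in another layer.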

Other forms of modelling are also important in the study of multiplex networks. In a social network, for example, one can posit that similarities (and dissimilarities) between different entities occur in multiple different ‘categories’ in a latent social space [273]. An observed social network is then an imperfect reflection of such ties.

### Models of interconnected networks

One can also generalize generative models for monoplex networks for other multilayer settings, such as interconnected networks. Such models are very useful for studies of dynamical processes that occur on top of multilayer networks (see Section 4.6), because they can facilitate the derivation of (approximate) analytical results for important qualitative features of the dynamics. There is now a large body of research on network models for several flavours of multilayer networks that are very similar to each other (but which use different terminology for the layers). Terminology for an individual layer includes ‘interacting network’ [56], ‘node colour’ [274–276], ‘node type’ [277,64,115] and ‘module’ [278,279]. One can also generate models of interconnected networks using notions such as blockmodels [38,280] and mixture models [281,282]. One can also generalize the BA model [5], which connects nodes to each other via a preferential-attachment rule [270], for interdependent networks [245,283]. There has also been some research on temporally co-evolving interconnected networks [284,285].

A trivial way to construct interconnected (i.e. node-coloured) networks is to use a monoplex-network model to generate intra-layer networks and to then add inter-layer edges uniformly at random in order to connect nodes from different layers. For example, this approach was taken to connect (via inter-layer edges) sets of regular lattices [286,123,287], networks produced by ER random graphs [288,289], configuration-model networks [278] and Barabási–Albert (BA) networks [290]. In previous studies of interdependent-network models, it has been common to connect a pair of networks such that the resulting inter-layer network is undirected and the degree of each node in it is either $$1$$ or $$0$$. See Section 4.6.2 for details and Section 2.5 and Refs. [151,124,152] for discussions of how one can also construe such models as models of multiplex networks. Importantly, note that inter-layer edges need not be added uniformly at random, as one can use other random (or deterministic) processes to connect networks to each other. For example, Ref. [253] studied how different inter-layer connection strategies affect the ratios of mean centralities of node-layer tuples from two different layers, and Ref. [291] examined how they affect SIR spreading (see Section 4.6.3).

A natural extension of the monoplex configuration model to interconnected (i.e. node-coloured) networks is to specify multiple degree distributions using multiple variables. As in the usual configuration model, the ends of edges (i.e. the stubs) are then connected to each other uniformly at random. References [56,115] defined a model in which $$P_{\alpha }(k_{1}, \dots , k_{b})$$ is the probability that a node on layer $$\alpha $$ has exactly $$k_{\beta }$$ neighbours in layer $$\beta $$. Söderberg [274–276] started from a multi-degree distribution $$P(k_1, \dots , k_b)$$ that is independent of the layer and a mixing matrix $${\boldsymbol \tau }$$ with elements $$\tau _{\alpha \beta }$$ that gives the relative abundances of edges between layers $$\alpha $$ and $$\beta $$. Similarly, Newman [64] defined a degree distribution $$P_{\alpha }(k)$$ for each layer and a mixing matrix similar to that in Refs. [274–276]. Gleeson [292] examined a connection-probability matrix $$\textbf {P}$$ whose elements $$P_{\alpha \beta }(k)$$ specify the probability that a node on layer $$\alpha $$ has $$k$$ edges to nodes on layer $$\beta $$. Additionally, see Ref. [293] for an ER model for node-coloured graphs (i.e. one connects a pair of nodes with colours $$\alpha $$ and $$\beta $$ with probability $$p_{\alpha \beta }$$) and Ref. [179] for the definition of a configuration model for node-coloured bipartite graphs.
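
The simplest of the models in this paragraph, an ER model for node-coloured graphs in which a pair of nodes with colours $$\alpha $$ and $$\beta $$ is connected with probability $$p_{\alpha \beta }$$, can be sketched in a few lines. The data structures (a dictionary from nodes to colours and a dictionary of probabilities keyed by colour pairs) and the function name are assumptions, and the sketch treats the network as undirected.

```python
import random
from itertools import combinations


def coloured_er(colours, p, seed=None):
    """ER random graph for node-coloured networks.

    colours maps each node to its colour (i.e. its layer), and p maps a pair of
    colours (alpha, beta) to the connection probability p_{alpha beta}.
    Missing colour pairs are treated as having probability 0.
    """
    rng = random.Random(seed)
    edges = set()
    for u, v in combinations(sorted(colours), 2):
        a, b = colours[u], colours[v]
        prob = p.get((a, b), p.get((b, a), 0.0))
        if rng.random() < prob:
            edges.add((u, v))
    return edges


colours = {0: "red", 1: "red", 2: "blue", 3: "blue"}
p = {("red", "red"): 1.0, ("blue", "blue"): 1.0, ("red", "blue"): 0.0}
edges = coloured_er(colours, p, seed=0)
```

Edges between nodes of the same colour are intra-layer edges and edges between nodes of different colours are inter-layer edges, so the diagonal and off-diagonal entries of $$p_{\alpha \beta }$$ control the two types of connectivity separately.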

One can also generate a synthetic ensemble of interconnected networks that incorporates both intra- and inter-layer degree-degree correlations by defining a model that is specified by the probability $$P_{\alpha \beta }(k,k')$$ of selecting an edge that connects a node of degree $$k$$ in layer $$\alpha $$ to a node with degree $$k'$$ in layer $$\beta $$ [149]. Similarly, one can define a somewhat more restrictive model whose input is the probability $$P_{\alpha \beta }(k_{\alpha \alpha },k_{\alpha \beta },k'_{\beta \beta },k'_{\alpha \beta })$$ of the existence of an edge between nodes in layers $$\alpha $$ and $$\beta $$ such that the node in layer $$\alpha $$ is adjacent to $$k_{\alpha \alpha }$$ nodes in layer $$\alpha $$ and $$k_{\alpha \beta }$$ nodes in layer $$\beta $$, and a node in layer $$\beta $$ is adjacent to $$k'_{\beta \beta }$$ nodes in layer $$\beta $$ and $$k'_{\alpha \beta }$$ nodes in layer $$\alpha $$ [118]. Random-graph ensembles that contain correlations like the ones that we have just described have already been useful for incorporating disease and population structures into investigations of biological epidemics on networks [224].

### Communities and other mesoscale structures

Community detection is one of the most popular topics in network science [9,10]. Roughly, the idea of community detection is to (algorithmically) find sets of nodes that are connected more densely to each other than they are to the rest of a network. Even in monoplex networks, there is no universally accepted definition of what constitutes a community (nor is it appropriate for there to be such a universal definition). Instead, there are numerous notions of a network ‘community’ (most of which are not actually definitions in a mathematical sense) and associated computational heuristics to find them. This situation is even messier in multilayer networks, for which there exist multiple viable generalizations even of concepts as simple as degree.

The earliest attempts to find structures similar to communities (as well as more general types of mesoscale features) in a multilayer network come from the social-networks literature, where *blockmodels* were used to find sets of nodes that all share similar connection patterns [2,38,294,86,295]. Blockmodelling, which includes both deterministic and stochastic varieties, differs from community detection in that blockmodels do not necessarily seek densely connected sets of nodes with sparse connections between sets but instead allow (in principle) any kind of connectivity pattern as long as most (or all) of the nodes in a block share a similar pattern. One can also find mesoscale network features by trying to assign roles (i.e. colours or labels) to nodes in a network [150,228,273]. One can also use multilayer ideas in the formulation of a stochastic blockmodel [296] or start with a monoplex network and try to assign layers to construct a multilayer-network representation [226,227].

#### Community structure in multilayer networks.

Despite the plethora of research on community structure in monoplex networks, the development of community-detection methods for multilayer networks is in its infancy. (Additionally, much more work is needed on mesoscale features other than community structure in both single-layer and multilayer networks.) Thus far, only a few community-detection methods have been generalized for multilayer networks. One of the best-known of these generalizations was formulated by Mucha *et al.* [66,101], who generalized the objective function known as ‘modularity’ [297] to the ‘multislice’ version of multilayer networks so that each node-layer tuple is assigned separately to a community. This makes it possible for the same entity to be in separate communities depending on the layer. Maximizing modularity is a computationally hard problem, and computational difficulties become even more acute in multilayer networks, as the system size now scales not only in the number of nodes but also in the number of layers. Higher-order tensors become large much faster than matrices as the number of nodes per layer increases, so computational tractability becomes increasingly challenging as one considers multilayer networks with a larger number of aspects. This challenge is especially prominent when representing a temporal network as a multilayer network, because the number of layers can be very large in such situations. To help address the computational challenges of optimizing modularity in multilayer networks, Ref. [102] developed a version of the multislice modularity-optimization method in Ref. [298] that reduces the size of a network by grouping nodes such that the reduced network does not change the optimal partition and the modularity values of all of the partitions of the reduced network are the same as those of the corresponding partitions in the original network. One can then apply modularity optimization to a smaller multilayer network and thereby consider larger multilayer networks than would otherwise be possible.
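
To make the idea of assigning each node-layer tuple to a community concrete, the following sketch evaluates a multislice modularity function for a given assignment. The specific form that it uses (a Newman–Girvan null model within each layer with resolution parameter `gamma`, and a uniform coupling `omega` between all copies of the same node) is an assumption of this sketch that follows the spirit of Refs. [66,101]; one should check it against those references before relying on it. The sketch only evaluates the quality function for undirected, unweighted layers without self-edges and does not optimize it.

```python
def multislice_modularity(layers, nodes, assignment, omega=1.0, gamma=1.0):
    """Multislice modularity of an assignment of node-layer tuples to communities.

    assignment maps each tuple (node, layer) to a community label.
    """
    order = sorted(layers)

    # Intra-layer degrees and twice the number of edges in each layer.
    k = {a: {v: 0 for v in nodes} for a in order}
    for a in order:
        for u, v in layers[a]:
            k[a][u] += 1
            k[a][v] += 1
    two_m = {a: sum(k[a].values()) for a in order}

    # Total strength: intra-layer degrees plus inter-layer couplings.
    coupling = omega * (len(order) - 1) * len(order) * len(nodes)
    two_mu = sum(two_m.values()) + coupling

    total = 0.0
    for a in order:
        # Adjacency term: each undirected edge is counted in both directions.
        for u, v in layers[a]:
            if assignment[(u, a)] == assignment[(v, a)]:
                total += 2.0
        # Null-model term for pairs of tuples in the same layer and community.
        if two_m[a] > 0:
            for u in nodes:
                for v in nodes:
                    if assignment[(u, a)] == assignment[(v, a)]:
                        total -= gamma * k[a][u] * k[a][v] / two_m[a]
    # Inter-layer term: copies of the same node that share a community.
    for v in nodes:
        for a in order:
            for b in order:
                if a != b and assignment[(v, a)] == assignment[(v, b)]:
                    total += omega
    return total / two_mu


nodes = [1, 2, 3, 4]
layers = {"s": {(1, 2), (3, 4)}, "t": {(1, 2), (3, 4)}}
assignment = {(v, a): (0 if v in (1, 2) else 1) for v in nodes for a in layers}
```

With a single layer, the inter-layer term vanishes and the function reduces to the usual monoplex modularity. Increasing `omega` rewards assignments in which copies of the same node stay in the same community across layers.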

Another challenge in optimizing multislice modularity is the construction of appropriate ‘null models’ for multilayer networks. When optimizing modularity in monoplex networks, the choice of null model quantifies what it means for a network to have sets of nodes (i.e. communities) that are connected to each other more densely than what would be expected ‘at random’. The null model is typically given by a random-graph ensemble, which specifies what it means for connections to have arisen by chance. Naturally, a larger variety of null models are possible for multilayer networks than for monoplex networks, and the choice of null model has a large effect on the results of community detection. Reference [103] developed several new ‘null-model’ network ensembles for multislice modularity, though there remains much more work to do in this direction. Optimization of multislice modularity has been used in multislice representations of temporal networks to study phenomena such as political-party reorganization [66,101], the dynamics of behavioural [299] and functional brain networks [300,301], as well as the qualitative behaviour of time-series output of coupled nonlinear oscillators [103,302]. Multislice modularity has also been used to represent ‘Kantian fractionalization’ in a multiplex network of countries, as there is a positive correlation between the value of multislice modularity and the future conflict rate between countries [303]. It is possible to use the framework of multislice networks from Refs. [66,101] to generalize community-detection methods other than modularity to multilayer networks. Reference [304] took this approach to examine *group synchronization* (which has nothing to do with the concept of ‘synchronization’ from dynamical systems), in which one tries to determine the elements of a (mathematical) group from noisy measurements of their pairwise ratios.^{26}

Another important (and old [9]) method for community detection in monoplex networks is spectral clustering. Michoel and Nachtergaele [164] generalized spectral clustering and the Perron–Frobenius theorem to hypergraphs, and they applied their method to multiplex networks by mapping them to hypergraphs (see Section 2.6). Li *et al.* [144] extended the framework of identifying ‘heavy subgraphs’ (i.e. induced subgraphs of weighted networks with high values for the sum of the edge weights) for weighted multiplex networks by defining a *recurrent heavy subgraph* as a multiplex subnetwork that is spanned by a subset of nodes and a subset of layers in a multiplex network such that the sum of edge weights in that subgraph is large. For computational reasons, Li *et al.* allowed fuzzy memberships of nodes and layers in such subgraphs.

One way to seek multiplex communities is to take advantage of monoplex community-detection methods and start by separately detecting communities in each intra-layer network. Barigozzi *et al.* [85] took this approach and started by detecting communities in each layer via modularity optimization. They examined an international trade network in which each layer corresponds to a different product category. They compared their results with communities that they found in an aggregated version of their multiplex network, and they observed considerable variation in the intra-layer communities across category layers. It thus seems that much of the information about multiplex communities can be lost by aggregating a multiplex network, which underscores the importance of developing techniques for studying such networks without having to aggregate them into a monoplex network. Berlingerio *et al.* [94] also started by detecting intra-layer communities, and they defined a multiplex community as a ‘closed frequent itemset’^{27} in which each node represents a single transaction and the items are tuples $$(c,\alpha)$$, where $$c$$ is the community to which a node is assigned in layer $$\alpha $$. In network terms, they defined a multiplex community as a set of intra-layer communities such that at least some predefined number of nodes (the so-called ‘support value’ of the community) are shared in all of those intra-layer communities. A multiplex community is only considered to be ‘valid’ when there does not exist another multiplex community with the same support value that is a superset of the first community.
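
The closed-frequent-itemset construction above can be illustrated with a brute-force sketch. This is not the algorithm of Ref. [94]: the function name `multiplex_communities`, the input layout and the exhaustive enumeration are assumptions made purely for illustration (the enumeration is exponential in the number of intra-layer communities). The sketch assumes that the minimum support is at least 1.

```python
from itertools import combinations


def multiplex_communities(memberships, min_support):
    """Return closed frequent itemsets of (community, layer) tuples.

    memberships: dict mapping node -> {layer: community label}.
    Result: dict mapping each closed itemset to the nodes that support it.
    """
    # Each node is a 'transaction' whose items are (community, layer) tuples.
    transactions = {
        node: frozenset((c, layer) for layer, c in by_layer.items())
        for node, by_layer in memberships.items()
    }
    items = sorted(set().union(*transactions.values()))
    closed = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            support = frozenset(
                n for n, t in transactions.items() if itemset <= t
            )
            if len(support) < min_support:
                continue
            # The closure is the set of items shared by all supporting nodes.
            # An itemset is closed exactly when it equals its closure, i.e.
            # when no strict superset has the same support.
            closure = frozenset.intersection(*(transactions[n] for n in support))
            if closure == itemset:
                closed[itemset] = support
    return closed


example = {
    "a": {"L1": "c1", "L2": "d1"},
    "b": {"L1": "c1", "L2": "d1"},
    "c": {"L1": "c1", "L2": "d2"},
    "d": {"L1": "c2", "L2": "d2"},
}
for itemset, nodes in multiplex_communities(example, 2).items():
    print(sorted(itemset), sorted(nodes))
```

In this toy example, the itemset that contains only community `d1` of layer `L2` is not closed, because the two nodes that support it also share community `c1` of layer `L1`.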

Obviously, one can exploit existing methods for monoplex community detection to find communities after aggregating multiplex networks into a single monoplex network [207]. Alternatively, one can examine community structure in multiplex networks by employing a network-aggregation process that considers all possible $$2^{b}$$ combinations of layers [204]. Tang *et al.* [96] identified two additional ways of using existing community-detection methods to study multiplex communities. These methods interpolate between detecting communities in an aggregated network (‘network integration’) and detecting them in individual intra-layer networks (‘partition integration’). For example, Tang *et al.* defined ‘utility integration’ as a method for calculating ‘utility matrices’ of a community-detection method for each layer separately. They then optimized an objective function for a multilayer utility matrix that they obtained by summing all of the individual-layer utility matrices. Note that the definition of a single-layer utility matrix depends on the employed community-detection method. For modularity, for example, the utility matrix is the modularity matrix $$\textbf {Q}$$ [307] of an intra-layer network. One then obtains the aggregate modularity value $$Q$$ as the sum $$Q=\sum _{ij}{Q}_{ij}$$.
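
As a minimal sketch of utility integration for the modularity case, one can sum the modularity matrices of the individual layers and then score a candidate partition with the summed matrix. The function names, the dense list-of-lists representation and the normalization of each layer by its own total edge weight are assumptions for illustration and are not taken from Ref. [96]. The sketch assumes that the layers are undirected and that each layer has at least one edge.

```python
def modularity_matrix(adj):
    """Normalized modularity matrix of one undirected layer (dense lists)."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)  # twice the total edge weight of the layer
    return [
        [(adj[i][j] - degrees[i] * degrees[j] / two_m) / two_m for j in range(n)]
        for i in range(n)
    ]


def summed_utility(layers):
    """Sum the per-layer utility (here: modularity) matrices."""
    n = len(layers[0])
    total = [[0.0] * n for _ in range(n)]
    for adj in layers:
        utility = modularity_matrix(adj)
        for i in range(n):
            for j in range(n):
                total[i][j] += utility[i][j]
    return total


def partition_quality(utility, labels):
    """Sum the utility-matrix entries over pairs of nodes in the same community."""
    n = len(labels)
    return sum(
        utility[i][j]
        for i in range(n)
        for j in range(n)
        if labels[i] == labels[j]
    )


layer_1 = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
layer_2 = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
utility = summed_utility([layer_1, layer_2])
print(partition_quality(utility, [0, 0, 1, 1]))
print(partition_quality(utility, [0, 1, 0, 1]))
```

A full method would search over partitions to maximize this quality; the sketch only evaluates two given partitions.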

One can also apply ‘inverse community detection’ to multiplex networks [50]. The basic idea behind inverse community detection is that one is given a ground-truth community structure for a multiplex network, and one then seeks an optimal linear combination of the weights $$\textbf {m}$$ to use when aggregating the layers (see Section 4.1) so that the community structure in the aggregated network corresponds as closely as possible to the ground-truth communities. Rocklin *et al.* [113] formulated more complicated ways for defining the quality of network-aggregation weights when a ground-truth community structure is available. They also defined a ‘metaclustering’ approach in which they first produced multiple different weighted networks with random aggregation weights, then clustered the resulting weighted networks, and finally calculated (using variation of information) a matrix of distances between the pairs of different clusterings. They subsequently used a hierarchical-clustering method [9] to obtain communities from this distance matrix.

Naturally, there are numerous other clustering techniques that can also be useful for examining communities in multilayer networks. For example, Ströele *et al.* have taken various approaches from a data-mining perspective [87,308,309] for clustering a multi-relational data set of scientific collaborations in Brazil.

#### Methods based on tensor decomposition.

Similar to using the SVD of an adjacency matrix to find communities in a monoplex network, one can employ tensor-decomposition methods [43] to detect communities in a multiplex network [47,42,209]. (Tensor decomposition and analysis of multiway data has a very long history [310,311].) Several different tensor-decomposition methods amount to generalizations of the SVD to tensors. CANDECOMP/PARAFAC (CP) is among the ones that can be used for multiplex community detection. It decomposes a tensor into $$R$$ factors, such that the tensor elements are approximated as $$\mathcal {A}_{uv\alpha } \approx \sum _{r=1}^{R} x_{ur}y_{vr}z_{\alpha r}$$, where $$\textbf {x},\textbf {y} \in \mathbb {R}^{n \times R}$$ and $$\textbf {z} \in \mathbb {R}^{b \times R}$$. Each factor corresponds to a community, the nodes with the highest values in the columns of $$\textbf {x}$$ and $$\textbf {y}$$ correspond to nodes in the community, and the layers with the highest values in the columns of $$\textbf {z}$$ correspond to layers in the community [47,42].
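
The following sketch illustrates how CP factors are interpreted; it does not fit a decomposition (which would require, for example, an alternating-least-squares routine that we do not show). Given factor matrices, it rebuilds the approximating tensor and reads off the nodes and layers of each factor. The function names and the use of a fixed threshold to select the ‘highest values’ are assumptions for illustration.

```python
def cp_reconstruct(x, y, z):
    """Rebuild A[u][v][a] = sum_r x[u][r] * y[v][r] * z[a][r] from CP factors."""
    n, num_factors, b = len(x), len(x[0]), len(z)
    return [
        [
            [
                sum(x[u][r] * y[v][r] * z[a][r] for r in range(num_factors))
                for a in range(b)
            ]
            for v in range(n)
        ]
        for u in range(n)
    ]


def factor_members(factor_matrix, r, threshold):
    """Indices whose loading on factor r is at least `threshold` (assumed rule)."""
    return [i for i, row in enumerate(factor_matrix) if row[r] >= threshold]


# Two factors on four nodes and two layers.
# Factor 0: nodes {0, 1}, present in both layers.
# Factor 1: nodes {2, 3}, present only in the second layer.
x = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
y = [row[:] for row in x]  # symmetric tensor, so y has the same loadings as x
z = [[1.0, 0.0], [1.0, 1.0]]

tensor = cp_reconstruct(x, y, z)
for r in range(2):
    print(r, factor_members(x, r, 0.5), factor_members(z, r, 0.5))
```

Note that the rank-one terms also produce diagonal entries (i.e. self-edges) in the reconstructed tensor, which a real fitting procedure would have to handle.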

Other tensor decomposition methods, such as three-way DEDICOM or a Tucker decomposition, can also be used to detect communities in multiplex networks [206,208]. The three-way DEDICOM is similar to a blockmodelling approach because it finds classes of nodes that have similar connection patterns to other classes. (Importantly, nodes within a class may or may not be connected densely to each other.) Gauvin *et al.* examined community structure in third-order tensors using non-negative tensor factorization [312]. Liu *et al.* used tensor-decomposition methods for so-called ‘multiview partitioning’ (in which one clusters a set of objects that contain features from multiple information sources and/or have multiple feature representations) [313]. One can then apply tensor-based clustering methods that have been developed for hypergraphs [143] to study communities in multiplex networks.

### Dynamical systems on multilayer networks

One of the main reasons to study dynamical systems on networks is to attempt to improve understanding of how nontrivial connectivity affects dynamical processes on networks (and conversely how dynamical processes affect network structure). This can be very complicated, and there are interesting new wrinkles when studying dynamics on multilayer networks. It is very important to develop a deep understanding of such dynamics as well as how to design control strategies to achieve desired outcomes.

It has been illustrated repeatedly that multilayer variants of dynamical processes that have been studied on monoplex networks exhibit behaviour that cannot be explained by examining one intra-layer network at a time or by aggregating layers. Completely new phenomena can occur, and one of the primary challenges in the study of multilayer networks is to discern how features like multiplexity affect dynamical processes. Most of the multilayer dynamical processes that have been studied thus far have been examined using familiar mathematical machinery, such as generating functions and spectral theory (and such methods are, of course, subject to similar limitations as for monoplex networks), though we expect that new methods that incorporate more ideas from tensor algebra (and from geometry) will be necessary to obtain a complete understanding of dynamical systems on multilayer networks.

#### Connected components and percolation.

In an undirected monoplex network, a *connected component* is a maximal set of nodes that are all connected to one another via some path. One can use the same definition for multilayer networks by allowing paths that include any of the possible types of edges. It is possible to use generating functions to characterize the component-size distribution for the monoplex configuration model via a mean-field approximation [1,314–316].^{28} One can take a similar approach for a configuration model defined on node-coloured graphs, for which one also specifies a joint degree distribution that indicates the number of edges that connect nodes of different colours [64,65,115,274–276,317]. (Additionally, see Section 4.4 and Refs. [293,179] for a discussion of node-coloured bipartite graphs and their unipartite projections.) Therefore, one can also calculate the component-size distribution for equivalent structures such as interacting networks [56].
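
To make the definition concrete, the following sketch computes connected components of an undirected multilayer network in which paths may use both intra-layer and inter-layer edges. Representing each node-layer pair as a tuple and using a union-find structure are our choices for illustration; the function name is hypothetical.

```python
def multilayer_components(node_layers, edges):
    """Connected components over node-layer tuples, using all edge types.

    node_layers: iterable of (node, layer) tuples.
    edges: iterable of pairs of node-layer tuples (intra- or inter-layer).
    """
    node_layers = list(node_layers)
    parent = {nl: nl for nl in node_layers}

    def find(a):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    components = {}
    for nl in node_layers:
        components.setdefault(find(nl), set()).add(nl)
    return sorted(components.values(), key=len, reverse=True)


node_layers = [(1, "A"), (2, "A"), (1, "B"), (3, "B"), (4, "B")]
edges = [
    ((1, "A"), (2, "A")),  # intra-layer edge in layer A
    ((1, "A"), (1, "B")),  # inter-layer edge between the two copies of node 1
    ((1, "B"), (3, "B")),  # intra-layer edge in layer B
]
for component in multilayer_components(node_layers, edges):
    print(sorted(component))
```

In this example, the inter-layer edge joins the two layers into one component of four node-layer tuples, and the tuple `(4, "B")` remains isolated.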

Percolation processes are among the simplest types of dynamics that can occur in a monoplex network [1], so it is natural that they have already been investigated extensively in multilayer networks. In *site percolation* (i.e. *node percolation*), one lets each node of a network be either occupied or unoccupied, and one construes occupied nodes as ‘operational’ and unoccupied nodes as ‘nonfunctional’. In *bond percolation* (i.e. *edge percolation*), it is instead the edges that are either occupied (i.e. operational) or unoccupied (i.e. nonfunctional). As with the notion of a connected component, it is straightforward to generalize concepts from percolation theory—such as the emergence (or destruction) of a giant connected component (GCC) as a function of the number of occupied nodes or edges—to a multilayer network framework. It is also important to develop good algorithms for computing the largest connected component in a multilayer network [318]. Note that it is typical to formulate percolation processes such that nodes or edges are removed from the network instead of labelling them as unoccupied. It is also often convenient to use a network diagnostic (e.g. mean degree) as a control parameter instead of the fraction of occupied nodes or edges.^{29} This perspective is very useful for percolation processes in multilayer networks.
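
A minimal sketch of site percolation is the following: each node is occupied independently with probability $$p$$, and one measures the size of the largest connected component among the occupied nodes. The function names and the adjacency-dictionary representation are assumptions for illustration. The keys can be ordinary nodes or node-layer tuples, so the same sketch applies to a multilayer network in which all edge types count for connectivity.

```python
import random


def largest_component_size(adjacency, occupied):
    """Size of the largest connected component among occupied nodes."""
    seen, best = set(), 0
    for start in occupied:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adjacency[u]:
                if v in occupied and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


def site_percolation(adjacency, p, rng):
    """Occupy each node independently with probability p; return largest component size."""
    occupied = {u for u in adjacency if rng.random() < p}
    return largest_component_size(adjacency, occupied)


# A path graph on 50 nodes as a toy substrate.
n = 50
path = {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}
rng = random.Random(0)
for p in (0.2, 0.5, 0.8, 1.0):
    print(p, site_percolation(path, p, rng))
```

To study the emergence of a GCC, one would average this quantity over many realizations and over an ensemble of networks rather than use a single path graph.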

Many scholars have now studied percolation processes on multiplex networks. For example, Refs. [319,78] studied percolation in node-aligned multiplex networks that consist of two layers of ER networks whose nodes have intra-layer degrees that are correlated across the layers. They found that negative correlations in degrees increase the percolation threshold, whereas positive correlations decrease it. Moreover, for maximally positive correlations of intra-layer degrees—i.e. when each node has exactly the same intra-layer degree—the network contains a GCC for any nonzero edge density [78]. It was later demonstrated that multiplex networks with positively correlated intra-layer degrees are also more robust with respect to ‘biconnectivity’ (i.e. by considering two nodes to be in the same component if and only if there are at least two paths between them), but that multiplex networks with negatively correlated intra-layer degrees are more resilient against attacks that are targeted based on the total degree (i.e. the sum of intra-layer degrees) of nodes [79]. In this traditional type of targeted attack, nodes are removed in order from highest degree to lowest degree, though one can of course also target nodes based on other features. Guha *et al.* [320] considered percolation on multiplex networks in which all of the layers are subgraphs of some underlying network. They determined each subgraph by selecting nodes uniformly at random from the underlying network.

#### Percolation cascades.

In the percolation processes on node-coloured networks that we discussed in Section 4.6.1, intra- and inter-layer edges are semantically equal, and a path that connects two nodes can include both types of edges. Buldyrev *et al.* [57] defined a cascade process in which intra-layer edges (called *connectivity edges*) are defined in the same way as for monoplex networks, but inter-layer edges (which are called *dependency edges*) encode dependencies between nodes. In Fig. 7(a–d), we illustrate their percolation process using a multilayer-network framework. In Fig. 7(e), we show a schematic that illustrates an intra-layer edge between a pair of nodes, which are adjacent (e.g. via dependency edges) to nodes from different components of another layer. Buldyrev *et al.* studied multilayer networks with two layers, arbitrary intra-layer degree distributions and inter-layer adjacencies that can exist between a node in one layer and its counterpart in the other layer. In their cascade process, one starts by removing a fraction $$1-p$$ (where $$p \in [0,1]$$) of the nodes uniformly at random [see panel (a)]. As we show in panel (b), one then divides the remaining nodes into disjoint sets according to their connected component in one specific layer (the proverbial ‘first’ layer). One then updates the intra-layer network of the other layer (the equally proverbial ‘second’ layer) by removing intra-layer edges between nodes that are adjacent to nodes from the first layer that are now in different components in that layer [see panel (c)]. As we show in panel (d), the cascade then continues by removing intra-layer edges in the first layer that are between nodes that depend on nodes from different components in the second layer and by updating the components of the first layer accordingly.
This process then continues, alternating between the two layers, and one divides the two networks into progressively smaller components until reaching a stationary state in which the nodes in connected components in each of the layers depend only on nodes that are in the same component in the other layer.^{30} Buldyrev *et al.* studied this process in a mean-field setting by applying a generating-function formalism recursively until they reached a steady state. They reported that the system undergoes a first-order phase transition with respect to the control parameter $$p$$. From a big-picture perspective, an important result of Ref. [57] is an illustration that interdependent networks with very heterogeneous intra-layer degree distributions can be less robust (for this type of percolation process) than interdependent networks with more homogeneous distributions, which is the opposite of what has been observed for random failures (i.e. uniformly at random) using analogous network ensembles in monoplex networks [315,314,321,1]. It was later suggested [151] that the interdependencies between two networks might actually make the transition less steep than similar transitions in ordinary percolation for some networks, although the transition is discontinuous if the intra-layer networks are ER graphs. See Refs. [322,323] for further discussion and debates on the issue of steepness, which has been controversial. The cascade process that we described above has received considerable attention in the last few years [324]. For example, it has been extended to consider attacks on nodes that are targeted by degree (rather than the failures of nodes determined uniformly at random) [325,326,289].
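
The alternating edge-removal procedure described above can be sketched as follows. This is a direct simulation on a small finite network rather than the mean-field generating-function calculation of Ref. [57]. We assume (for illustration only) that the two layers share integer node labels, that node $$i$$ in one layer depends on node $$i$$ in the other layer, and that the initially removed nodes are supplied explicitly; the function names are hypothetical.

```python
def component_labels(nodes, edges):
    """Label each node by a representative of its connected component."""
    adjacency = {u: set() for u in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    label = {}
    for start in sorted(nodes):
        if start in label:
            continue
        label[start] = start
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adjacency[u]:
                if v not in label:
                    label[v] = start
                    stack.append(v)
    return label


def interdependent_cascade(nodes, edges_first, edges_second, removed):
    """Alternate between layers until no further intra-layer edge is removed."""
    alive = set(nodes) - set(removed)
    layers = [
        {(u, v) for u, v in edges_first if u in alive and v in alive},
        {(u, v) for u, v in edges_second if u in alive and v in alive},
    ]
    current, quiet_steps = 0, 0
    # Stop after one full pass over both layers in which nothing changes.
    while quiet_steps < 2:
        labels = component_labels(alive, layers[current])
        other = 1 - current
        # Keep an edge of the other layer only if its end points depend on
        # nodes that lie in the same component of the current layer.
        kept = {(u, v) for u, v in layers[other] if labels[u] == labels[v]}
        quiet_steps = quiet_steps + 1 if kept == layers[other] else 0
        layers[other] = kept
        current = other
    groups = {}
    for node, rep in component_labels(alive, layers[0]).items():
        groups.setdefault(rep, []).append(node)
    return sorted(sorted(group) for group in groups.values())


first = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
second = [(0, 1), (1, 3), (3, 4), (0, 5), (2, 4)]
print(interdependent_cascade(range(6), first, second, removed=[2]))
```

In the stationary state, the components of the two layers coincide as sets of nodes, so it suffices to report the components of the first layer.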

The percolation mechanism that was studied in Ref. [57] has been generalized so that some fraction $$1-q$$ of the nodes in one layer are not dependent on any nodes (i.e. they are ‘independent’ nodes) in the other layer [120]. In this situation, it was reported that there exists a critical fraction of dependency edges between nodes from different layers (i.e. a critical fraction of ‘interdependent node pairs’), below which the percolation transition is second-order (i.e. continuous) and above which it is first-order (i.e. discontinuous) when the intra-layer networks are ER networks [120] (see Fig. 7(f)). If the intra-layer networks are produced from a configuration model with a power-law degree distribution, there is also a region of parameter space that exhibits interesting behaviour that has been described as a ‘hybrid’ transition [125]. In this regime, the size of the mutual GCC (which contains nodes from each layer) jumps discontinuously at a finite critical value of the percolation parameter $$p$$ to a very small but nonzero value and there is also a continuous transition in the size of the mutual GCC as $$p \rightarrow 0$$. Another situation in which percolation has been studied in interdependent networks includes a ‘support-dependence relationship’, in which nodes are incident to more than one inter-layer edge and a node is removed only when all of its inter-layer neighbours are nonfunctional (i.e. have been removed in the percolation process) [327]. References [328–332] considered a two-layer node-aligned multiplex network in which one layer has connectivity edges and the other has dependency edges. Additionally, Refs. [333,334] examined strategies for choosing which nodes should not be interdependent (i.e. which nodes should be ‘autonomous’) for a network to be robust to failures.

Reference [57] included analytical calculations for two-layer networks in which both intra-layer networks were either ER networks or configuration-model networks with a power-law degree distribution. They considered interdependent networks by placing inter-layer edges uniformly at random between the two layers. In the past few years, there has been a great deal of interest in studying interdependent networks using a variety of assumptions about the structure of the intra- and inter-layer networks. It has been reported that abandoning the assumption of placing inter-layer edges uniformly at random can make interdependent networks more robust to random (i.e. uniformly at random) failures and that the percolation phase transition can be first-order instead of second-order. This is the case, for example, if the intra-layer degrees of interdependent node pairs are either exactly the same [335] or positively correlated with each other [213], if numerous pairs of interdependent node pairs are adjacent to each other in both layers (so that there are many four-cycles that consist of alternating inter- and intra-layer edges) [213],^{31} or if one includes a control parameter that creates some interdependent node pairs that are guaranteed to consist of nodes with high intra-layer degree (i.e. some pairs of interdependent nodes must include high-degree nodes from each layer) [336]. Reference [337] considered interdependent networks with degree correlations (i.e. degree assortativity) in intra-layer networks, Ref. [338] examined interdependent networks with both inter- and intra-layer degree correlations, and Refs. [339,340] studied percolation cascades in interdependent networks in which intra-layer networks are produced using monoplex random-graph models with clustering (in particular, by using the degree-triangle model [341]). One can also study percolation cascades using any number of networks as well as with arbitrary types of coupling between component networks.
In such situations, a cascade is occurring on a so-called *network of networks* [126,153,288,289,342–344].

One of the motivations of the cascading-failure model in Ref. [57] was to try to help explain failures on spatially-embedded infrastructure networks. Consequently, in subsequent studies of percolation on interdependent networks, there have been several attempts to take spatial constraints into account in the choices of the intra-layer networks. For example, Li *et al.* [286] studied square lattices with a tunable parameter to describe distances between nodes in an interdependent pair. They reported that there is a critical distance, such that the transition is second-order below this distance and first-order above it. (They measured the distance based on the lattice structure that is used to determine the intra-layer adjacencies; it depends only on the node coordinate in a node-layer tuple.) References [123,345] extended the model by Li *et al.* by allowing nodes that are not interdependent. For example, Bashan *et al.* [123] examined two-layer networks in which each layer is a two-dimensional spatial network (i.e. a network that is embedded in two dimensions). They used equal-sized square lattices for their spatial networks and supposed for each layer that some fraction $$q >0$$ of the nodes (i.e. the same fraction for each layer) are dependent on a node that is selected uniformly at random from the other layer. They reported that this situation yields a first-order phase transition for a giant component in the network. This contrasts with the second-order transition that occurs when one considers analogous coupling when the layers are instead ER networks. Shekhtman *et al.* [346] examined percolation on a network of interdependent spatially-embedded networks. Reference [287] included the possibility for networks to ‘heal’ (by adding new intra-network edges) after each cascade step, and Ref. [347] studied two identical random regular graphs in which there is an upper limit to the intra-layer shortest path between a pair of interdependent nodes. 
Additionally, Berezin *et al.* [348] examined a targeted-attack percolation problem (which they called ‘localized attack’) in interdependent spatial networks. In their investigation, the initial node failures are localized geographically.

One can define a cascading-failure process in a multiplex network so that two nodes are considered to be in the same *mutually-connected component* (i.e. they form a so-called *viable cluster*) if there is an intra-layer path between them in all of the intra-layer networks [151,124,152]. This process is equivalent to a cascade process in an interdependent network in which nodes that are adjacent to each other via interdependency edges can be merged to create a single node in a multiplex network [124,152]. (This occurs, for example, when an interdependent network has two layers and each node has exactly one undirected inter-layer dependency edge.) One can also use a similar approach to map a cascading-failure process in an interdependent network in which only some fraction of nodes in each layer are dependent on nodes in the other layer to a cascading-failure process on a multiplex network [152]. Bianconi *et al.* [153] reported that one can achieve a similar reduction to finding a mutually-connected component on a multiplex network for cascading failures on a network of networks that is diagonal and layer-coupled (see Section 2.1) as long as, for every pair of layers, there exists a path of dependency edges between a node in one layer and a node in the other layer. Cellai *et al.* [74] studied the emergence of a *giant viable cluster* (i.e. a *mutually-connected giant component*) for multiplex networks with overlap, and Hu *et al.* [349] examined a similar phenomenon. (See Section 4.2.5 for a discussion of overlap between layers in multiplex networks.)

Parshani *et al.* defined a similar cascade process for multiplex networks in which nodes can be adjacent to each other via connectivity edges and/or dependency edges [329]. In their node-percolation process, nodes are labelled as unoccupied if they are not in the GCC that is formed by occupied nodes of the connectivity layer (i.e. a layer whose edges are connectivity edges). One then labels nodes as unoccupied if they are adjacent to an unoccupied node in the dependency layer (i.e. a layer whose edges are dependency edges). One then repeats these two processes sequentially until one reaches a stationary state. Parshani *et al.* placed a configuration-model network in the connectivity layer; in the dependency layer, they placed a network in which a fraction $$q$$ of the nodes have degree $$1$$ and the remainder of the nodes have degree $$0$$.^{32} Prior to starting the above sequential percolation process, they initially removed a fraction $$p$$ of the nodes uniformly at random. In their study, Parshani *et al.* derived a transcendental equation for the critical value $$q_{\mathrm {c}}$$ of the parameter $$q$$ (which, analogously to the phase diagram in Fig. 7(f) for a similar process, has a corresponding critical value $$p_{\mathrm {c}}$$ of the parameter $$p$$), and they obtained a closed-form solution for $$q_{\mathrm {c}}$$ and $$p_{\mathrm {c}}$$ when they used an ER network in the connectivity layer. For the ER example, they showed that the phase transition for the GCC size as a function of $$p$$ is second-order for $$q < q_{\mathrm {c}}$$ but first-order for $$q >q_{\mathrm {c}}$$. One can alternatively fix $$p$$ and use $$q$$ as a control parameter. Reference [332] examined the same cascade process on a network in which one can adjust the overlap between dependency edges and connectivity edges. They reported that increasing the overlap fraction can reduce the vulnerability of a network to the removal of nodes uniformly at random.
Recently, Hu *et al.* used a similar percolation-cascade model to examine conditions for viral influence on correlated multiplex networks [350].
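
The two alternating steps of the connectivity/dependency cascade of Ref. [329] that we described above can be sketched on a finite network as follows. On a finite network, we take the ‘GCC’ to be the largest connected component (with ties broken arbitrarily), which is an assumption of this sketch. The function names and the adjacency-dictionary inputs are also assumptions for illustration, and the initially removed nodes are supplied explicitly instead of being drawn at random.

```python
def largest_component(adjacency, occupied):
    """Node set of the largest connected component among occupied nodes."""
    best, seen = set(), set()
    for start in sorted(occupied):
        if start in seen:
            continue
        seen.add(start)
        component, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in adjacency[u]:
                if v in occupied and v not in seen:
                    seen.add(v)
                    component.add(v)
                    stack.append(v)
        if len(component) > len(best):
            best = component
    return best


def connectivity_dependency_cascade(connectivity, dependency, initially_removed):
    """Iterate the two removal steps until the occupied set stops changing."""
    occupied = set(connectivity) - set(initially_removed)
    while True:
        # Step 1: only nodes in the largest connectivity component survive.
        survivors = largest_component(connectivity, occupied)
        # Step 2: a node fails if any dependency-layer neighbour is unoccupied.
        survivors = {
            u for u in survivors
            if all(v in survivors for v in dependency.get(u, ()))
        }
        if survivors == occupied:
            return occupied
        occupied = survivors


# Connectivity layer: a path 0-1-...-6. Dependency layer: nodes 2 and 6 are paired.
connectivity = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 6} for i in range(7)}
dependency = {2: {6}, 6: {2}}
print(sorted(connectivity_dependency_cascade(connectivity, dependency, [6])))
```

In this example, the removal of node 6 causes node 2 to fail through the dependency edge, which splits the path; only the larger fragment survives.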

Very recently, Min and Goh [351] examined a variant of the notion of a viable cluster using multiplex networks. In their setting, some fraction of the nodes are ‘source’ nodes. A node is ‘viable’ if, in each layer, there exists an intra-layer path of viable nodes that connect that node to a source node. The mean final fraction (over an ensemble of networks) of viable nodes is called the system's *viability*, which depends both on the network's initial state and on the updating procedure. Note that one specifies the ‘state’ of a network by specifying both its structure (i.e. its adjacencies) and the state of each of its nodes. One way to reach a stationary state of the network is via a ‘cascade of activations’: one first considers only source nodes as viable and then recursively sets unviable nodes to be viable if they meet the viability condition. One can reach a stationary state with a different value of the viability via a ‘cascade of deactivations’: one first considers all nodes as viable and then recursively sets nodes that do not satisfy the viability condition as unviable. The system exhibits hysteresis, which one can observe by plotting the viability $$V$$ versus a network diagnostic (such as mean intra-layer degree), so one might need to add more edges to the system than were previously removed to be able to recover from the random (e.g. uniformly at random) failure of edges. In the limit as the number of source nodes becomes $$0$$, the process studied in Ref. [351] reduces to the examination of mutually-connected giant components (i.e. the standard notion of viable clusters). Baxter *et al.* [352] independently studied processes that are akin to those in [351]. In fact, the cascade of activations that we described above amounts to the ‘weak bootstrap percolation’ that was examined in Ref. [352] in the context of multiplex networks.
In weak bootstrap percolation, some fraction of nodes are initially set to be invulnerable, all invulnerable nodes are considered to be active and the rest of the nodes are inactive. One then recursively sets inactive nodes to be active if they are adjacent to at least one active node in each layer. The ‘weak pruning percolation’ of Ref. [352] is similar to the cascade of deactivations, but it is not exactly the same.
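
The activation rule of weak bootstrap percolation stated above can be sketched as follows. The function name, the representation of each layer as an adjacency dictionary and the synchronous update order are assumptions for illustration. Because activations are never undone, the final active set does not depend on the update order.

```python
def weak_bootstrap_percolation(layers, seeds):
    """Activate nodes that have at least one active neighbour in every layer.

    layers: list of dicts mapping node -> set of intra-layer neighbours.
    seeds: initially active (invulnerable) nodes.
    """
    nodes = set(seeds)
    for layer in layers:
        nodes.update(layer)
    active = set(seeds)
    while True:
        newly_active = {
            node
            for node in nodes - active
            if all(
                any(nb in active for nb in layer.get(node, ()))
                for layer in layers
            )
        }
        if not newly_active:
            return active
        active |= newly_active


layer_1 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}  # path 0-1-2-3-4
layer_2 = {0: {1, 2}, 1: {0, 3}, 2: {0}, 3: {1}}  # node 4 is isolated here
print(sorted(weak_bootstrap_percolation([layer_1, layer_2], seeds={0})))
```

In this example, node 4 never becomes active because it has no neighbour in the second layer.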

#### Compartmental spreading models and diffusion.

It is traditional to study spreading processes using compartmental models on completely mixed populations [353,354]. In such a model, each compartment describes a state (e.g. ‘susceptible’, ‘infected’ or ‘recovered’), and there are parameters that represent transition rates for changing states. However, disease-spreading and information-spreading processes typically take place on networks, so it is important to study the effects of network structure on such processes [1,355]. One places a compartmental model on a network, so each node can be in one of several epidemic states (e.g. ‘susceptible’ or ‘infected’), and the nodes have either continuous or discrete update rules that govern how the states change. Clearly, it is also important to investigate diffusion and other spreading processes on multilayer networks [356].
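
As one illustration of placing a compartmental model on a network, the following sketch implements a discrete-time, synchronous update rule with ‘susceptible’, ‘infected’ and ‘recovered’ states. The per-step infection probability `beta`, the per-step recovery probability `gamma` and the function names are assumptions for illustration; many other update rules (e.g. continuous-time ones) are possible.

```python
import random


def sir_step(adjacency, state, beta, gamma, rng):
    """One synchronous update: infected nodes infect neighbours, then may recover."""
    new_state = dict(state)
    for node, status in state.items():
        if status != "I":
            continue
        for neighbour in adjacency[node]:
            # Each infected node independently tries to infect each
            # susceptible neighbour with probability beta.
            if state[neighbour] == "S" and rng.random() < beta:
                new_state[neighbour] = "I"
        if rng.random() < gamma:
            new_state[node] = "R"
    return new_state


def run_sir(adjacency, seeds, beta, gamma, rng, max_steps=10_000):
    """Run until no infected node remains (or max_steps is reached)."""
    state = {node: "S" for node in adjacency}
    for seed in seeds:
        state[seed] = "I"
    for _ in range(max_steps):
        if "I" not in state.values():
            break
        state = sir_step(adjacency, state, beta, gamma, rng)
    return state


n = 30
ring = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
final = run_sir(ring, seeds=[0], beta=0.6, gamma=0.5, rng=random.Random(42))
print(sum(1 for status in final.values() if status == "R"))
```

The adjacency dictionary can equally well describe a multilayer network (e.g. with node-layer tuples as keys), although layer-dependent infection rates would require a small extension.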

The susceptible-infected-recovered (SIR) model^{33} is one of the simplest and most popular compartmental models, and its steady state can be related to a bond-percolation process [357,358]. (Note, however, that one needs to be careful about the precise connection between SIR dynamics and percolation [359,360].) For many large-scale spreading processes, one important step towards making models more realistic is to study them on a *metapopulation* structure instead of on a single network. In such a structure, each node is a completely mixed population that is adjacent to other populations via the edges in a network. Epidemic processes on networks of networks (i.e. interconnected networks, node-coloured networks etc.) are natural extensions of completely mixed metapopulation models [116,114]. Several authors have studied an SIR model on two-layer interconnected networks. For example, focal ideas have included the correlations of the intra-layer degrees of nodes that are adjacent across layers [291], the strength of coupling (i.e. the number of inter-layer edges) between the two networks [116], and the identification of influential spreaders by computing $$k$$-shells [361].

SIR models have also been studied on multiplex networks. For example, Min *et al.* [77] studied SIR spreading on various types of two-layer multiplex networks [78,79] with a layer-crossing *overhead*. Such an overhead is a cost for an infection (or information) to change layers (see Fig. 8(e)), and this idea is very important for multiplex networks (and other multilayer networks) in general [137]. References [78,79] illustrated that there are spreading processes on multiplex networks that cannot be reduced to such processes on aggregated networks. Shai *et al.* [362] demonstrated that an SIR model on positively degree-correlated multiplex networks with two layers can result in a lower epidemic threshold than on negatively degree-correlated networks (a similar result was later reported by Ref. [363]), but that the opposite can be true for a ‘constrained’ SIR model in which each node is only allowed to interact with a random subset of its neighbours. (For each time step, the nodes in this subset are determined uniformly at random, and the subsets at different time steps are independent of each other.) They considered networks in which each layer is an ER graph as well as ones in which each layer is a BA network. SIR dynamics have also been studied on multiplex networks that are not node-aligned. For example, Yağan *et al.* [127] considered a network that contains one ‘physical’ layer that includes all nodes and other layers that contain only some subset of these nodes. Buono *et al.* [364] examined SIR dynamics on a network that contains two layers with an equal number of nodes, but only some fraction of the nodes exist in both layers.^{34}

Jo *et al.* [59] defined a variant of the standard SIR model on a two-layer multiplex network in which normal infection spreading occurs in one layer and information spreading occurs in the other layer. They allowed information spreading to immunize nodes, which could then proceed directly from the ‘S’ compartment to the ‘R’ compartment. Funk *et al.* [365] also investigated an SIR model that was coupled to information-spreading dynamics, and a subset of the same authors subsequently examined the interplay between ‘susceptible-infected-recovered-susceptible’ (SIRS) epidemics and information spread [366]. (See Ref. [367] for a review article on modelling the influence of human behaviour on the spread of diseases.) In their model, better-informed nodes have a reduced susceptibility. Funk and Jansen [107] studied the interplay of two pathogens that spread according to SIR processes on two-layer networks that can have arbitrary inter-layer degree correlations and overlaps. In their model, nodes that have experienced the first epidemic (which occurs on the ‘first’ layer) have a reduced susceptibility to the second pathogen (which occurs on the ‘second’ layer). We illustrate the mechanism that they studied in Fig. 8(a–d). They illustrated that epidemic sizes are smaller in the second layer when there are positive inter-layer degree correlations than is the case if there are no such correlations or negative correlations. They also considered scenarios in which either epidemic can render the nodes immune to the other epidemic, but they always assumed that the epidemics are not taking place concurrently. Marceau *et al.* [108] built on this work and introduced an SIR model in which the two epidemics unfold simultaneously and can interact dynamically instead of only ‘sequentially’.

The susceptible-infected-susceptible (SIS) spreading model has been studied both on interconnected networks with a given joint degree distribution between layers [118] and for more general situations (e.g. in which the node sets are disjoint, but all inter- and intra-layer adjacencies are allowed) [119] by generalizing the result for monoplex networks [368]—though one needs to be very careful about how such ‘results’ have been stated in the literature [369]—that the epidemic threshold is given approximately by the inverse of the spectral radius of the contact network's adjacency matrix. Reference [147] also started from the expression that relates an adjacency matrix's spectral radius to the epidemic threshold of an SIS spreading model. However, they took a perturbative approach: they assumed that the spectral radii of the intra-layer and inter-layer supra-adjacency matrices are both known, and they studied the effect of network structure on the spreading process by deriving perturbation approximations for the spectral radius of the supra-adjacency matrix for the entire interconnected network (which we recall is equal to the sum of the intra-layer and inter-layer supra-adjacency matrices). An SIS spreading process was studied using a contact-contagion formulation by Cozzo *et al.* [82], who calculated an epidemic threshold that corresponds to the inverse of the spectral radius of the supra-matrix for the contact probabilities. All of the above studies of SIS models noted a decrease in the epidemic threshold upon the introduction of inter-layer adjacencies.
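The spectral-radius estimate of the threshold is easy to explore numerically. The following dependency-free Python sketch builds the supra-adjacency matrix of a small node-aligned multiplex network and estimates the SIS threshold as the inverse of its spectral radius. The function names, the uniform diagonal coupling weight `omega` between every pair of layers, and the use of shifted power iteration are our illustrative assumptions rather than anything prescribed by the references cited above.

```python
def supra_adjacency(layers, omega):
    """Supra-adjacency matrix of a node-aligned multiplex network.

    layers: list of n-by-n symmetric adjacency matrices (lists of lists)
    omega:  weight of the inter-layer edge between a node and its counterpart
            in every other layer (an illustrative, uniform choice)
    """
    n = len(layers[0])
    size = n * len(layers)
    supra = [[0.0] * size for _ in range(size)]
    for a, adj in enumerate(layers):
        for u in range(n):
            for v in range(n):
                supra[a * n + u][a * n + v] = float(adj[u][v])  # intra-layer block
            for b in range(len(layers)):
                if b != a:
                    supra[a * n + u][b * n + u] = float(omega)  # inter-layer block
    return supra


def spectral_radius(matrix, iterations=500):
    """Largest eigenvalue of a symmetric matrix with non-negative entries.

    Power iteration is applied to (matrix + I); the shift avoids the
    oscillation that plain power iteration shows on bipartite graphs.
    """
    size = len(matrix)
    x = [1.0 / size ** 0.5] * size
    for _ in range(iterations):
        y = [x[i] + sum(matrix[i][j] * x[j] for j in range(size)) for i in range(size)]
        norm = sum(value * value for value in y) ** 0.5
        x = [value / norm for value in y]
    # Rayleigh quotient of the (unit-norm) iterate with the unshifted matrix.
    ax = [sum(matrix[i][j] * x[j] for j in range(size)) for i in range(size)]
    return sum(x[i] * ax[i] for i in range(size))


def sis_threshold(layers, omega):
    """Approximate SIS epidemic threshold: inverse spectral radius."""
    return 1.0 / spectral_radius(supra_adjacency(layers, omega))


triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(1.0 / spectral_radius(triangle), sis_threshold([triangle, triangle], 0.5))
```

For two triangle layers coupled with weight 0.5, the spectral radius of the supra-adjacency matrix is 2.5, so the estimated threshold falls from 1/2 for an isolated layer to 0.4 for the coupled system. This is consistent with the decrease in threshold that the studies above report when inter-layer adjacencies are introduced.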

Several other authors have also examined SIS dynamics and generalized versions of such dynamics. For example, some studies have considered SIS models in which two competing infections spread on two layers of a multiplex network [109,370]. Sanz *et al.* [371] studied interacting epidemics using an extension of the SIS model in which the transitions from infection to recovery and from recovery to infection in one disease depend on the state of each node with respect to the other disease. Additionally, Ref. [372] used SIS dynamics to study malware spreading on a multiplex network in which a node can only be in a single state across all layers (i.e. the inter-layer spreading is infinitely fast) but the spreading process occurs independently (with different contagion rates) on each of the intra-layer networks.

Reference [373] studied interacting SIS dynamics on multiplex networks in which one layer has literal SIS dynamics on real social contacts and the other layer has SIS-like awareness dynamics (of information about the disease) on virtual social contacts. Similarly, Sahneh *et al.* [374] studied an SAIS model, which includes an additional ‘alert’ state, on a multiplex network in which one layer is responsible for the spreading of an infection and the other is responsible for information dissemination. They also developed a mean-field model that can handle an arbitrary number of node states and layers [375]. Lima *et al.* [376] considered various types of disease-spreading processes that are coupled to information-spreading processes and incorporated human-mobility and mobile-phone data from a large set of individuals in the Ivory Coast. New investigations of interactions between (simple models of) epidemic processes and (simple models of) social processes are appearing very frequently [377–379]. Some scholars are also examining interactions between epidemic processes and network structure. For example, Shai *et al.* [380] studied a setting in which SIS dynamics are coupled to a process that rewires intra-layer edges between susceptible and infected nodes on an interconnected (i.e. node-coloured) network.

Gómez *et al.* [136] examined diffusion in node-aligned multiplex networks. One can examine such a diffusion process analytically by calculating the eigenvalues of a combinatorial supra-Laplacian matrix. Gómez *et al.* derived the asymptotic behaviour of the eigenvalues of the combinatorial supra-Laplacian of a two-layer network for both strong coupling and weak coupling between layers. (Two layers are ‘strongly coupled’ if the weights of the inter-layer edges are large, and they are ‘weakly coupled’ if those weights are small.) De Domenico *et al.* [69] introduced several types of random walks on multiplex networks with heterogeneous coupling strengths between nodes. In Fig. 8(f), we illustrate an example of a random walk on a multiplex network. De Domenico *et al.* demonstrated that the time that is needed for a random walker to reach most of the nodes in a multiplex network depends on the topology of the intra-layer networks, the inter-layer connection strengths, and the type of the random walk. This time can either lie between the times that are necessary to cover each of the layers separately (i.e. the walk is ‘intra-diffusive’) or it can be smaller than each of those times (i.e. the walk is ‘super-diffusive’).

The results in Ref. [148] on the algebraic connectivity of the combinatorial supra-Laplacian also have important implications for diffusion on multiplex networks, as how closely the separate layers interact can be very important for diffusive processes (as well as for other dynamics). Reference [381] studied diffusion in ‘weakly-coupled’ interconnected networks in which there are only a few inter-layer edges. They were able to separate the diffusion into a fast process that takes place inside of the layers and a slow process that takes place between layers.
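The separation between fast intra-layer diffusion and slow inter-layer diffusion can be seen in a direct simulation. The following Python sketch, which is an illustration under our own assumptions rather than the analysis of Refs. [136,381], takes forward-Euler steps of linear diffusion on a node-aligned multiplex network in which each node is coupled to its counterparts with a uniform weight. The names `diffusion_step`, `coupling` and `dt` are hypothetical.

```python
def diffusion_step(x, layers, coupling, dt):
    """One forward-Euler step of linear diffusion on a node-aligned multiplex.

    x:        list of per-layer state vectors, so x[a][u] is the amount at
              node u in layer a
    layers:   list of symmetric adjacency matrices (one per layer)
    coupling: weight of the inter-layer edge between a node and each counterpart
    dt:       time step (small enough for the explicit scheme to be stable)
    """
    n = len(x[0])
    new_x = []
    for a, adj in enumerate(layers):
        row = []
        for u in range(n):
            # Flow along intra-layer edges of layer a.
            intra = sum(adj[u][v] * (x[a][v] - x[a][u]) for v in range(n))
            # Flow along inter-layer edges to the same node in other layers.
            inter = sum(coupling * (x[b][u] - x[a][u])
                        for b in range(len(layers)) if b != a)
            row.append(x[a][u] + dt * (intra + inter))
        new_x.append(row)
    return new_x


# Hypothetical example: a 4-node ring in one layer and a 4-node path in the other.
ring = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
path = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
state = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
for _ in range(100):
    state = diffusion_step(state, [ring, path], coupling=0.01, dt=0.05)
print(sum(state[0]), sum(state[1]))
```

For two layers with symmetric intra-layer adjacencies, the difference between the total amounts in the two layers shrinks by exactly the factor `1 - 2 * coupling * dt` per step, whatever the intra-layer topology. With weak coupling, the state therefore becomes nearly uniform inside each layer long before the layers equilibrate with each other.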

The susceptible-infected (SI) model is a percolation process in which infected nodes stay that way forever [1]. Reference [382] studied a variant of the SI model in which there are several different infections (i.e. ‘lexical innovations’) that compete against each other in a node-coloured network in which one layer represents the media and the other represents a social network. Another type of spreading model, which can be used for applications such as information spreading and behaviour adoption, is a so-called ‘complex contagion’ [383]. Such processes have also been studied on multiplex networks (see Section 4.6.5).

#### Coupled-cell networks.

It is possible to classify the temporal evolution of coupled systems of ordinary differential equations by using *coupled-cell networks* [384,182,181] (see Section 2.8), which can exhibit rich (and sometimes rather surprising) bifurcations and have received a lot of attention in the dynamical-systems community. In coupled-cell networks, a node is associated with a dynamical system, and two nodes have the same ‘colour’ if they have the same state space and an identical dynamical system. The edges (and hyperedges) in these networks represent the couplings between the dynamical systems, and two edges have the same ‘colour’ if the couplings are equivalent. If two nodes have the same colour and the edges that are incident to those nodes all have the same colour, then the differential equations at each node are identical. Additionally, for a given network, the formalism of coupled-cell networks automatically gives a set of ‘admissible’ vector fields. (That is, which vector fields are sensible to consider is a natural byproduct of the structure that has been defined.) A coupled-cell network is *homogeneous* if all of the nodes have the same colour and also have the same number of incoming edges of each colour (i.e. it is a special case of a multiplex network), and it is *regular* if all of the edges have the same colour (i.e. it is a monoplex network).

Most studies of coupled-cell networks have been concerned with systems that have a small number of nodes. A major thread has entailed the derivation of generic results that relate the qualitative behaviour of a dynamical system to the topology of the corresponding coupled-cell network [385]. For a monoplex network, synchrony-breaking bifurcations in coupled-cell networks are related to the eigenspace of the network's adjacency matrix. Similarly, the structure of synchrony-breaking bifurcations in a homogeneous coupled-cell network constructed from a Cartesian product of two graphs is related to a tensor product of the eigenspaces of the adjacency matrices of the two original graphs [386].

The careful mathematical study of coupled-cell networks has also yielded many interesting (and mathematically rigorous) results in addition to the aforementioned bifurcation phenomena. Indeed, in order to conduct a generic bifurcation analysis in a coupled-cell network, it is first necessary to classify its ‘patterns of synchrony’^{35} [181,182]. Perturbing the system leads to a new system with the same pattern of synchrony if and only if the colouring satisfies a certain combinatorial condition (which does not depend explicitly on the vector field under consideration) [182]. More generally, the formalism of coupled-cell networks has led (often after many years of effort [388]) to results such as a mathematically rigorous classification of ‘phase-shift synchrony’^{36} of periodic solutions in (admissible) dynamical systems (i.e. without needing to worry about the specific form of a vector field) in terms of combinatorial conditions on network structure.

As we have illustrated above, the study of coupled-cell networks has been very useful for illuminating rich bifurcation phenomena, and additional studies of such phenomena—as well as spiritually similar studies of rich, generic phase-transition phenomena that can occur in multilayer networks [148,141]—are important research directions.

#### Other types of dynamical systems.

Obviously, there are numerous types of dynamics that can occur on multilayer networks. In particular, popular processes like percolation cascades and compartmental spreading models are not the only ones that have received attention. Multilayer structures have important consequences for dynamical processes in general, and those consequences (and the magnitude of their importance) can differ for different processes.

The impact of multilayer organization has been studied for a wide variety of dynamical processes. For example, it has been reported that multiplex random Boolean networks can be stable for parameter values for which a single layer of a multiplex network in isolation would be unstable [80].^{37} Reference [389] examined ecological dynamics on a multilayer network that includes both social and landscape effects, and voter models have also been studied recently on multilayer networks [390]. Additionally, several papers have considered models from game theory. In evolutionary game theory, the coupling of multiple networks can increase cooperation in games such as the Prisoner's Dilemma [104,391,392]. For example, it has been reported that one can couple two networks in which the nodes play Prisoner's Dilemma games against their neighbours such that it is optimal to have a nontrivial number (i.e. neither 0 nor the maximum number) of inter-layer edges to facilitate cooperative behaviour [393,394]. It has also been illustrated that the coupling of network layers can increase cooperation in other game-theoretic models (e.g. in a public-goods game in a coupled pair of square lattices [395–397]). Another study examined coevolution between strategy and network structure for games on multilayer networks [398].

The Watts threshold model [399], which is a (percolation-like) complex-contagion process for the adoption of ideas, has also been generalized for multiplex networks [70,71,400,401]. For example, there are mechanisms by which multiplex networks can be more vulnerable to global adoption cascades than monoplex networks [71]. Other percolation processes have also been generalized to multilayer networks. For example, Azimi-Tafreshi *et al.* [402] recently studied $$k$$-core percolation on multiplex networks.
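To make the threshold mechanism concrete, the following Python sketch implements one simple multiplex generalization that we assume purely for illustration: a node adopts as soon as the adopted fraction of its neighbours in at least one layer reaches a common threshold. The function name `multiplex_threshold_cascade` and the four-node example are hypothetical, and other generalizations (e.g. thresholds on combinations of layers) are possible.

```python
def multiplex_threshold_cascade(layers, seeds, threshold):
    """Final set of adopters for a threshold model on a multiplex network.

    layers:    list of dicts that map each node to its set of neighbours in
               that layer (every layer lists the same nodes)
    seeds:     initially adopted nodes
    threshold: a node adopts when, in at least one layer, the adopted fraction
               of its neighbours is greater than or equal to this value
    """
    active = set(seeds)
    changed = True
    while changed:  # adoption is irreversible, so this reaches a fixed point
        changed = False
        for node in layers[0]:
            if node in active:
                continue
            for adj in layers:
                neighbours = adj[node]
                if neighbours and len(neighbours & active) / len(neighbours) >= threshold:
                    active.add(node)
                    changed = True
                    break
    return active


# Hypothetical example with four nodes, two layers and a single seed (node 0).
layer_a = {0: {1}, 1: {0, 2, 3}, 2: {1, 3}, 3: {1, 2}}
layer_b = {0: {1}, 1: {0}, 2: {3}, 3: {2}}
print(sorted(multiplex_threshold_cascade([layer_a], {0}, 0.5)))
print(sorted(multiplex_threshold_cascade([layer_b], {0}, 0.5)))
print(sorted(multiplex_threshold_cascade([layer_a, layer_b], {0}, 0.5)))
```

In this example, the cascade stalls at the seed in the first layer alone and stops at node 1 in the second layer alone, but it reaches every node when both layers are present: node 1 adopts through the second layer, which then pushes nodes 2 and 3 over their thresholds in the first layer.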

There have also been several studies of synchronization on multilayer networks. As expected, several papers have focused on Kuramoto phase oscillators on multilayer networks [403]. For example, Ref. [117] examined Kuramoto oscillators on two-layer networks in which the inter-layer coupling includes a delay. A very recent study examined interactions between Kuramoto oscillators and a biased random walk on multilayer networks [404]. Additionally, a perspective based on a network of networks—where oscillators in a population are coupled to each other via some network structure and then these populations are in turn coupled to each other in a network—was used several years ago (before networks of networks became popular), with an eye towards applications in neuroscience, to examine synchronization of coupled nonlinear oscillators [62,63,405–407].

Brummitt *et al.* [408,122] studied a sandpile model on interconnected (i.e. node-coloured) networks in order to model cascading failures on a pair of coupled power grids. In their study, they reported that there is an optimal fraction of interconnected node pairs between the two networks that minimizes the largest cascades. Reference [290] studied routing and traffic congestion in interconnected networks. That is, they considered a process in which nodes send ‘packets’ to other nodes that are routed through the network such that each node can only forward a number of packets, corresponding to its capacity, to its (intra-layer or inter-layer) neighbour nodes at each time step. Reference [409] examined a routing model in which the load on a node is equal to its value of geodesic betweenness centrality (where the paths through the network can traverse either inter-layer or intra-layer edges). At each time step, nodes whose betweenness values exceed their capacity fail; they are then removed from the network, and the betweenness values are recalculated. Zhang *et al.* [410] examined a similar dynamical process on two-layer interdependent networks. However, in their study, paths can only traverse intra-layer edges, and a node's failure in one layer can cause a node that depends on that node in the other layer to fail in a manner that is reminiscent of the cascading failures in Ref. [57] (see Section 4.6.2). Morris and Barthelemy [240] also examined a similar phenomenon (but without cascading failures) in the context of two-layer transportation systems. In subsequent work [411], they studied cascading failures in two-layer electrical networks in which one layer represents a power grid and the other layer represents a control network that monitors the edges in the power lines.
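The betweenness-based overload process described above can be sketched in a few lines of Python. In this minimal sketch, the nodes of the graph are node-layer tuples, the load is the unnormalized geodesic betweenness (computed with Brandes' algorithm for unweighted, undirected graphs), and shortest paths may use both intra-layer and inter-layer edges. The capacities, the six-node example and the function names are hypothetical choices for illustration, not the set-up of Ref. [409].

```python
from collections import deque


def betweenness(adj):
    """Unnormalized geodesic betweenness of every node (Brandes' algorithm)."""
    centrality = {v: 0.0 for v in adj}
    for s in adj:
        dist = {s: 0}
        sigma = {v: 0.0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):  # accumulate dependencies back towards s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                centrality[w] += delta[w]
    # Each unordered pair of endpoints was counted once from each end.
    return {v: value / 2.0 for v, value in centrality.items()}


def overload_cascade(adj, capacity):
    """Remove overloaded nodes round by round; return the failures per round."""
    adj = {v: set(neighbours) for v, neighbours in adj.items()}
    rounds = []
    while True:
        load = betweenness(adj)
        failed = {v for v in adj if load[v] > capacity[v]}
        if not failed:
            return rounds
        rounds.append(failed)
        for v in failed:
            del adj[v]
        for v in adj:
            adj[v] -= failed


# Hypothetical two-layer example: layer A holds p, w, x and layer B holds z, y, q.
P, W, X = ("p", "A"), ("w", "A"), ("x", "A")
Z, Y, Q = ("z", "B"), ("y", "B"), ("q", "B")
graph = {P: {W}, W: {P, X, Z}, X: {W, Y}, Z: {W, Y}, Y: {X, Z, Q}, Q: {Y}}
capacities = {P: 1, W: 5, X: 1.5, Z: 3, Y: 5, Q: 1}
print(overload_cascade(graph, capacities))
```

In this example, node x in layer A is overloaded first. Its removal forces all traffic between the layers through node z in layer B, which then exceeds its own capacity and fails in the second round.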

#### Control and dynamics.

The study of dynamical systems has intricate connections with control theory, in which one examines dynamical systems with feedback and inputs. A control-theoretic perspective can be very insightful for the investigation of dynamical systems on networks [412–415], and it will certainly also be important for dynamical systems on multilayer networks. For example, one might wish to reach a target state, such as the synchronization (or some other desired behaviour) of oscillators that are associated with each node. How, then, should one apply low-cost decentralized control strategies to ensure that this occurs as rapidly as possible?

Indeed, many of the concepts that we have discussed above can be viewed through the lens of control theory. For example, a control-theoretic perspective has been used to examine the temporal evolution of two-layer networks in which a ‘control network’ is used to influence an ‘open-loop network’ (which, by definition, does not include feedback by itself) [412]. It is also useful to view ‘pinning control’ [416], in which one controls only a small fraction of a network's nodes directly in order to try to influence the dynamics of other nodes, in the context of interconnected networks. For instance, Wu *et al.* [417] recently examined pinning control and synchronization in networks that are both node-coloured and edge-coloured. A population of controllers interacts with a population of followers, and it is important to distinguish between follower–follower edges and controller–follower edges.

As Refs. [227,228] illustrated in the context of biochemical systems, one can also use ideas from control theory to decompose networks into different layers [226]. This is somewhat reminiscent of community detection, but the layers need not be based on trying to find densely connected sets of nodes.

## Conclusions and outlook

The study of multilayer networks—and of frameworks like multiplex networks and interconnected networks in particular—has become extremely popular, and it's easy to see why. Most real and engineered systems include multiple subsystems and layers of connectivity, and developing a deep understanding of multilayer systems necessitates generalizing ‘traditional’ network theory. Ignoring such information can yield misleading results, so new tools need to be developed. One can have a lot of fun studying ‘bigger and better’ versions of the diagnostics, models, and dynamical processes that we know and (presumably) love—and it is very important to do so—but the new ‘degrees of freedom’ in multilayer systems also yield new phenomena that cannot occur in single-layer systems. Moreover, the increasing availability of empirical data for fundamentally multilayer systems amidst the current data deluge also makes it possible to develop and validate increasingly general frameworks for the study of networks.

In the present article, we discussed the history of research on multilayer networks and related frameworks, and we reviewed the exploding body of recent work in this area. Numerous similar ideas have been developed in parallel, and the literature on multilayer networks has rapidly become extremely messy. Despite a wealth of antecedent ideas in subjects like sociology and engineering, many aspects of the theory of multilayer networks remain immature, and the rapid onslaught of papers (especially from the physics community) on various types of multilayer networks necessitates an attempt to unify the various disparate threads and to discern their similarities and differences in as precise a manner as possible. Towards this end, we presented a general framework for studying multilayer networks and constructed a dictionary of terminology to relate the numerous disparate existing notions to each other. We then provided a thorough discussion to compare, contrast and translate between related notions such as multilayer networks, multiplex networks, interdependent networks, networks of networks and many others. We also discussed existing data sets (including some old ones) that can be represented as multilayer networks and can thus be used to test new ideas in this area.

We reviewed attempts to generalize single-layer network diagnostics, methods, models and dynamical systems to multilayer settings. An important theme that has developed in the literature is the importance of multiplexity-induced correlations (and their analogs in multilayer structures more generally), which can have important ramifications for dynamical processes on networks. For example, such correlations can have significant effects on the speed of transmission of diseases and ideas as well as on the robustness of systems to failure. Multilayer structures induce new degrees of freedom (such ‘new physics’ is already prominent even for multilayer networks with only a single aspect), but they remain poorly understood. We already know that these new degrees of freedom have important consequences for dynamical processes, and it is crucial to develop a precise understanding of the situations and processes for which such effects (and the magnitude of their importance) are qualitatively different.^{38} However, just as the trickle of monoplex-network data sets eventually became a flood, we expect to see many more interesting multilayer data sets in the near future. These, in turn, will help scholars to develop new theories, methods and diagnostics for gaining insight into multilayer networks.

The study of multilayer networks is very exciting, and we look forward to what the next several years will bring.

## Funding

All authors were supported by the European Commission FET-Proactive project PLEXMATH (Grant No. 317614). A.A. also acknowledges financial support from the ICREA Academia, Generalitat de Catalunya (2009-SGR-838), the James S. McDonnell Foundation, and FIS2012-38266. J.P.G. acknowledges funding from Science Foundation Ireland (grants 11/PI/1026 and 09/SRC/E1780). Y.M. was also supported by MINECO through Grants FIS2011-25167 and by DGA (Spain). M.A.P. acknowledges a grant (EP/J001759/1) from the EPSRC.

## Acknowledgements

We thank Sergey Buldyrev, Moses Boudourides, Lidia A. Braunstein, Ron Breiger, Javier Buldú, Rajmonda Caceres, Kathleen Carley, Davide Cellai, Emanuele Cozzo, Valentin Danchev, Manlio De Domenico, Mario di Bernardo, Péter Erdos, Terrill Frantz, Marty Golubitsky, Peter Grindrod, Adam Hackett, Shlomo Havlin, Des Higham, Vincent Jansen, Hang-Hyun Jo, János Kertész, Dan Larremore, Ian McCulloh, Sandro Meloni, Jim Moody, Peter Mucha, Callum Oakley, Chiara Poletto, Tom Prescott, Ioannis Psorakis, Garry Robins, Joaquín Sanz, Tom Snijders, H. Eugene Stanley, Barry Wellman, and Alvin Wolfe for helpful comments and suggestions. We also thank the members of the COSNET Lab and participants in University of Oxford's Networks Journal Club for their feedback and the referee (and his/her team of graduate students and postdoctoral scholars) for helpful comments.

Copyright notice for panel (f) of Fig. 7 and panels (a)–(d) of Fig. 8: ‘Readers may view, browse, and/or download material for temporary copying purposes only, provided these uses are for noncommercial personal purposes. Except as provided by law, this material may not be further reproduced, distributed, transmitted, modified, adapted, performed, displayed, published, or sold in whole or part, without prior written permission from the American Physical Society’.

### Appendix. Glossary and Notation

Term/symbol | Explanation |
---|---|
Multilayer network | General term for a network with multiple layers (see Section 2.1) |
Multiplex network | Multilayer network with diagonal couplings (see Section 2.5) |
Monoplex network | Network with a single layer; aka: ‘singleplex’, ‘uniplex’, ‘singlex’, ‘simplex’ network |
Node-layer tuple | Tuple that specifies both node and layer; often called ‘node-layer’ as a shorthand; it is a node in a supra-graph (see Section 2.1) |
Aspect | A ‘dimension’ of layers (see Section 2.1) |
Supra-adjacency matrix | Matrix representation of a multilayer network (see Section 2.3) |
Adjacency tensor | Tensor representation of a multilayer network (see Section 2.2) |
Node-aligned | All nodes are shared between all layers (see Section 2.1) |
Layer-disjoint | Each node is present only in a single layer (see Section 2.1) |
Diagonal couplings | Inter-layer edges are only between nodes and their counterparts (see Section 2.1) |
Layer-coupled | Diagonal couplings that are independent of the nodes (see Section 2.1) |
Categorical couplings | Diagonal couplings for which all possible inter-layer edges are present (see Section 2.1) |
Ordinal couplings | Diagonal couplings with inter-layer edges only between nodes in neighbouring layers (see Section 2.7) |
$$d$$ | Number of aspects (i.e. the ‘dimensionality’ of the layers) |
$$n$$ | Number of nodes in one layer in a node-aligned network |
$$b$$ | Number of layers in a network with a single aspect |
$$\textbf {L}$$ | Sequence of sets of layers |
$$L$$ | Set of layers in a network with a single aspect |
$$\textbf {A}$$ | Adjacency matrix |
$$\textbf {W}$$ | Weighted adjacency matrix |
$$\mathcal {A}$$ | Adjacency tensor |
$$\mathcal {W}$$ | Weighted adjacency tensor |
$$u,v$$ | Nodes (and node indices) |
$$\alpha ,\beta $$ | Layers (and layer indices) |
$$\boldsymbol {\alpha },\boldsymbol {\beta }$$ | Sequences of layer indices (e.g. $$\boldsymbol {\alpha }=\alpha _1, \dots , \alpha _{d}$$ or $$\boldsymbol {\alpha }=\alpha _1 \dots \alpha _{d}$$) |


*graph alignment*, which refers to the problem of renaming nodes of two or more graphs such that the graphs become as similar to each other as possible.

*intra-layer walk* is a walk that occurs only within a single layer, and an *intra-layer path* is defined similarly.

Yağan *et al.* called their networks ‘overlay networks’, and Buono *et al.* used the monicker ‘partially overlapped multiplex networks’.

## References

*k* similarity search in heterogeneous information networks