Dimension matters when modeling network communities in hyperbolic spaces

Abstract Over the last decade, random hyperbolic graphs have proved successful in providing geometric explanations for many key properties of real-world networks, including strong clustering, high navigability, and heterogeneous degree distributions. These properties are ubiquitous in systems as varied as the internet, transportation, brain or epidemic networks, which are thus unified under the hyperbolic network interpretation on a surface of constant negative curvature. Although a few studies have shown that hyperbolic models can generate community structures, another salient feature observed in real networks, we argue that the current models are overlooking the choice of the latent space dimensionality that is required to adequately represent clustered networked data. We show that there is an important qualitative difference between the lowest-dimensional model and its higher-dimensional counterparts with respect to how similarity between nodes restricts connection probabilities. Since more dimensions also increase the number of nearest neighbors for angular clusters representing communities, considering only one more dimension allows us to generate more realistic and diverse community structures.


Introduction
attention has shifted to the study of higher-dimensional models (16)(17)(18)(19). In these, there is still one radial coordinate for popularity, but there are D > 1 dimensions encoding similarity, or perhaps, similarities. In other words, higher-dimensional hyperbolic network models embody the intuition that there is more than one way in which things can be similar or not. The choice of dimension is already a prominent problem for machine learning applications of hyperbolic embeddings (20), yet it has mostly been overlooked until recently for hyperbolic network models. In recent works, the effect of increasing the dimension was conflated with the effect of other parameters and studied only at the local scale of node-pair connectivity and expected degrees (16,18,19). These studies also found that similar power-law degree distributions can be achieved in any dimension by tuning the model's parameter regime.
These observations involve extremely local properties, concerning nodes and their direct neighbors. As soon as we start zooming out towards the mesoscale level, dimension seems to impact network topology, yet it has not been studied much so far. The maximum clustering coefficient that can be achieved in random geometric graphs, which quantifies the closure of triplets of nodes into triangles, decreases with the dimension (21,18,19). Almagro et al. (17) have shown using the short cycle structure that various networked datasets have an underlying hyperbolic dimension that is ultra-low, albeit not minimal as previously assumed. Our work follows this line of research by looking at this question through the lens of community structure.
One of the most ubiquitous properties of complex networks is community structure, when connectivity within subgroups of nodes, or communities, is prominently different than with the rest of the network (22,23). Network motifs have long been recognized as universal building blocks of complex networks (24), community structure detection is one of the most long-standing active fields of network science (25,26), and most networked data show some sort of community structure, with one of the most interesting complex systems, the brain, making no exception (27,28). In hyperbolic network models, community structure is expressed as subgroups of nodes that are closer with respect to their angular coordinates, either in hyperbolic embeddings of real networks (1,29) or synthetic models that explicitly generate communities (14,13,15). Since changing the dimension of hyperbolic network models affects primarily the number of angular coordinates, dimension and community structure are much closer than they appear. In our work, we explore this interplay to see how dimension can improve current modeling of community structure in the hyperbolic framework, but also how community structure offers some insight about the underlying hyperbolic dimension of networks.
Let us take a look at a hyperbolic embedding of the airport network in Fig. 1. Airports within the same continent are more prone to be connected by direct flights, which is why nodes of the same color are mostly grouped by angular coordinate. Nevertheless, airports in Africa, Asia, and Europe seem to be more mixed together because their actual geographic location has, very broadly, the shape of a triangle, which cannot be reflected in the maximum-likelihood embedding of Fig. 1. This phenomenon is also notable in the hyperbolic embeddings of structural brain networks of Ref. (2), where neuroanatomical clusters are not all grouped by angular coordinates. These examples illustrate that a unique similarity dimension might not be enough to model community structure with non-sequential patterns. If that were the case, one could wonder why that is so, and how we can quantify community structure in hyperbolic random graphs.
As to why, we find that going from one angular similarity dimension to more, and even to only one more, has drastic effects on how the similarity between nodes influences their connectivity. By studying the probability density function of the angular distance between connected nodes, we show that angular closeness constrains connections much more, and differently, in D = 1 than in other dimensions. We also quantify how the number of neighbors differs on a circle, on a sphere, and on higher-dimensional spheres. This phenomenon impacts the number of nearest neighbors for angular clusters representing communities in hyperbolic random graphs. As a simple experiment on how those two phenomena come into play, we generate hyperbolic networks possessing community structure in D = 1 and D = 2. We thus obtain insights into how and why networks with community structure might have an underlying hyperbolic dimension that is higher than one.
The paper is structured as follows. In Sec. "Hyperbolic networks model," the key properties of the H D+1 hyperbolic random graph model are recalled, along with its relationship to the S D formulation and some remarks about angular distance on D-spheres. The interplay between distance and dimension is studied in Sec. "Effects of dimensionality," where we will see how dimension affects how connected nodes in hyperbolic random graphs are likely to be found at a given angular distance from one another. In either the pairwise case, where the expected degrees of the nodes have to be considered, or in the general case, for any expected degree distribution, the probability of finding connected nodes at a certain angular distance presents a sharp contrast between D = 1 and D ≥ 2. Then, we digress briefly on the number of neighbors on D-spheres. In the last part of the paper, we show how this affects the possibility to generate hyperbolic networks with community structure. This is quantified on block matrices representing how the generated communities are related to one another in Sec. "Impacts on community structure."

Hyperbolic networks model
We first review basic notions and establish some useful notation about hyperbolic random graphs in any dimension, before presenting some of their most remarkable properties.
A hyperbolic space is a complete, simply connected Riemannian manifold of constant negative curvature −ζ² and dimension D + 1 (30, Chap. 8; 31). The lowest-dimensional space with D = 1 is the hyperbolic plane, a smooth surface that can be modeled as one sheet of a hyperboloid in the three-dimensional Minkowski space, but also using other equivalent models like the upper half-plane, the Klein disk and the Poincaré disk (30,32).

Fig. 1. Hyperbolic embedding of the airport network (29). Edges represent pairs of airports connected by a direct flight and colors represent the continent on which the airport is located. Here, edges follow hyperbolic geodesics of the conformal disk model (30). Adapted from Fig. 4a in Ref. (29) using data downloaded from openflights.org.
Hyperbolic random graphs are based upon the hyperboloid model H 2 , where all points of the hyperbolic plane are parametrized using coordinates φ ∈ [0, 2π) and r ∈ [0, R) and thus have a natural projection on a circle through coordinate φ (33). An analogous coordinate parametrization is used in higher dimensions. The angular coordinates then map points to the D-sphere instead of the circle, i.e. φ = (φ 1 , . . . , φ D ), with φ 1 , . . . , φ D−1 ∈ [0, π) and φ D ∈ [0, 2π) (16; 30, Sec. 3.4). The distance d h between two points x, x′ ∈ H D+1 whose respective coordinates are (φ, r) and (φ′, r′) is given by the hyperbolic law of cosines,

$$\cosh(\zeta d_h) = \cosh(\zeta r)\cosh(\zeta r') - \sinh(\zeta r)\sinh(\zeta r')\cos\theta, \tag{1}$$

where ζ ∈ (0, ∞) is related to the hyperboloid's curvature −ζ² and θ = acos(x · x′/|x| |x′|) is the angular distance between x and x′, here considered as two vectors in R D+1 . This generalization is referred to as the hyperboloid model H D+1 . One important aspect of Eq. (1) is that, for sufficiently large r, r′ and for small θ,

$$d_h \approx r + r' + \frac{2}{\zeta}\log\sin\frac{\theta}{2} \approx r + r' + \frac{2}{\zeta}\log\frac{\theta}{2}, \tag{2}$$

where the first expression becomes exact in the large network limit and the last expression holds for small θ. This approximation was first published in Ref. (33), but one can also refer to Refs. (16,34) for more detailed derivations and bounds on the approximation.

What is referred to as the H D+1 model throughout this paper is not only the hyperbolic space presented above, but a random graph defined on this space. Consider a set of N nodes, where each x i is a continuous random variable on H D+1 . A natural choice to study the effect of hyperbolic geometry on the graph is to sample uniformly in a subset of H D+1 , but most models can also generate networks with other node densities on the hyperbolic disk, which is reflected in the diversity of possible node degree distributions that have been studied (35,33,16). The connection probability p h (x, x′), given in Eq. (3), defines a random graph with node set V, where each edge is an independent Bernoulli random variable with probability of success p h (x, x′). We stress that there are two levels of randomness. First, the nodes' positions are sampled from a continuous distribution on H D+1 . Second, each realization of those positions defines a discrete probability measure on the set of all simple graphs of size N. For uniformly distributed nodes, shortest paths on graphs sampled from H 2 follow the shortest paths on the underlying hyperbolic space H 2 with high probability, a phenomenon first observed in (36,33). This phenomenon is referred to as hyperbolic routing or congruence between the graphs and their underlying space (3), and is yet to be studied in D > 1. Two additional parameters are introduced in Eq. (3): μ > 0, which sets a connectivity distance threshold and tunes the expected average degree when nodes are sampled uniformly (33), and β > 0, which controls the range of connection probabilities.
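As a minimal numerical sketch, the hyperbolic law of cosines can be compared with its large-radius approximation, taken here in the standard form d h ≈ r + r′ + (2/ζ) log sin(θ/2). The function names and the test radii below are illustrative assumptions and do not come from the original work.

```python
import math


def hyperbolic_distance(r1, r2, theta, zeta=1.0):
    """Exact distance from the hyperbolic law of cosines (curvature -zeta**2)."""
    arg = (math.cosh(zeta * r1) * math.cosh(zeta * r2)
           - math.sinh(zeta * r1) * math.sinh(zeta * r2) * math.cos(theta))
    # Rounding can push arg slightly below 1 when the two points nearly coincide.
    return math.acosh(max(arg, 1.0)) / zeta


def approx_distance(r1, r2, theta, zeta=1.0):
    """Assumed large-radius form: r + r' + (2/zeta) * log(sin(theta/2))."""
    return r1 + r2 + (2.0 / zeta) * math.log(math.sin(theta / 2.0))


if __name__ == "__main__":
    for theta in (0.05, 0.5, 2.0):
        exact = hyperbolic_distance(10.0, 10.0, theta)
        approx = approx_distance(10.0, 10.0, theta)
        print(f"theta={theta:4.2f}  exact={exact:.4f}  approx={approx:.4f}")
```

For radii of this size the two values are expected to agree closely, whereas the approximation degrades for nodes near the center of the ball or for vanishing θ.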
Akin to the phase transition at β = 1 in the original model (33), there is a phase transition at β/D = 1, a critical value for which uniform hyperbolic random graphs have different average expected degree and power-law exponent of the degree distribution, in the asymptotic limit of large graphs (19,37,18). Our work takes place in the so-called "cold" regime, β > D, which has been shown to generate graphs with power-law degree distributions and low average degree. It also follows that the ratio β/D is kept constant when comparing models of different dimensions in Secs. "Effects of dimensionality" and "Impacts on community structure," in order to study different dimensions in the same asymptotic regime.
With the model now explicitly defined, let us take a closer look at which node properties the radial and angular coordinates abstract. As mentioned in the Introduction, the radial coordinate encodes nodes' popularity, since the closer a node is to the center of the hyperbolic ball, the higher its expected degree will be (33,5). The angular coordinates abstract an ensemble of similarity attributes, properties of the data underlying the network that drive nodes to be connected or not, independently of their degree. Those attributes do not have to be known explicitly, and can be related in a non-trivial way, in a manner reminiscent of how some data can be abstracted by principal components or eigenvectors. This is why detecting how many similarity dimensions would be needed to adequately model a given networked dataset is of research interest (17,20). The random graph model can also be defined in the S D representation (35), using the same angular coordinates that map the nodes to a D-sphere S D = {x ∈ R D+1 | |x|² = R̂²}, but assigning to each node a new continuous random variable, the latent degree κ ∈ (κ₀, ∞), instead of the radial coordinate r. The change of variables of Eq. (4), a D-dimensional generalization of what was first done in (38,33), transforms from r to κ and back. In D = 1, this simple mapping has been restated regularly over the last decade to highlight the correspondence between both models (29,3), yet to the best of our knowledge it had not been previously published in arbitrary dimension. Introducing one last parameter μ̂, we can set μ = (2/ζ) log[2R̂/(μ̂κ₀²)^{1/D}] in Eq. (3) and, using the hyperbolic distance approximation of Eq. (2), obtain the connection probability of Eq. (5), as originally defined in the latent variables formalism (35).
Albeit having different values for the equivalence between connection probabilities to hold, parameters R̂ and μ̂ in the S D model play a similar role, respectively, with regard to the nodes' density on the space and the mean degree, as R and μ in the H D+1 model, which justifies this choice of notation. For large sparse graphs with uniformly sampled angular coordinates, the parameter μ̂ can be tuned such that the expected degree of a node with latent degree κ (sampled from any distribution) is proportional to the value of κ (35,39), where the expectation is over all possible realizations of S D , hence the name "latent degree." In addition to this relation between latent and expected degrees, Eq. (5) highlights how angular coordinates encode similarity and latent degrees encode popularity, since the connection probability gets closer to 1 for small angular distance θ and high latent degrees κ, κ′. For this reason, the D-sphere parameterized by angular coordinates is sometimes referred to as the similarity space. It is worth mentioning that in the literature, the notation is often simplified by adequately fixing parameters, which has not been done here in order to keep the relationship between cited articles more accessible.
The change of variables defined in Eq. (4) preserves the connection probabilities between nodes whenever Eq. (2) is valid. For uniformly distributed nodes on a ball of H D+1 , it has been shown that the proportion of node pairs for which this is true tends to 1 (3). Thus, random graph models within either S D or H D+1 are considered equivalent. In our work, the S D representation is used without loss of generality.
Given latent degrees κ, κ′, we can define a new continuous random variable η, given by Eq. (6), whose outcome depends on the probability density function (pdf) of latent degrees, as well as parameters μ̂ and R̂. Letting go of the explicit dependency on κκ′ for brevity, the connection probability between a pair of nodes, given by Eq. (5), can be written as

$$p(\theta;\eta) = \frac{1}{1 + (\theta/\eta)^{\beta}}, \tag{7}$$

which highlights that η acts as a local angular distance connectivity threshold. Indeed,

$$\lim_{\beta\to\infty} p(\theta;\eta) = H(\eta - \theta), \tag{8}$$

where H is the Heaviside step function. Thus, in the β → ∞ regime, all node pairs for which θ < η would be connected. In the S D representation, it is common to fix the radius R̂ such that the number of nodes N is equal to the surface area of the D-sphere (35,29), which yields a node density of 1 with

$$\hat{R} = \left[\frac{N\,\Gamma\!\left(\frac{D+1}{2}\right)}{2\pi^{(D+1)/2}}\right]^{1/D}. \tag{9}$$

In this setting, η varies as (κκ′/N)^{1/D}. Thus, we can think of η as capturing the pairwise popularity in a way reminiscent of how the angular distance θ captures the similarity of node pairs. A common choice of pdf for angular coordinates is such that points parameterized by φ are uniformly distributed on S D . For latent degrees, we consider a Pareto distribution with mean κ̄, Eq. (10), which, with γ = 2ζ + 1, is akin to sampling nodes uniformly from a hyperbolic ball of H D+1 , but offers more freedom on the shape of the degree distribution.

Increasing the dimension of spheres is far from being as intuitive as unfolding more dimensions of flat spaces. Let us picture this using angular distance distributions on D-spheres. Let X be a random variable describing the angular distance θ ∈ [0, π] between two points sampled uniformly at random on S D . As derived in (40), the pdf of X is given by

$$f_X(\theta) = \frac{\sin^{D-1}\theta}{I_D}, \tag{11}$$

with

$$I_D = \int_0^{\pi} \sin^{D-1}\theta \, d\theta = \frac{\sqrt{\pi}\,\Gamma\!\left(\frac{D}{2}\right)}{\Gamma\!\left(\frac{D+1}{2}\right)}. \tag{12}$$

Fig. 2 shows that for D ≫ 1, chances are that any other point will be found at more or less an angular distance of θ = π/2, a surprising property related to the concentration of measure phenomenon (41). Yet even for very low D, there is a significant qualitative shift between uniformity in D = 1 and unimodality in D ≥ 2.
This well-known property provides intuition for the upcoming section, where instead of sampling pairs of points on D-spheres, we study edges of hyperbolic random graphs.
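The shift from uniformity in D = 1 to concentration around θ = π/2 in higher dimensions can be checked numerically. The sketch below assumes the standard density of the angle between two uniform points on the D-sphere, proportional to sin^{D−1} θ with normalization Γ((D+1)/2)/(√π Γ(D/2)). The helper names are illustrative.

```python
import math


def angular_pdf(theta, D):
    """pdf of the angle between two uniform points on the D-sphere."""
    norm = math.gamma((D + 1) / 2) / (math.sqrt(math.pi) * math.gamma(D / 2))
    return norm * math.sin(theta) ** (D - 1)


def midpoint_integral(f, a, b, steps=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


def prob_within(eps, D):
    """Probability that the angle lies within eps of pi/2."""
    return midpoint_integral(lambda t: angular_pdf(t, D),
                             math.pi / 2 - eps, math.pi / 2 + eps)


if __name__ == "__main__":
    for D in (1, 2, 5, 50):
        print(D, round(prob_within(0.3, D), 3))
```

The printed probabilities grow with D, which is the concentration of measure effect described above.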

Effects of dimensionality
Dimension is a fundamental property of spaces. An ant living on a circle would have much less freedom than one living on a sphere. Yet, if those two ants were to talk to each other, they might agree that they live on the same kind of space because some properties, like the possibility to come back to the same point by going straight ahead long enough, are the same. As presented in Sec. "Hyperbolic networks model," increasing the dimension of hyperbolic random graphs boils down to considering more angular coordinates that map to the sphere or higher dimensional D-spheres, but how does this impact the graph structure? Some properties of the graphs are almost unchanged, like the degree distribution (16), while some others, like the short cycle structure (17), are affected significantly. Our take on this question broadly deals with neighborhoods, of connected nodes in the graph but also simply of points on D-spheres. Since the dimension affects primarily the angular similarity space, the focus is first on uniform distribution of nodes on S D and the study of angular distance between nodes that are connected. On another note, we then study how the number of nearest neighbors on D-spheres varies with dimension.

Angular distance between connected nodes
In S 1 , most edges are observed between nodes separated by a very small angular distance, except for nodes of very high expected degree (see for instance Fig. 5 of Ref. (33)). We have found this property to change with the dimension of the underlying hyperbolic space. This is to be expected since the probability to find any other node at a very close angular distance decreases with dimension, as shown in Fig. 2. Yet, this effect of dimensionality has interesting consequences on hyperbolic random graphs, especially when comparing D = 1 and D > 1.
Consider a S D model of N nodes as defined in Sec. "Hyperbolic networks model" with angular coordinates sampled uniformly with respect to the spherical measure and latent degrees sampled from any pdf f K (κ). We study the distribution of the angular distance between connected nodes within this hyperbolic random graph, as a way to assess the interplay between the dimension of the similarity space and the topology of the graph. A pairwise case of two nodes with given latent degrees is first examined, then its generalization to the whole latent degree distribution. The hard threshold limit described by Eq. (8) is also studied since it allows for some insightful approximations of our results.

For a pair of nodes, general case
Consider a pair of nodes with latent degrees κ, κ′ such that η is given by Eq. (6). We study the pdf of angular distance between those two nodes, provided that they are connected in the random graph. Let X be the continuous random variable describing the angular distance θ ∈ [0, π] between these nodes. Since the angular coordinates of the graph are uniformly distributed, the pdf of X is given by Eq. (11). Let Y be the continuous random variable describing the value of η, whose support depends on possible values of κ. Finally, let A be the discrete Bernoulli random variable describing whether the nodes are connected (A = 1) or not (A = 0) according to the probability of Eq. (5). The pdf of angular distance between two connected nodes is given by the conditional pdf

$$f_{X|A,Y}(\theta \mid 1, \eta) = \frac{f_{X,A,Y}(\theta, 1, \eta)}{f_{A,Y}(1, \eta)}, \tag{13}$$

with the left-hand side notation standing more compactly for f_{X|A,Y}(θ | A = 1, Y = η). Since A depends on θ and η through the connection probability of Eq. (5), written p(θ; η) = 1/(1 + (θ/η)^β) as in Eq. (7), but random variables are otherwise independent, the numerator of Eq. (13) is given by

$$f_{X,A,Y}(\theta, 1, \eta) = p(\theta;\eta)\, f_X(\theta)\, f_Y(\eta). \tag{14}$$

Likewise, the denominator is

$$f_{A,Y}(1, \eta) = f_{A|Y}(1 \mid \eta)\, f_Y(\eta), \tag{15}$$

where f_{A|Y}(1 | η) is the following density, with X marginalized out:

$$f_{A|Y}(1 \mid \eta) = \int_0^{\pi} p(\theta;\eta)\, f_X(\theta)\, d\theta. \tag{16}$$

Altogether, this yields the explicit expression for Eq. (13),

$$f_{X|A,Y}(\theta \mid 1, \eta) = \frac{1}{Z}\, \frac{\sin^{D-1}\theta}{1 + (\theta/\eta)^{\beta}}, \tag{17}$$

with normalization

$$Z = \int_0^{\pi} \frac{\sin^{D-1}\theta}{1 + (\theta/\eta)^{\beta}}\, d\theta. \tag{18}$$

Intuitively, Eq. (17) is proportional to sin^{D−1} θ as in the distance distribution for uniformly distributed nodes on S D of Eq. (11), while allowing for the influence of the hyperbolic connection probability given by Eq. (7). Its behavior for D ∈ {1, 2, 3, 4, 5} is shown in Fig. 3, where for all dimensions the expected degree of nodes is kept the same. In Figs. 3 and 4, we have set β/D = 3.5 without loss of generality, since our results are not qualitatively affected by this parameter choice, as long as β/D > 1.
For D = 1, the maximum is at θ = 0, as depicted in the inset by the purple example connectivity matrix, where the dark diagonal line illustrates that all of the connection probability is concentrated at very small angular distances. The pdf of the angular distance separating connected nodes is strictly decreasing for D = 1 and β < ∞, since it is proportional to the connection probability of Eq. (5). Hence, most edges will have nearly θ = 0 in D = 1, as expected in the H 2 generative model (33) and shown by the purple curve in Fig. 3. By contrast, for D > 1, this is relaxed, as exemplified by the blue curve and connectivity matrix for D = 2.
Eq. (17) is unimodal for all D > 1, with a mode greater than zero and increasing with dimension. The existence and the location of the mode of Eq. (17) for D > 1 are deduced by setting its first derivative with respect to θ to zero to obtain

$$\frac{(D-1)\,\theta^{*}}{\tan\theta^{*}} = \frac{\beta\,(\theta^{*}/\eta)^{\beta}}{1 + (\theta^{*}/\eta)^{\beta}}, \tag{19}$$

where θ* ∈ (0, π) can be a maximum, a minimum or an inflection point. Since Eq. (17) is the product of a unimodal function and a decreasing function, we assume the critical point above is a maximum. It is shown in Supplementary Material §1.A that θ* = 0 iff D = 1. Furthermore, if θ* ≪ 1 such that tan θ* ≈ θ*, the solution of Eq. (19) can be closely approximated by

$$\theta^{*} \approx \eta \left(\frac{D-1}{\beta - D + 1}\right)^{1/\beta}. \tag{20}$$

This approximation is valid whenever the angular threshold η ∼ (κκ′/N)^{1/D} is small enough, which is the case for a significant fraction of node pairs in hyperbolic random graphs (33) and most observed degree distributions (42). Both factors of Eq. (20) are increasing functions of the dimension D. Therefore, for the vast majority of nodes having a "reasonable" expected degree, the most likely angular distance between connected nodes in S D grows as the dimension increases. This is yet another indicator that most connected nodes will in fact be separated by an angular distance θ > 0 in hyperbolic random graphs S D with D > 1. Expression (17) encapsulates in a simple yet accurate way that the joint effect of the nodes' angular distribution on D-spheres and the hyperbolic connection probability is qualitatively and quantitatively different across ultra-low dimensions of the model.
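The location of the mode can be explored with a short numerical sketch. It assumes that the density of the angular distance between connected nodes is proportional to sin^{D−1} θ / (1 + (θ/η)^β), i.e. the uniform angular density times a connection probability of the form 1/(1 + (θ/η)^β). The small-angle stationary point coded below follows from setting the derivative of this assumed form to zero with tan θ ≈ θ. The function names and the values of η and β are illustrative choices, and the closed form may be written differently in the original work.

```python
import math


def connected_angle_weight(theta, D, eta, beta):
    """Unnormalized pdf of the angle between two connected nodes (assumed form)."""
    return math.sin(theta) ** (D - 1) / (1.0 + (theta / eta) ** beta)


def grid_mode(D, eta, beta, steps=200000):
    """Location of the maximum on a uniform grid over [0, pi]."""
    best_theta = 0.0
    best_weight = connected_angle_weight(0.0, D, eta, beta)
    for k in range(1, steps + 1):
        theta = math.pi * k / steps
        weight = connected_angle_weight(theta, D, eta, beta)
        if weight > best_weight:
            best_theta, best_weight = theta, weight
    return best_theta


def small_angle_mode(D, eta, beta):
    """Stationary point with tan(theta) ~ theta; requires beta > D - 1."""
    return eta * ((D - 1) / (beta - D + 1)) ** (1.0 / beta)


if __name__ == "__main__":
    eta = 0.1
    for D in (1, 2, 3):
        beta = 3.5 * D  # keeps beta / D constant across dimensions
        print(D, grid_mode(D, eta, beta), small_angle_mode(D, eta, beta))
```

With η held fixed, the mode sits at θ = 0 for D = 1 and moves to strictly positive angles for D = 2 and D = 3.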

For a pair of nodes, hard threshold limit
An informative limiting case for the angular distance between connected nodes arises with β → ∞. Then, the connection probability becomes a step function given by Eq. (8) with angular distance threshold η. Equation (17) becomes

$$f_{X|A,Y}(\theta \mid 1, \eta) = \frac{\sin^{D-1}\theta \; H(\eta - \theta)}{\int_0^{\eta} \sin^{D-1}\vartheta \, d\vartheta} \approx \frac{D\,\theta^{D-1}}{\eta^{D}}\, H(\eta - \theta), \tag{21}$$

where H is the Heaviside step function and the last expression uses the small-angle property of sine. In Fig. 3, the dashed line shows the exact value of Eq. (17) with β → ∞, which is exactly the behavior expected from the truncated powers of θ given in the last expression of Eq. (21). There is a sharp maximum at the threshold η for all D > 1, which means, quite counterintuitively, that in this limit most connected nodes will be separated by their local maximal angular distance η.
This highlights a stark contrast between S 1 and S D , D > 1, in a regime where the underlying hyperbolic geometry is the most binding to the topology of the graph because any pair of nodes that are close enough according to their degrees and model parameters shall be connected.
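A minimal sketch of the hard threshold limit follows, assuming that connected pairs have an angular density proportional to sin^{D−1} θ on [0, η] and zero beyond η. It measures which fraction of connected pairs sits in the outer half of the allowed range, which in the small-angle regime is close to 1 − 2^{−D}. The function names and the value of η are illustrative.

```python
import math


def midpoint_integral(f, a, b, steps=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


def fraction_beyond(D, eta, x=0.5):
    """Fraction of connected pairs with angle in (x * eta, eta) when beta -> infinity."""
    def weight(t):
        return math.sin(t) ** (D - 1)
    return midpoint_integral(weight, x * eta, eta) / midpoint_integral(weight, 0.0, eta)


if __name__ == "__main__":
    for D in (1, 2, 3):
        print(D, fraction_beyond(D, eta=0.1), 1.0 - 0.5 ** D)
```

In D = 1 the two halves of the range are equally populated, whereas for D > 1 the mass shifts towards the threshold η.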

For all latent degrees, general case
Let us now zoom out of a specific pair of nodes and study the angular distance between connected nodes considering the entire latent degree distribution. To marginalize η out of Eq. (17), one would need to compute

$$f_{X|A}(\theta \mid 1) = \int f_{X|A,Y}(\theta \mid 1, \eta)\, f_{Y|A}(\eta \mid 1)\, d\eta, \tag{22}$$

where the integral is computed over all possible values of η. If we sample a graph from an S D model, and then draw an edge uniformly at random, this is precisely the pdf of the angular distance between the nodes forming that edge. Using Bayes' theorem, we can write

$$f_{Y|A}(\eta \mid 1) = \frac{f_{A|Y}(1 \mid \eta)\, f_Y(\eta)}{f_A(1)}. \tag{23}$$

Then, combining Eqs. (13)-(15) and (23) yields the following explicit form for Eq. (22):

$$f_{X|A}(\theta \mid 1) = \frac{f_X(\theta)}{f_A(1)} \int p(\theta;\eta)\, f_Y(\eta)\, d\eta, \tag{24}$$

where p(θ; η) denotes the connection probability of Eq. (7). To further study the angular distance between connected nodes, the marginal pdf of η has to be computed given the pdf of κ, which is done in Supplementary Material §1.B, along with the computation for f A (1) in §1.C. Fig. 4 illustrates the behavior of Eq. (24) for D ∈ {1, 2, 3, 4, 5}, with expected degrees that follow the same Pareto distribution in all dimensions. The differences between D = 1 and D > 1 found in the pairwise case are still notable when we consider the whole distribution. Those are the concentration of connected nodes at very small angular distances for D = 1, the mode shifting towards higher angular distances with increasing dimension, and the qualitative difference in the shape of the distributions between D = 1 and D > 1. This effect of dimensionality, somewhat expected on its own, is to be interpreted in the light of other properties of higher-dimensional spaces studied in the following sections.

Number of nearest neighbors
Dimension affects angular closeness of nodes in hyperbolic graphs, but on a more elementary level it also impacts how many points can be closest to each other on D-spheres. If a finite number n > 2 of points is spread on a circle, any given point will always have two, and only two, nearest neighbors. This, however, ceases to be true on higher-dimensional spheres, where the number of nearest neighbors then depends on the number of points n. To quantify this, let us define a characteristic neighborhood B(ϕ n ) ⊂ S D (R) as an open ball (with respect to the standard great-circle distance) on the sphere, with angular radius ϕ n of the ball chosen such that

$$\mathrm{Vol}\, B(\phi_n) = \frac{\mathrm{Vol}\, S^D(R)}{n}. \tag{25}$$

The volume refers to the D-dimensional measure of the D-sphere, i.e. the circumference of the circle, the surface area of the sphere, the volume of the 3-sphere, and so on. The division of the space into areas of equal volume in Eq. (25) allows one to define the number of nearest neighbors n nn as

$$n_{nn} = \frac{\mathrm{Vol}\, B(3\phi_n)}{\mathrm{Vol}\, B(\phi_n)} - 1. \tag{26}$$

The idea is to compute the volume of an open ball of radius 3ϕ n on S D (R) and assume that it contains 1 + n nn points, a central one and its nearest neighbors. The definition of n nn is an extension of D = 1, where B(ϕ n ) is simply an arc of length 2πR/n. In this simplest space, we trivially have where we used the cosine triple angle formula. For general dimension D, B(ϕ n ) is a hyperspherical cap of volume

$$\mathrm{Vol}\, B(\phi_n) = \frac{2\pi^{D/2}}{\Gamma(D/2)}\, R^{D} \int_0^{\phi_n} \sin^{D-1}\theta \, d\theta. \tag{29}$$

It follows that

$$n_{nn} = \frac{\int_0^{3\phi_n} \sin^{D-1}\theta \, d\theta}{\int_0^{\phi_n} \sin^{D-1}\theta \, d\theta} - 1, \tag{30}$$

where ϕ n satisfies Eq. (25) using the volume of Eq. (29). When n ≫ 1, it follows that ϕ n ≪ 1. Using the approximation sin x ≈ x over the integration domains of Eqs. (29) and (30), we find the asymptotic number of nearest neighbors

$$n_{nn} \approx 3^{D} - 1. \tag{31}$$

As depicted in Fig. 6, n nn is the same in any dimension when n is low and only a few points are spread on S D . But with higher dimensions, n nn keeps on increasing up to higher asymptotic limits given by Eq. (31).
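The counting argument above can be sketched numerically. The code below assumes that n nn is the ratio of the volume of a cap of angular radius 3ϕ n to that of a cap of radius ϕ n, minus one, with ϕ n fixed by requiring that the smaller cap holds a fraction 1/n of the sphere. Capping the larger radius at π, so that the ball cannot exceed the whole sphere for small n, is an additional assumption made here, as are the function names.

```python
import math


def cap_fraction(phi, D, steps=2000):
    """Fraction of the D-sphere's volume inside a cap of angular radius phi."""
    def integral(upper):
        h = upper / steps
        return h * sum(math.sin((k + 0.5) * h) ** (D - 1) for k in range(steps))
    # Assumption: a ball of radius larger than pi is the whole sphere.
    return integral(min(phi, math.pi)) / integral(math.pi)


def characteristic_radius(n, D, iters=60):
    """Bisection for phi_n such that the cap holds a fraction 1/n of the sphere."""
    lo, hi = 0.0, math.pi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cap_fraction(mid, D) < 1.0 / n:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def nearest_neighbors(n, D):
    """Number of nearest neighbors from the ratio of cap volumes."""
    phi_n = characteristic_radius(n, D)
    return cap_fraction(3.0 * phi_n, D) / cap_fraction(phi_n, D) - 1.0


if __name__ == "__main__":
    for D in (1, 2, 3):
        print(D, [round(nearest_neighbors(n, D), 2) for n in (2, 10, 100, 10000)])
```

For large n the values approach 3^D − 1, i.e. 2, 8 and 26 for D = 1, 2 and 3, while for very few points the count is the same in every dimension.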
Now if, instead of counting points, n were the number of angular communities within a hyperbolic network, either real embedded networked data or a random graph model, this geometrical property of D-spheres would limit the number of communities that could be connected. The following section frames more precisely how the two effects of dimensionality we just highlighted influence community modeling within S D network models.

Impacts on community structure
A community is a collection of nodes that are (typically) more densely connected together than to the rest of the network, which is captured in geometric networks by a fraction of nodes that are closer together in the space. In hyperbolic networks, community structure is modeled through angular aggregation of nodes on the spherical similarity space, thus creating soft communities (14). In embeddings of real networks on S 1 , nodes sharing qualitative attributes correlated with communities have been observed to form angular clusters. This was first observed in (1) on the internet network, then in other types of data like economic and biological networks (6,29,2). On the other hand, methods for generating modules in random hyperbolic models have also used angular closeness, either through some variant of a geometric preferential attachment mechanism (14,13) or by direct sampling of clusters as angular coordinates (15).
Dimensionality has consequences on ways in which nodes can be nearby in S D , and how angular closeness is not as binding to connectivity when D > 1. We proceed to show how the previous findings impact community structure modeling in hyperbolic networks for ultra-low dimensions D ∈ {1, 2}. Numerical simulations of hyperbolic random graphs possessing community structure are at the core of the following section. Communities are generated as angular clusters and latent degrees are fixed subsequently. Once coordinates are well defined, we coarse-grain hyperbolic random graphs into block matrices that encode inter-community relations as simple weighted networks. Some global and local quantities are then measured on those block matrices to capture how communities can be related to each other in hyperbolic random graphs. This characterizes how community structure is impacted in S D at the transition between D = 1 and D = 2.

Generating communities in hyperbolic spaces
We consider the simplest possible case where all angular communities are of similar sizes and uniformly spread on the space, with latent degrees fixed subsequently to achieve a Pareto expected degree distribution in the random graph. This allows us to experiment with a simple soft community structure in otherwise general random graphs that possess all relevant properties of hyperbolic networks.
Let n be the number of angular communities we wish to generate. Angular coordinates for each node are first sampled such that n clusters are distributed homogeneously on the space, as shown in Fig. 7A. The dispersion of nodes within each angular cluster is tuned via a parameter σ ∈ [0, 1] comparable to the standard deviation of normal distributions. When σ = 0, all nodes of a cluster have the same angular coordinate, whereas when σ = 1, sampling is roughly equivalent to sampling points uniformly on S D . This procedure is similar to sampling Gaussian mixtures on the circle in Ref. (15). More details are given in Supplementary Material §2, and the code is freely available online (44).
Once angular coordinates are fixed, latent degrees are optimized to obtain a Pareto expected degree distribution using the  scheme of Ref. (13 Sec. 2.1.2), with a tolerance of 0.2. We model independently the angular coordinates and the latent degrees, which is equivalent to assuming that the similarity space giving rise to some community structure is decoupled from the degrees of nodes within the graph. It follows from this assumption that the degree distribution within each angular cluster is drawn independently from the same distribution. With such angular coordinates and latent degrees, the hyperbolic random graph is fully defined by the connection probabilities of Eq. (5). Each node i = 1, . . . , N has an additional community label c i = 1, . . . , n describing its membership to one of the angular clusters, which is redefined as the closest centroid. It follows that all communities are of similar size, albeit not identical, and that there is no overlap between communities, as exemplified in Fig. 8B.
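For illustration only, the sampling of angular clusters and the relabeling by closest centroid can be sketched in D = 1 as follows. This is not the procedure of Supplementary Material §2 or the released code (44): the equally spaced centroids, the wrapped normal distribution, and in particular the scaling of σ into an angular spread of σπ/n are hypothetical choices, and the optimization of latent degrees is left out.

```python
import math
import random


def circle_distance(a, b):
    """Great-circle distance between two angles on the unit circle."""
    d = abs(a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)


def sample_circle_communities(N, n, sigma, seed=0):
    """Sample N angles around n equally spaced centroids (hypothetical scheme)."""
    rng = random.Random(seed)
    centroids = [2.0 * math.pi * u / n for u in range(n)]
    spread = sigma * math.pi / n  # hypothetical mapping from sigma to an angle
    angles, labels = [], []
    for _ in range(N):
        u = rng.randrange(n)
        phi = rng.gauss(centroids[u], spread) % (2.0 * math.pi)
        # The community label is redefined as the closest centroid.
        label = min(range(n), key=lambda v: circle_distance(phi, centroids[v]))
        angles.append(phi)
        labels.append(label)
    return angles, labels, centroids


if __name__ == "__main__":
    angles, labels, centroids = sample_circle_communities(N=1000, n=8, sigma=0.3)
    print([labels.count(u) for u in range(8)])
```

Because labels are reassigned to the closest centroid, communities do not overlap and have similar, though not identical, sizes.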
To study community structure, each random graph is mapped to a weighted graph of inter-community edge probabilities. Consider the n × n matrix B of Eq. (32), where $m := \sum_{i<j} p_{ij}\,(1 - \delta(c_i, c_j))$ is the sum of probabilities associated with inter-community edges, which can be interpreted as the total expected number of edges between distinct communities. The elements of B are normalized sums of edge probabilities between two communities, with the diagonal set to zero. Thus, B can be thought of as a weighted graph describing how distinct communities interact with each other. It is normalized by the total expected number of inter-community edges such that B_uv quantifies the probability of finding an edge between the corresponding pair of communities u and v. The complete procedure, from sampling a random graph to obtaining a block matrix B, is illustrated in Fig. 7.
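The coarse-graining described above can be sketched as follows. This is a minimal illustration based on the verbal description only. The function name `community_block_matrix` is hypothetical, labels are assumed to be 0-indexed, and the convention that B is symmetric with unordered community pairs summing to 1 is an assumption chosen to agree with the entropy normalization used later in the text.

```python
import numpy as np

def community_block_matrix(p, labels, n):
    """Coarse-grain an N x N matrix of edge probabilities into an n x n block matrix.

    Returns (B, m), where m is the expected number of inter-community edges
    and B holds inter-community edge probability sums divided by m.
    Assumes labels take values 0, ..., n-1 and that m > 0.
    """
    p = np.asarray(p, dtype=float)
    labels = np.asarray(labels)
    upper = np.triu(p, k=1)                 # count each node pair i < j once
    onehot = np.eye(n)[labels]              # N x n membership indicators
    sums = onehot.T @ upper @ onehot        # sums[u, v] = sum_{i<j} p_ij [c_i = u][c_j = v]
    sums = sums + sums.T                    # merge (u, v) and (v, u) contributions
    np.fill_diagonal(sums, 0.0)             # discard intra-community edges
    m = np.triu(sums, k=1).sum()            # sum_{i<j} p_ij (1 - delta(c_i, c_j))
    return sums / m, m
```

With this convention, m * B[u, v] is the expected number of edges between communities u and v.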

Global assessment of angular dependency
Anecdotally, blocks near the diagonal within the matrices of Fig. 7D suggest that community structure in S^1 is impacted differently than in S^2. Indeed, similarly generated hyperbolic random graphs in S^2 look more permissive with regard to how community blocks can be related to one another. To quantify these observations, we use the stable rank and the Shannon entropy of matrices B. Both quantities are global measures of matrix structure, related in a complementary way to how diagonal or how uniform a matrix is.

Stable rank
The rank of a matrix is a global measure intimately related to dimensionality. Formally, the rank is the maximum number of linearly independent columns or rows of a matrix, and thus counts the dimension of the vector space it generates (45). When working with noisy or random matrices, it is more convenient to use the stable rank, also called numerical rank or effective rank, defined as $\mathrm{srank}(B) = \sum_{i=1}^{n} s_i^2 / s_1^2$, where s_i, i = 1, ..., n, are the singular values of B in non-increasing order (46,47). The stable rank is always bounded above by the usual rank and is maximal for the identity matrix, diagonal matrices with non-zero diagonal elements, or certain Toeplitz matrices, which makes it useful for quantifying the extent to which matrix entries are concentrated near the diagonal. Moreover, the stable rank is invariant under any similarity transformation $B \mapsto P B P^{\top}$, where P is a permutation matrix, thus ensuring that the value of srank(B) is independent of the node labeling in the graph corresponding to B.

Fig. 7. Example of sampling hyperbolic random graphs with angular community structure in S^1 and S^2. A) Angular coordinates are used to optimize latent degrees such that B) a given expected degree distribution is achieved. C) The random graphs are then fully specified by Eq. (5), which is illustrated in matrices of connection probabilities between all nodes, ordered with polar coordinates within each community. D) Those are then coarse-grained into block matrices using Eq. (32) to study inter-community interactions in hyperbolic spaces.
Since we compare different numbers of communities, hence block matrices of different orders, we choose to work with the srank-to-dimension ratio, $r(B) = \mathrm{srank}(B)/n$, a version of the stable rank normalized by its maximal possible value (48). It follows that r(B) = 1 for a matrix of maximal rank, for instance a diagonal matrix, and r(B) = 0 for a null matrix. In between those extremes, r(B) captures the extent to which the entries of B could be permuted to yield a diagonal matrix. In the context of community block matrices, this allows us to evaluate the complexity of connection patterns between communities. In Fig. 8A, we show that for various numbers of communities n and angular dispersions of nodes σ, r is always higher on S^1 than on S^2. The higher srank of the D = 1 block matrices quantifies how the inter-community edge weights are more strictly bounded near the diagonal compared with D = 2. This difference is less notable when angular communities are few (n = 5) and highly concentrated (σ ∈ [0.1, 0.2]), since the additional dimension then has the least impact on the neighborhood of nodes, which are mostly connected within a single community. In other parameter regimes, the difference between S^1 and S^2 experimentally expresses the strong angular dependence of connectivity patterns explained in Sec. "Angular distance between connected nodes," when specifically applied to community structure.
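As a minimal sketch of how these two quantities could be computed, assuming the standard definition of the stable rank as the sum of squared singular values divided by the largest squared singular value: the function names `stable_rank` and `srank_ratio` are hypothetical, and returning 0 for the null matrix is a convention adopted here to match the statement that r(B) = 0 in that case.

```python
import numpy as np

def stable_rank(B):
    """Stable rank: sum of squared singular values over the largest one squared."""
    s = np.linalg.svd(np.asarray(B, dtype=float), compute_uv=False)  # non-increasing
    if s[0] == 0.0:
        return 0.0  # convention for the null matrix (assumption)
    return float(np.sum(s**2) / s[0]**2)

def srank_ratio(B):
    """srank-to-dimension ratio r(B) for a square matrix of order n."""
    n = np.asarray(B).shape[0]
    return stable_rank(B) / n
```

Because singular values are unchanged by simultaneous row and column permutations, both functions return the same value for any relabeling of the communities.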

Shannon entropy
Shannon entropy (49) quantifies the extent to which the probability mass function of a discrete random variable is uniform. It is often intuitively described as a measure of how uncertain the outcome of a random event is. As currently defined, matrices B describe the probability mass function of a single edge between two communities. It follows that the Shannon entropy of B matrices, $S(B) = -\sum_{u<v} B_{uv} \log_2 B_{uv}$, quantifies how uniform block matrices are, in bits. This entropy is zero if only one entry of B has value 1, depicting a maximally nonuniform community structure. Conversely, S(B) reaches its maximal value, log_2(n(n − 1)/2), if the matrix B is entirely uniform with B_uv = 2/[n(n − 1)] for all n(n − 1)/2 possible unordered community pairs (u, v). Fig. 8C shows that as the angular dispersion of nodes increases, block matrices in S^2 have an increasing entropy. This quantifies how block matrices in D = 2 become more and more uniform as nodes scatter across the space. Conversely, block matrices in S^1 retain the same tridiagonal structure, which is reflected in the stagnation of the entropy. Again, the difference in behavior is minimal for n = 5, but not for n ∈ {15, 25}. A higher Shannon entropy of block matrices in D = 2, that is, their greater uniformity, means that any given community is likely to be connected to many other communities. This effect is quantified more precisely in the following section.
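The entropy of a block matrix can be computed with a short sketch such as the one below. It assumes, as the stated maximum log_2(n(n − 1)/2) suggests, that B is symmetric and that its entries over unordered pairs u < v sum to 1. The function name `block_entropy` is hypothetical.

```python
import numpy as np

def block_entropy(B):
    """Shannon entropy, in bits, of the distribution over unordered community pairs."""
    B = np.asarray(B, dtype=float)
    w = B[np.triu_indices_from(B, k=1)]   # one weight per unordered pair u < v
    w = w[w > 0]                          # 0 * log(0) is taken to be 0
    return float(-np.sum(w * np.log2(w)))
```

For a uniform block matrix with n = 5, every pair has weight 1/10 and the function returns log_2(10), the maximal value for that n.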

Local count of neighboring communities
We also evaluate community degrees as a hyperbolic random graph equivalent to the number of nearest neighbors n_nn of Sec. "Number of nearest neighbors." Community degrees are computed as row sums of a binary version of the community block matrix B. We binarize matrix B through the following mapping: $B_{uv} \mapsto 1$ if $m B_{uv} \geq 1$ and $B_{uv} \mapsto 0$ otherwise, where m is the sum of probabilities associated with inter-community edges, as defined below Eq. (32). In other words, if the expected number of edges between two communities is greater than or equal to 1, that is to say we expect at least one edge between communities u and v on average in the random graph, then the two communities are related. This is a very liberal way to binarize B; other thresholds or methods could have been used. In Supplementary Material §3 and Fig. S1, we show that our numerical results for the comparison of S^1 and S^2 hold for other binarization procedures. The degree of community u is then defined as $k_u = \sum_{v=1}^{n} \mathbb{1}[m B_{uv} \geq 1]$, which quantifies the number of other communities u is related to. We then define the average community degree as $\langle k \rangle = \frac{1}{n} \sum_{u=1}^{n} k_u$.

In Fig. 9, we show the average community degree for hyperbolic random graphs with angular communities sampled according to the scheme described in Sec. "Generating communities in hyperbolic spaces." As expected, with only n = 5 angular communities, the dimension has very little impact and both models reach a value of node dispersion σ where all communities are related to n − 1 others. Yet, when more angular clusters are considered, 〈k〉 in S^1 barely increases, whereas in S^2 communities keep on relating to more others as nodes get more dispersed on the sphere. An upper bound on community degree in D = 1 is suggested by the purple curves of Fig. 9, although this is not the same phenomenon as the strict limits of Fig. 6. In Sec. "Number of nearest neighbors," the hyperbolic connection probability was not considered and the meaning of a neighborhood was purely geometric, with n_nn varying with n. Here, we vary the angular dispersion of nodes for a given number of communities n.
Yet in S^1, the circular boundary still poses an upper limit on how many communities can be related to one another. In contrast with n_nn = 2 in D = 1, the numerical upper bound of Fig. 9 is more than two neighboring communities, owing to this very permissive definition of community degree and to high-degree nodes (having a small radius in the hyperbolic representation), which allow for long-range angular connections between communities.
This is yet another way to see that in S^1, most of the inter-community edges are concentrated on fewer nearest neighbors than in D = 2, as exemplified in the matrices of Fig. 8B.
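The binarization rule and the resulting community degrees can be sketched as follows, assuming that m times an entry of B gives the expected number of edges between the corresponding pair of communities. The function name `community_degrees` is hypothetical, and the threshold of one expected edge is the liberal choice described in the text.

```python
import numpy as np

def community_degrees(B, m):
    """Degrees of communities after thresholding expected inter-community edge counts.

    Two distinct communities u and v are considered related when the expected
    number of edges between them, m * B[u, v], is at least 1.
    Returns (degrees, average_degree).
    """
    expected = m * np.asarray(B, dtype=float)
    related = expected >= 1.0             # binary version of B
    np.fill_diagonal(related, False)      # a community is not its own neighbor
    degrees = related.sum(axis=1)         # row sums of the binary matrix
    return degrees, float(degrees.mean())
```

Other thresholds can be explored by replacing the constant 1.0, which is one way to check how sensitive the comparison between dimensions is to the binarization.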

Summary and future work
Recent developments in hyperbolic network geometry have challenged the common belief that only one underlying similarity dimension was enough to realistically capture the structure of real complex networks (17). Although the impact of dimension on local properties like the degree distribution can be balanced by rescaling the abruptness of the connection probability through β (16,18,19), as soon as one zooms out to mesoscopic properties like clustering or short cycles, non-trivial effects of dimension arise (21,18,17). Our work adds to this line of research by highlighting the interplay between dimension and community structure. We found that tighter angular bounds for connections in the S^1 model unrealistically restrict the community structures that can be generated on hyperbolic random graphs. Yet, dimensionality can improve current modeling of community structure in the hyperbolic framework, which has already proven successful at capturing so many other important properties all at once (4,3). We found that only one additional dimension expands the inter-community connection possibilities in a way that makes the modeling of modular networks in hyperbolic spaces more realistic.
In the first part of this paper, we have shown that realized edges in hyperbolic random graphs are mostly near zero angular distance in D = 1, whereas this is not the case in higher dimensions. This is quantified through the angular distance distribution between connected nodes given by Eq. (17), which has a mode that gets further away from 0 as D increases. Our main result is the sharp qualitative difference between the D = 1 and D > 1 cases, which is also prominent when the angular distance distribution is averaged over all nodes' latent degrees. It follows that in D > 1, connection patterns between individual nodes are less restricted by their angular proximity. Besides, the number of nearest neighbors for points on spherical manifolds is also an increasing function of the dimension. Thus, the number of nearest angular neighboring clusters of nodes, or soft communities in hyperbolic random graphs, varies with D, which affects community structure modeling.
To assess these effects, in the second part of this paper, we experimented numerically with hyperbolic random graphs in which nodes' angular coordinates were grouped in soft communities in D = 1 and D = 2. By averaging the individual connection probabilities into the probabilities of finding an edge between two distinct communities, we obtained block matrices describing inter-community relationships. The structure of these matrices was studied across different values of the angular dispersion of nodes to highlight differences arising from adding only one similarity dimension. Indeed, block matrices were more bounded to a diagonal shape by angular closeness in D = 1, as quantified by their stable rank, and more uniform in D = 2, as quantified by their Shannon entropy. The idea of having more uniform inter-community block matrices refers to the fact that in D > 1, any given community can be related to more other communities, as also quantified by the average community degree. Akin to the number of nearest neighbors for points, the average community degree was higher in D = 2, especially as soft communities were more dispersed angularly.
In D = 1, mechanisms underlying soft-community formation and modeling in S^1 and H^2 (14,13,51) and the inherent modularity of hyperbolic random graphs (52,53) have been studied. However, we have argued here that hyperbolic network models of varying dimensions are quite distinct when it comes to their potential to model community structure, in particular between D = 1 and D > 1. Community structures as simple as a triplet of strongly interacting communities are poorly captured in D = 1, as exemplified by the African, Asian, and European airports in Fig. 1, whereas the same overlapping communities become disentangled in D = 2, as shown in Fig. 10 and in the animation provided as Supplementary Material. The advantage of D > 1 over D = 1 for representing community network data was also reported in Ref. (51). Since most real-world networks possess some sort of node aggregates, this suggests that the recently discovered higher underlying hyperbolic dimension for most real-world networks by Ref. (17) could be related to their mesoscale structure, although dimension detection is beyond the scope of our paper and should be explored in future work.
Increasing the dimensionality of the underlying hyperbolic spaces might also help to improve the likelihood maximization procedure that is used for inferring hyperbolic coordinates for networked data. As observed for three different embedding algorithms (54,50,13), inferring angular coordinates can be considered the hardest part of the hyperbolic embedding procedure. The use of common neighbors has been considered in Ref. (55) to solve this issue. We have shown that the number of nearest angular neighbors increases with dimension, as does the diversity of community structures that can be generated. Hence our conjecture that higher, albeit still ultra-low, dimensional hyperbolic spaces would more realistically capture the structure of networks and be inferred without such angular degeneracy.

Note that the algorithm used here is based on machine learning heuristics and therefore does not maximize the likelihood that the S^2/H^3 model presented in Sec. "Hyperbolic networks model" generated the topology of the airports network, as standard algorithms like Mercator would (29). The inferred angular positions are therefore used here simply to illustrate the effect dimension has on the spatial distribution of communities.
Another way to see this is with a simple thought experiment. Let us assume that there exist some networks whose topology is naturally reflected by a higher underlying hyperbolic dimension. For instance, one could think of a randomly generated network in H^{D+1} with D > 1, or a real network like the internet which, according to Ref. (17, Fig. 5), could have a dimension as high as D = 7. If such a network were to be embedded in H^2, the geometric neighborhoods of nodes (which node is close to which others) could not be respected since, as we show in Sec. "Number of nearest neighbors," the number of nearest angular neighbors varies with dimension. Therefore, any lower-dimensional embedding will have to compensate by positioning some nodes closer than they should be and others farther, which would affect tasks like link prediction, for instance. We wish we could carry out this experiment, but current embedding algorithms are yet to be generalized to more dimensions (29), or have only been used on relatively small networks (50,51).
Knowledge about community structure has already been shown to impact the performance of very diverse network tasks, from hyperbolic embedding coordinate inference (56)(57)(58) to dimension reduction (59)(60)(61), efficient information communication (62), and resilience (63). A natural way to extend our work would then be to study how this coupling between community structure and the underlying dimension of hyperbolic networks affects tasks which have already been shown to perform better in the framework of hyperbolic network geometry. In particular, greedy routing for information propagation has been one of the first and most interesting assets of hyperbolic geometry (36,2), and should be studied in higher dimensions while explicitly considering community structure.
Another intriguing direction of research would be to use hierarchical network generation mechanisms like geometric branching growth (7) in higher dimensions. Such a procedure generates a hyperbolic network through subdivision of a small initial network into finer and finer descendants, a statistical inverse of geometric renormalization (21). By placing the initial seed network into higher-dimensional spaces, one could compare hierarchical community structure in different dimensions. This would be akin to our numerical study but in a more complex setting where the nodes' angular coordinates follow a hierarchical structure.

Notes
a. Our notation also follows Ref. (3).
b. The notation κ̄ is used for the mean value to avoid confusion with expected values in random graphs.
c. A connectivity or connection probability matrix is a convenient way to encode and illustrate pairwise connection probabilities. Each node corresponds to one column and one row, and the matrix entry is the edge probability between both nodes.