Reducing redundant information to find simplifying patterns in complex datasets and networks is a scientific challenge in many fields of knowledge. Moreover, the detection of data dimensionality remains a difficult problem to solve. An article published in the journal Nature Communication presents a method for inferring the dimensionality of complex networks through the application of hyperbolic geometry, which captures the complexity of real-world relational structures in many diverse domains.
Among the authors of the study are researchers M. Ángeles Serrano and Marián Boguñá, from the Faculty of Physics and the Institute of Complex Systems of UB (UBICS), and Pedro Almargo, from the Superior Technical School of engineering from the University of Seville. The research study provides a multi-dimensional hyperbolic model of complex networks that replicates its connectivity, with ultra-low dimensionality and customizable for each specific network. This allows a better characterization of its structure — for example at the scale of a community — and the improvement of its predictive capacity.
The study reveals unexpected regularities, such as the extremely small dimensions of the molecular networks associated with biological tissues; the slightly higher dimensionality required by social networks and the Internet; and the discovery that brain connectomes are close to three-dimensional in their automatic organization.
Hyperbolic versus Euclidean geometry
The intrinsic geometry of datasets or complex networks is not obvious, which becomes an obstacle in determining the dimensionality of real networks. Another challenge is that the definition of distance must be established based on their relational and connectivity structure, which also requires sophisticated models.
From now on, the new approach is based on the geometry of complex networks, and more precisely, on the configurational geometric model or SD model. “This model, which we have developed in previous work, describes the structure of complex networks based on fundamental principles,” explains lecturer M. Ángeles, ICREA researcher in the Department of Condensed Matter Physics at UB.
“More precisely – he continues – the model postulates a law of interconnection of the elements (or nodes) of the network which is gravitational, therefore of the nodes which are closer in a space of similarity – of spherical geometry in D dimensions – and with more popularity – an additional dimension corresponding to the importance of the node – are more likely to establish connections. »
In the study, the similarity and popularity variables are combined to give rise to the hyperbolic geometry of the model, which appears as the natural geometry representing the hierarchical architecture of complex networks.
In previous studies, the team had applied the simplest version of the one-dimensional SD model — the S1 model – to explain many typical features of real-world networks: the small-world property (the six degrees of separation), the heterogeneous distributions of the number of neighbors per node, and the high levels of transitive relationships (triangular connections that can be illustrated by the phrase my friend’s friend is also my friend).
“Additionally, the application of statistical inference techniques allows us to obtain true network maps in the hyperbolic plane that are congruent with the established model,” she says. “Beyond visualization, these representations have been used in a multitude of tasks, including efficient navigation methods, detection of self-similarity patterns, detection of communities of strongly interacting nodes, and mapping. implementation of a network renormalization procedure that reveals hidden symmetries in the multi-scale organization of complex networks and allows the production of network replicas at reduced or enlarged scales. »
Now, the team infers the dimensionality of the hyperbolic space underlying real networks from dimension-related properties of their geometry. In particular, the work measures higher-order cycle statistics (triangles, squares, pentagons) associated with connections.
A methodology applicable to all complex networks
In computer science, applied techniques are based on data that usually defines the similarity distance between their elements, an approach that involves the construction of graphs that are mapped onto a latent space of Euclidean features.
“Our estimates of the dimensionality of complex networks are much lower than our estimates based on Euclidean space, because hyperbolic space is better suited to represent the hierarchical structure of real complex networks. For example, the Internet only requires D=7 dimensions to be mapped in the hyperbolic space of our model, whereas this name is multiplied by six and scales to D=47 in one of the newer techniques using l Euclidean space”, explains Professor Marián Boguñá.
Additionally, complex data mapping techniques usually assume a latent space, with a predetermined dimension name, or implement heuristic techniques to find an appropriate value. Thus, the new method is based on a model that does not need the spatial mapping of the network to determine the dimension of its geometry.
In the field of network science, many methodologies use shortest distances to study network connectivity structure (shortest paths) as a metric space. However, these distances are strongly affected by the small-world property and do not provide a wide range of distance values.
“Our model uses a completely different definition of distance based on an underlying hyperbolic space, and we don’t need to map the network. Our methodology is applicable to any real network or data series with a complex structure and a size which is usually several thousands or tens of thousands of nodes but can reach hundreds of thousands in a reasonable computation time”, explains Mr. Angeles Serrano.
What is the real dimensionality of social networks and the Internet?
social networks and the Internet is higher (between 6 and 9) compared to networks in other fields, according to the results of the study. However, it remains very low — 6 to 7 times lower — compared to that obtained by other methods. This reflects the fact that the interactions in these systems are more complex and determined by a greater variety of factors.
In contrast, friendship-based social networks top the dimensionality ranking. “This is an unexpected result, as one might think that friendship is a type of freer affective relationship, but our results are related to the fact that homophily in human interactions is determined by a multitude of sociological factors such as age, sex, social class, beliefs, attitudes or interests”, explains Mr. Ángeles Serrano.
In the case of the Internet, even though it is a technological network, its greater dimensionality reflects the fact that for an autonomous system, connecting does not only mean accessing the system, as one might think at first glance. on board. Rather, many different factors influence the formation of these connections and therefore a variety of other relationships may be present (e.g. supplier-customer, peer-to-peer, exchange-based peering, etc. ).
“What is really surprising, both for social networks and the Internet, is that our theoretical framework – which does not use any annotation on connections beyond their existence – is able to capture this multidimensional reality that does not ‘is not explicit in our data,’ concludes the team, which is currently working to construct hyperbolic multidimensional maps of complex networks consistent with the theoretical framework established by the SD model.