Node similarity as the basic principle behind connectivity in complex networks

by Matthias Scholz

Who is connecting to whom in social communities? This work reveals that node similarity is the fundamental mechanism that gives complex networks its typical scale-free power-law characteristics. It turns out that power-law node-degree distributions are restricted to only sparsely connected communities. More densely connected communities show an increasing divergence from power-law. The proposed similarity model covers the full observed diversity in node-degree distributions of real networks. It can be used to analyse the topological transition from weakly to strongly connected societies, as well as for investigating epidemic spreading in highly connected communities.


From lowly connected to highly connected societies

Complex networks of a highly connected society

Node-degree distributions in increasingly densely connected networks. Complex networks of different link densities show very different node-degree distributions. While sparsely connected networks (A) show the typically observed scale-free power-law like distribution, an increased density of interconnections (E) leads to distributions that are very distinct from power-law. This can be analyzed by using a similarity model in which different thresholds can be used to define a link and hence to generate networks of different densities.



Matlab code of the similarity model

Example of generating a power-law like network topology by using a similarity model based on Euclidean distance:

  % generating a random data matrix according to m=100 properties of N=8000 nodes
        X=randn(100,8000); 

  % Euclidean distance between column-vectors in X
        XX = sum(X.^2,1);
        D = XX( ones(size(XX,2),1) ,:) + XX( ones(size(XX,2),1) ,:)' - 2*X'*X;
        D = sqrt(D); 
    
  % set threshold of similarity for defining a link   
  % (for getting a power-law like distributed network: th=11 in case of a 100x8000 random matrix X)
        th = 11  

  % get the adjacency matrix    
        A= D<th;  
  

The node-degree distribution can be calculated by using the script cn_node_degree_distribution.m and then plotted in log-log scales by:

     [k,pk,nk] = cn_node_degree_distribution(A);
     loglog(k,nk,'.')
  

www.network-science.org Matthias Scholz