Marcus Märtens, Fernando Kuipers and Piet Van Mieghem
Follow this presentation on your device: https://coolrunning.github.io/symbreg_pres
Network Architectures and Services
Adjacency matrix $A = \begin{bmatrix} 0 & 1 & 1 & 1 & 1 & 0 \\ 1 & 0 & 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 & 0 \\ \end{bmatrix}$
Laplacian matrix $Q = \begin{bmatrix} 4 & -1 & -1 & -1 & -1 & 0 \\ -1 & 3 & 0 & 0 & -1 & -1 \\ -1 & 0 & 3 & -1 & -1 & 0 \\ -1 & 0 & -1 & 2 & 0 & 0 \\ -1 & -1 & -1 & 0 & 4 & -1 \\ 0 & -1 & 0 & 0 & -1 & 2 \\ \end{bmatrix}$
In this example: $N = 6$ and $L = 9$.
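The Laplacian follows directly from the adjacency matrix as $Q = D - A$, where $D$ is the diagonal matrix of node degrees. A minimal sketch for the example graph above (0-based indices):

```python
# Adjacency matrix of the 6-node example graph (0-based indices).
A = [
    [0, 1, 1, 1, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [1, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
]

N = len(A)
degrees = [sum(row) for row in A]   # D = diag(degrees)
L = sum(degrees) // 2               # each edge is counted twice

# Laplacian Q = D - A
Q = [[(degrees[i] if i == j else 0) - A[i][j] for j in range(N)]
     for i in range(N)]
```

Every row of $Q$ sums to zero, which is why $\mu_N = 0$ always holds.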
How to count the number of distinct triangles in a network?
from itertools import permutations

def triangles(G, A):
    count = 0
    # every ordered triple of distinct nodes; each triangle
    # appears once per ordering of its nodes, i.e. 6 times
    for n1, n2, n3 in permutations(G.nodes(), 3):
        if A[n1, n2] and A[n2, n3] and A[n3, n1]:
            count += 1
    return count // 6
Such algorithms need exhaustive enumeration over all triples of nodes.
In this example we have 4 triangles.
Why are we interested in triangles anyway?
$$\text{Clustering coefficient }C = \frac{3 \times \text{number of triangles}}{\text{number of connected triplets of nodes}}$$
Important metric to classify networks.
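For the example graph (4 triangles; degrees 4, 3, 3, 2, 4, 2) the clustering coefficient can be computed directly. A sketch, counting connected triplets as pairs of neighbours around a centre node:

```python
from itertools import combinations

# Adjacency matrix of the 6-node example graph (0-based indices).
A = [
    [0, 1, 1, 1, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [1, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
]
N = len(A)

# Triangles: unordered triples of pairwise adjacent nodes.
triangles = sum(1 for i, j, k in combinations(range(N), 3)
                if A[i][j] and A[j][k] and A[k][i])

# Connected triplets: choose 2 neighbours of each centre node.
degrees = [sum(row) for row in A]
triplets = sum(d * (d - 1) // 2 for d in degrees)

C = 3 * triangles / triplets
```

Here `triangles` is 4 and `triplets` is 20, giving $C = 12/20 = 0.6$.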
Spectral decomposition of the Adjacency Matrix:
$A = X \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_{N} \\ \end{bmatrix} X^T $
$\lambda_1 \geq \ldots \geq \lambda_{N}$ are the eigenvalues of the network.
$\mu_1 \geq \ldots \geq \mu_{N}$ are the Laplacian eigenvalues
(We will need them later)
In this example the eigenvalues are:
$3.182, 1.247, -0.445,$
$-0.594, -1.588, -1.802$
Computing the number of triangles $\blacktriangle$ becomes simple, once you know the eigenvalues.
$\blacktriangle = \frac{1}{6} \sum\limits_{i = 1}^{N} \lambda_i^3$
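A quick numerical check of this formula on the example graph, using NumPy's symmetric eigensolver:

```python
import numpy as np

# Adjacency matrix of the 6-node example graph.
A = np.array([
    [0, 1, 1, 1, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [1, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
])

eig = np.linalg.eigvalsh(A)          # eigenvalues in ascending order
triangles = round(np.sum(eig ** 3) / 6)
```

No enumeration of triples is needed; the count drops out of the spectrum.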
Random network
Barabási-Albert ($N$ = 5000)
Network of AS
tech-WHOIS, Mahadevan, P., Krioukov, D. et al. ($N$ = 7476)
The adjacency and the Laplacian spectra might not be as intuitive for humans as the graph representation, but both contain the complete information about the topology of the network.
Length of the longest of all shortest paths in a network.
from itertools import combinations

def diameter(G):
    diam = 0
    # every unordered pair of nodes
    for src, dest in combinations(G.nodes(), 2):
        path = shortest_path(G, src, dest)
        # hop count = number of nodes on the path minus one
        diam = max(len(path) - 1, diam)
    return diam
The exact formula for the network diameter $\rho$ implies this exhaustive enumeration:
$$\rho = \max\limits_{src,\,dest} \; \min\limits_{\text{paths } P:\, src \to dest} \operatorname{len}(P)$$
In this example the diameter is 3.
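Instead of materialising every shortest path, the diameter can be computed with one breadth-first search per source node. A sketch on the example graph (0-based node labels, assuming the graph is connected):

```python
from collections import deque

# Edge list of the 6-node example graph (0-based labels).
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 4),
         (1, 5), (2, 3), (2, 4), (4, 5)]
N = 6
adj = [[] for _ in range(N)]
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

def eccentricity(src):
    """Largest hop distance from src, via breadth-first search."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())

diameter = max(eccentricity(v) for v in range(N))
```

For this graph the worst pair is node 3 and node 5 (1-based: 4 and 6), at 3 hops.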
Why are we interested in the diameter anyway?
The diameter is the worst-case hop count between any pair of nodes,
i.e. it lower-bounds the time for spreading/flooding to reach the whole network.
Pick the expression that minimizes the sum of absolute errors:
$f(\hat{\rho}) = \sum\limits_{G} \left| \rho(G) - \hat{\rho}(G) \right|$
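The fitness of a candidate expression is just this sum over the training graphs. A sketch with a hypothetical toy training set, where each graph is reduced to its feature tuple $(N, L)$:

```python
def fitness(rho_hat, training_set):
    """Sum of absolute errors of a candidate diameter formula
    over (features, true_diameter) pairs."""
    return sum(abs(true_rho - rho_hat(features))
               for features, true_rho in training_set)

# Hypothetical toy training set: two path graphs, features (N, L).
training_set = [((5, 4), 4), ((6, 5), 5)]

# Candidate expression: rho = N - 1 (exact for path graphs).
guess = lambda f: f[0] - 1
```

`fitness(guess, training_set)` is 0 here, since the guess is exact on paths.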
Triangles $\blacktriangle$
Diameter $\rho$
Exhaustive learning on all networks is no longer feasible.
Networks need to be sampled to create a good (diverse) learning set.
Augmented path graphs
$\rho = 11$, $\rho = 10$, $\rho = 8$
Barbell graphs
$\rho = 6$, $\rho = 8$, $\rho = 8$
Mixed graphs
50% augmented path + 50% barbell
parameter | value |
---|---|
fitness function | sum of absolute errors |
evolutionary strategy | 1+4 |
mutation type and rate | probabilistic (0.1) |
node layout | 1 row 200 columns |
levels-back | unrestricted |
number of generations | 200000 |
operators | $+,-,\times,\div,\cdot^2,\cdot^3,\sqrt{\cdot},\log$ |
constants | $1,2,3,4,5,6,7,8,9$ |
featuresets | A) $N,L,\lambda_1,\lambda_2,\lambda_3,\lambda_N$ B) $N,L,\mu_1,\mu_{N-1},\mu_{N-2},\mu_{N-3}$ |
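The (1+4) evolutionary strategy from the table is simple: keep one parent, create four mutated offspring, and promote the best individual. A generic sketch with a toy real-valued genotype standing in for the actual CGP genome (the mutation rate 0.1 matches the table; genome encoding, node layout and everything else here is illustrative only):

```python
import random

def evolve(fitness, genome_length=10, generations=200, rate=0.1, seed=0):
    """Minimise `fitness` with a (1+4) ES and probabilistic mutation."""
    rng = random.Random(seed)
    parent = [rng.uniform(-1, 1) for _ in range(genome_length)]
    for _ in range(generations):
        offspring = []
        for _ in range(4):
            # each gene mutates independently with probability `rate`
            child = [rng.uniform(-1, 1) if rng.random() < rate else g
                     for g in parent]
            offspring.append(child)
        # elitist selection: the parent survives unless beaten
        parent = min(offspring + [parent], key=fitness)
    return parent

best = evolve(lambda g: sum(x * x for x in g))
```

Because selection always keeps the best of parent and offspring, fitness is monotonically non-increasing over generations.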
$\begin{equation} \hat{\rho} = \frac{\log{\left (2 L \mu_{N-3} + 6 \right )} + 6}{\log{\left (L \mu_{N-3} + \sqrt{5} \sqrt{\frac{1}{\mu_{N-1}}} \right )}} + \sqrt{5} \sqrt{\frac{1}{\mu_{N-1}}} + 3 \sqrt{82} \sqrt{\frac{1}{729 L \mu_{N-2} \mu_{N-3} - 5}} \end{equation}$
$\begin{equation} \hat{\rho} = N - \frac{1 - \frac{1}{\left(L - N\right)^{\frac{3}{2}}}}{\frac{6 - \frac{6}{\left(L - N\right)^{\frac{3}{2}}}}{\sqrt{L - N}} + 4 \sqrt{L - N}} - 2 \sqrt{L - N} - \frac{1}{\sqrt{L - N}} \end{equation}$
$\begin{equation} \hat{\rho} = \sqrt{\sqrt{N} + \frac{45 \mu_{N-3}}{\left(\mu_{N-1} + \mu_{N-3}\right)^{2}} + \log{\left (\frac{216}{\left(\mu_{N-1} + \mu_{N-3}\right)^{2}} \right )} - \frac{16}{9 \mu_{N-3}} + \frac{8 \sqrt[4]{\mu_{N-3}}}{L \mu_{N-1} \mu_{N-2}}} \end{equation}$
$\begin{equation} \hat{\rho} = \frac{\log{\left (2 L \mu_{N-3} + 6 \right )} + 6}{\log{\left (L \mu_{N-3} + \sqrt{5} \sqrt{\frac{1}{\mu_{N-1}}} \right )}} + \sqrt{5} \sqrt{\frac{1}{\mu_{N-1}}} + 3 \sqrt{82} \sqrt{\frac{1}{729 L \mu_{N-2} \mu_{N-3} - 5}} \end{equation}$
Comparison to known analytical results from graph theory:
Upper bound by Chung, F.R., Faber, V. and Manteuffel, T.A.: "An Upper bound on the diameter of a graph from eigenvalues associated with its Laplacian."
(See also Van Dam, E.R., Haemers, W.H.: "Eigenvalues and the diameter of graphs.")
$\rho \leq \left\lfloor \frac{\cosh^{-1}(N-1)}{\cosh^{-1}\left(\frac{\mu_1 + \mu_{N-1}}{\mu_1 - \mu_{N-1}}\right)}\right\rfloor + 1$
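This bound only needs the extreme Laplacian eigenvalues $\mu_1$ and the algebraic connectivity $\mu_{N-1}$. A sketch evaluating it on the 6-node example graph (true diameter 3), assuming a connected graph so that $\mu_{N-1} > 0$:

```python
import numpy as np

# Adjacency matrix of the 6-node example graph.
A = np.array([
    [0, 1, 1, 1, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [1, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
])
N = len(A)

# Laplacian Q = D - A and its eigenvalues mu_1 >= ... >= mu_N = 0.
Q = np.diag(A.sum(axis=1)) - A
mu = np.sort(np.linalg.eigvalsh(Q))[::-1]
mu1, mu_n1 = mu[0], mu[-2]   # largest eigenvalue, algebraic connectivity

# Chung-Faber-Manteuffel upper bound on the diameter.
bound = int(np.floor(np.arccosh(N - 1) /
                     np.arccosh((mu1 + mu_n1) / (mu1 - mu_n1)))) + 1
```

By the theorem, `bound` can never fall below the true diameter of a connected graph.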
parameter | value |
---|---|
$N$ | 379 |
$L$ | 914 |
true diameter $\rho$ | 17 |
approximated diameter $\hat{\rho}$ | 21.48 |
upper bound | 160.09 |
ca-netscience
source: http://www.networkrepository.com
parameter | value |
---|---|
$N$ | 4941 |
$L$ | 6594 |
true diameter $\rho$ | 46 |
approximated diameter $\hat{\rho}$ | 97.52 |
upper bound | 749.49 |
inf-power
source: http://www.networkrepository.com
parameter | value |
---|---|
$N$ | 1458 |
$L$ | 1948 |
true diameter $\rho$ | 19 |
approximated diameter $\hat{\rho}$ | 18.59 |
upper bound | 207.52 |
bio-yeast
source: http://www.networkrepository.com