Authors: | |
David Cohen-Steiner | INRIA |
Weihao Kong | Stanford University |
Christian Sohler | TU Dortmund |
Gregory Valiant | Stanford University |
Introduction:
The authors study the problem of approximating the spectrum. They present a sublinear time algorithm that, given the ability to query a random node in the graph and select a random neighbor of a given node, computes a succinct representation of an approximation.
Abstract:
The spectrum of a network or graph $G=(V,E)$ with adjacency matrix A , consists of the eigenvalues of the normalized Laplacian $L= I - D^-1/2 A D^-1/2 $. This set of eigenvalues encapsulates many aspects of the structure of the graph, including the extent to which the graph posses community structures at multiple scales. We study the problem of approximating the spectrum, $łambda = (łambda_1,\dots,łambda_|V| )$, of G in the regime where the graph is too large to explicitly calculate the spectrum. We present a sublinear time algorithm that, given the ability to query a random node in the graph and select a random neighbor of a given node, computes a succinct representation of an approximation $\widetilde łambda = (\widetilde łambda_1,\dots,\widetilde łambda_|V| )$, such that $\|\widetilde łambda - łambda\|_1 łe ε |V|$. Our algorithm has query complexity and running time $exp(O(1/\eps))$, which is independent of the size of the graph, $|V|$. We demonstrate the practical viability of our algorithm on synthetically generated graphs, and on 15 different real-world graphs from the Stanford Large Network Dataset Collection, including social networks, academic collaboration graphs, and road networks. For the smallest of these graphs, we are able to validate the accuracy of our algorithm by explicitly calculating the true spectrum; for the larger graphs, such a calculation is computationally prohibitive. The spectra of these real-world networks reveal insights into the structural similarities and differences between them, illustrating the potential value of our algorithm for efficiently approximating the spectrum of large large networks.