D2K: Scalable Community Detection In Massive Networks Via Small-Diameter K-Plexes

Authors:
Alessio Conte National Institute of Informatics, Japan
Tiziano De Matteis University of Pisa
Daniele De Sensi University of Pisa
Roberto Grossi Universita' di Pisa
Andrea Marino Universita' di Pisa
Luca Versari Universita' di Pisa

Introduction:

This paper studies k-plexes, a well known pseudo-clique model for network communities.The paper's goal is to detect large communities in today’s real-world graphs. The authors present D2K, which is the first algorithm able to find large k-plexes of very large graphs in just a few minutes.

Abstract:

This paper studies k-plexes, a well known pseudo-clique model for network communities. In a k-plex, each node can miss at most k-1 links. Our goal is to detect large communities in today’s real-world graphs which can have hundreds of millions of edges. While many have tried, this task has been elusive so far due to its computationally challenging nature: k-plexes and other pseudo-cliques are harder to find and more numerous than cliques, a well known hard problem. We present D2K, which is the first algorithm able to find large k-plexes of very large graphs in just a few minutes. The good performance of our algorithm follows from a combination of graph-theoretical concepts, careful algorithm engineering and a high-performance implementation. In particular, we exploit the low degeneracy of real-world graphs, and the fact that large enough k-plexes have diameter 2. We validate a sequential and a parallel/distributed implementation of D2K on real graphs with up to half a billion edges.

You may want to know: