Pareto-optimal Community Search on Large Bipartite Graphs

In many real-world applications, bipartite graphs naturally model relationships between two types of entities. Community discovery over bipartite graphs is a fundamental problem that has recently attracted much attention. However, all existing studies overlook the weight (e.g., influence or importance) of vertices when forming a community, and thus miss useful properties of the community. In this paper, we propose a novel cohesive subgraph model named the Pareto-optimal (α, β)-community, which is the first to consider both structural cohesiveness and vertex weights on bipartite graphs. The proposed model follows the concept of the (α, β)-core by imposing a degree constraint on each type of vertex, and integrates Pareto-optimality to model the weight information from the two types of vertices. An online query algorithm is developed to retrieve Pareto-optimal (α, β)-communities with time complexity O(p · m), where p is the number of resulting communities and m is the number of edges in the bipartite graph G. To support efficient query processing over large graphs, we also develop index-based approaches. A complete index i is proposed, and the query algorithm based on i achieves query processing time linear in the result size (i.e., the algorithm is optimal). Nevertheless, the index i incurs prohibitively expensive space complexity. To strike a balance between query efficiency and space complexity, a space-efficient compact index 𝕀 is proposed. Computation-sharing strategies are devised to improve the efficiency of constructing the index 𝕀. Extensive experiments on 9 real-world graphs validate both the effectiveness and the efficiency of our query processing algorithms and indexing techniques.
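To make the degree constraints concrete, the following is a minimal sketch of the standard peeling computation of an (α, β)-core, the structure that the proposed model builds on. It is not the query algorithm of this paper and it ignores vertex weights and Pareto-optimality. The function name `alpha_beta_core`, the edge-list input format, and the assumption that α and β are at least 1 are illustrative choices, not taken from the paper.

```python
from collections import deque


def alpha_beta_core(edges, alpha, beta):
    """Return (upper, lower) vertex sets of the (alpha, beta)-core.

    edges: iterable of (u, v) pairs, with u in the upper layer and v in
    the lower layer (hypothetical input format). Every surviving upper
    vertex has degree >= alpha and every surviving lower vertex has
    degree >= beta inside the returned subgraph.
    """
    adj_u, adj_v = {}, {}
    for u, v in set(edges):  # set() drops duplicate edges
        adj_u.setdefault(u, set()).add(v)
        adj_v.setdefault(v, set()).add(u)

    removed_u, removed_v = set(), set()
    queue = deque()

    # Seed the queue with every vertex that already violates its constraint.
    for u, nbrs in adj_u.items():
        if len(nbrs) < alpha:
            removed_u.add(u)
            queue.append(("U", u))
    for v, nbrs in adj_v.items():
        if len(nbrs) < beta:
            removed_v.add(v)
            queue.append(("L", v))

    # Peel: deleting a vertex lowers its neighbours' degrees, which may
    # push them below their own threshold in turn.
    while queue:
        side, x = queue.popleft()
        if side == "U":
            for v in adj_u[x]:
                if v in removed_v:
                    continue
                adj_v[v].discard(x)
                if len(adj_v[v]) < beta:
                    removed_v.add(v)
                    queue.append(("L", v))
        else:
            for u in adj_v[x]:
                if u in removed_u:
                    continue
                adj_u[u].discard(x)
                if len(adj_u[u]) < alpha:
                    removed_u.add(u)
                    queue.append(("U", u))

    return set(adj_u) - removed_u, set(adj_v) - removed_v


# Toy bipartite graph: u3 has a single neighbour.
toy_edges = [("u1", "v1"), ("u1", "v2"), ("u2", "v1"), ("u2", "v2"), ("u3", "v1")]
upper, lower = alpha_beta_core(toy_edges, alpha=2, beta=2)
```

On this toy input with α = β = 2, only u3 is peeled (its degree is 1), leaving the upper vertices u1, u2 and the lower vertices v1, v2. Each edge is examined a constant number of times, so the sketch runs in time linear in the number of edges.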