ASAP: Fast, Approximate Pattern Mining At Scale

Authors:
Anand Padmanabha Iyer UC Berkeley
Zaoxing Liu Johns Hopkins University
Xin Jin Johns Hopkins University
Shivaram Venkataraman Microsoft Research / University of Wisconsin
Vladimir Braverman Johns Hopkins University
Ion Stoica UC Berkeley

Introduction:

This paper presents ASAP, a fast, approximate computation engine for graph pattern mining.To enable the users to navigate the trade-off between the result accuracy and latency, the authors propose a novel approach to build the Error-Latency Profile (ELP) for a given computation. This paper presents ASAP, a fast, approximate computation engine for graph pattern mining.To enable the users to navigate the trade-off between the result accuracy and latency, we propose a novel approach to build the Error-Latency Profile (ELP) for a given computation.

Abstract:

While there has been a tremendous interest in processing data that has an underlying graph structure, existing distributed graph processing systems take several minutes or even hours to mine simple patterns on graphs. This paper presents ASAP, a fast, approximate computation engine for graph pattern mining. ASAP leverages state-of-the-art results in graph approximation theory, and extends it to general graph patterns in distributed settings. To enable the users to navigate the trade-off between the result accuracy and latency, we propose a novel approach to build the Error-Latency Profile (ELP) for a given computation. We have implemented ASAP on a general-purpose distributed dataflow platform, and evaluated it extensively on several graph patterns. Our experimental results show that ASAP outperforms existing exact pattern mining solutions by up to 77×. Further, ASAP can scale to graphs with billions of edges without the need for large clusters.

You may want to know: