On compressing massive streaming graphs with Quadtrees

2015 
Social networks are constantly changing as new members join, existing members leave, and ‘followers’ or ‘friends’ are formed and disappear. The model that captures this constantly changing graph is the streaming graph model. Given a massive graph data stream wherein the number of nodes is in the order of millions and the number of edges is the tens of millions, we propose a simple algorithm to compress this graph without having read in the entire graph into the main memory. Our algorithm uses the quadtree data structure that is implicitly constructed to produce the compressed graph output. As a result of this implicit construction, our algorithm allows for node and edge additions/deletions that directly modifies the output compressed graph. We further develop algorithms to solve edge queries (is there any between two nodes?) and node queries (for a given node, list all its neighbors) that directly operates on the compressed graph. We have performed extensive empirical evaluations of our algorithms using publicly available, large social networks such as LiveJournal, Pokec, Twitter, and others. Our empirical evaluation is based on several parameters including time to compress, memory required by the compression algorithm, size of compressed graph, and time and memory size required to execute queries. We have also presented extensions to the compression algorithm that we have developed.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    9
    References
    13
    Citations
    NaN
    KQI
    []