A Distributed Interactive Cube Exploration System

2013 
As the amount of data generated from web, mobile, and social media increases rapidly, analytics using OLAP (Online Analytical Processing) data cubes is becoming increasingly popular among organizations. In a typical scenario, this analysis is performed using BI tools to quickly get insights from pre-materialized, multi-dimensional aggregated data. We introduce DICE, a distributed interactive system that uses a novel session-oriented model for online data cube exploration and is designed to provide the user with interactive sub-second latencies for specified accuracy levels. We provide a novel framework that combines three concepts: faceted exploration of data cubes, speculative execution of queries, and query execution over sampled data. Our system combines an intuitive frontend for faceted cube exploration with a distributed query execution backend that guarantees interactive latencies. We catalog the challenges encountered in building such a system and discuss its design considerations, implementation details, and optimizations. Experiments demonstrate that cube exploration using DICE at billion-tuple scale is at least 33% faster than current approaches. As shown in our video demonstration, DICE allows the user to fluidly interact with billion-tuple datasets while maintaining sub-second interactive response times.
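To make "query execution over sampled data" with "specified accuracy levels" concrete, the following is a minimal sketch of a group-by SUM estimated from a row-level sample, with a standard error attached to each group. The abstract does not describe DICE's actual sampling scheme or error model, so everything here is an assumption for illustration: the function name `sampled_group_sum`, the Bernoulli (independent per-row) sampling, and the Horvitz-Thompson scaling are hypothetical stand-ins, not the system's implementation.

```python
import math
import random
from collections import defaultdict


def sampled_group_sum(rows, rate, seed=0):
    """Estimate SUM(measure) per group from a Bernoulli sample of rows.

    rows: iterable of (group_key, measure) pairs (hypothetical cube facts).
    rate: probability in (0, 1] that any given row is kept in the sample.
    Returns {group_key: (estimate, standard_error)}.
    """
    if not 0.0 < rate <= 1.0:
        raise ValueError("rate must be in (0, 1]")

    rng = random.Random(seed)
    sample_sum = defaultdict(float)      # sum of sampled measures per group
    sample_sum_sq = defaultdict(float)   # sum of squared sampled measures

    for key, value in rows:
        # Keep each row independently with probability `rate`.
        if rng.random() < rate:
            sample_sum[key] += value
            sample_sum_sq[key] += value * value

    result = {}
    for key, total in sample_sum.items():
        # Horvitz-Thompson estimate: scale the sampled sum by 1 / rate.
        estimate = total / rate
        # Unbiased variance estimate under Bernoulli sampling:
        # (1 - rate) / rate^2 * sum of squared sampled values.
        std_err = math.sqrt((1.0 - rate) * sample_sum_sq[key]) / rate
        result[key] = (estimate, std_err)
    return result


if __name__ == "__main__":
    facts = [("US", 10.0), ("US", 12.0), ("EU", 7.0), ("EU", 9.0)] * 1000
    for region, (est, err) in sorted(sampled_group_sum(facts, rate=0.1).items()):
        print(f"{region}: {est:.0f} +/- {err:.0f}")
```

A lower `rate` touches fewer rows, which is what buys latency, at the cost of a wider standard error. A system targeting a specified accuracy level could in principle raise the rate until the reported error falls under the user's threshold, though how DICE makes that trade-off is not stated in this abstract.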