Distributed and interactive cube exploration

2014 
Interactive ad-hoc analytics over large datasets has become an increasingly popular use case. We detail the challenges encountered when building a distributed system that allows the interactive exploration of a data cube. We introduce DICE, a distributed system that uses a novel session-oriented model for data cube exploration, designed to provide the user with interactive sub-second latencies for specified accuracy levels. A novel framework is provided that combines three concepts: faceted exploration of data cubes, speculative execution of queries and query execution over subsets of data. We discuss design considerations, implementation details and optimizations of our system. Experiments demonstrate that DICE provides a sub-second interactive cube exploration experience at the billion-tuple scale that is at least 33% faster than current approaches.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    61
    References
    119
    Citations
    NaN
    KQI
    []