Efficient Range Distribution Query for Visualizing Scientific Data

2014 
Visualization applications implicitly run queries on the data to retrieve distributions and statistical measures derivable from distributions. Distribution-based data summaries can substitute for the raw data to answer statistical queries of different kinds. However, frequent access to the raw data is no longer practical, if possible at all, for answering a large number of queries on large-scale data. Our work addresses this issue by accelerating the range distribution query, which returns the distribution of an axis-aligned query region. Keeping such queries interactive is challenging because their workload and response time scale up with the data size and the query size. In this paper, we present a framework for answering range distribution queries for arbitrary regions in near-constant time, regardless of data and query size. We adapt an integral-histogram-based data structure to bound the workload, which is a combination of computation, I/O, and communication cost. We propose two novel transformations of this data structure, a decomposition and a similarity-driven indexing, to reduce the huge storage cost associated with it. In addition to studying the performance of range distribution queries, we also demonstrate the benefits that our technique offers to visualization applications that directly or indirectly require distributions.
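
To make the idea of an integral histogram concrete, the following is a minimal sketch of the basic, untransformed structure on a small 2D scalar field. It is not the paper's implementation: the function names `build_integral_histogram` and `range_histogram`, the uniform binning over an assumed value range `[lo, hi]`, and the half-open region convention are all assumptions made for illustration. The decomposition and similarity-driven indexing mentioned in the abstract are not shown.

```python
def build_integral_histogram(data, num_bins, lo, hi):
    """Build ih so that ih[y][x][b] counts the values that fall in bin b
    within the sub-grid data[0:y][0:x] (hypothetical helper)."""
    rows, cols = len(data), len(data[0])
    width = (hi - lo) / num_bins
    # One histogram per grid corner, with an extra row and column of zeros.
    ih = [[[0] * num_bins for _ in range(cols + 1)] for _ in range(rows + 1)]
    for y in range(rows):
        for x in range(cols):
            # Uniform binning; the value hi is clamped into the last bin.
            b = min(int((data[y][x] - lo) / width), num_bins - 1)
            cell = ih[y + 1][x + 1]
            up, left, diag = ih[y][x + 1], ih[y + 1][x], ih[y][x]
            for k in range(num_bins):
                cell[k] = up[k] + left[k] - diag[k]
            cell[b] += 1
    return ih


def range_histogram(ih, y0, x0, y1, x1):
    """Histogram of the axis-aligned region rows [y0, y1), cols [x0, x1).

    Only four stored histograms are combined, so the cost depends on the
    number of bins but not on the size of the region or of the data."""
    a, b, c, d = ih[y1][x1], ih[y0][x1], ih[y1][x0], ih[y0][x0]
    return [a[k] - b[k] - c[k] + d[k] for k in range(len(a))]


# Toy 3x3 field with values in [0, 1], split into two bins.
field = [[0.1, 0.9, 0.4],
         [0.6, 0.2, 0.8],
         [0.3, 0.7, 0.5]]
ih = build_integral_histogram(field, num_bins=2, lo=0.0, hi=1.0)
top_left = range_histogram(ih, 0, 0, 2, 2)
```

In this sketch every grid corner stores a full histogram, so the structure is `num_bins` times larger than the raw field. That is the kind of storage cost the abstract says the two proposed transformations are meant to reduce.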