HASTE: A Distributed System for Hybrid and Adaptive Processing on Streaming Spatial-Textual Data

2021 
Streaming spatial-textual data, which contains both geographic and textual information (e.g., geo-tagged tweets), is growing at an unprecedented rate. Continuous spatial-textual queries, which retrieve real-time results continuously over large-scale spatial-textual streams, are a basic operation on such data and call for efficient distributed processing. However, existing proposals are either spatial-aware only or exploit textual information only superficially for pruning. We propose a distributed system, called HASTE, for hybrid and adaptive processing on streaming spatial-textual data. The novelty lies in three aspects: (1) we propose a novel method to reduce the workload beforehand by dividing objects and queries into mutually exclusive types; (2) we develop a novel load partitioning strategy and a novel cost model that consider both spatial and textual properties; (3) we design a multi-level load adjustment strategy that adaptively copes with different degrees of load imbalance. We report on extensive experiments with real-world data that offer insight into the performance of the solution and show that it is capable of outperforming state-of-the-art proposals.