Disatra: A Real-Time Distributed Abstract Trajectory Clustering

2021 
Trajectory clustering is a building block of many applications in trajectory data mining. The ubiquity of positioning devices now generates massive trajectory data continuously, which enables various real-time applications such as traffic congestion analysis, accident detection, and traveling group detection. However, these applications rely on an efficient distributed algorithm to cluster trajectory streams in real time. In this paper, we propose a real-time distributed trajectory clustering algorithm to solve this problem. The algorithm starts with a trajectory abstraction process, which compresses trajectories of arbitrary length into uniform data structures, called abstract trajectories, to address data skew. Then, a Geohash-based indexing strategy partitions the abstract trajectories so that clustering can be performed locally with no cross-node interaction. Finally, we design a density-based clustering algorithm on abstract trajectories that achieves accuracy similar to that of existing clustering methods applied to the original trajectories, but with much higher efficiency. Extensive experiments conducted on a real-world dataset show that our approach generates similar clustering results with significantly higher throughput and lower latency, which enables online clustering.
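To illustrate the partitioning idea, the following is a minimal sketch of standard Geohash encoding used as a partition key. It is not the paper's indexing strategy, whose details the abstract does not give. The function names `geohash_encode` and `partition_by_geohash`, the single representative point per abstract trajectory, and the `precision` parameter are all assumptions made for illustration.

```python
from collections import defaultdict

# Standard Geohash base-32 alphabet (digits plus letters, omitting a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a Geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # Geohash interleaves bits, starting with longitude.
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # Every 5 bits form one base-32 character.
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def partition_by_geohash(points, precision=6):
    """Group hypothetical (traj_id, lat, lon) records by Geohash cell.

    Assumption: each abstract trajectory is summarized by one representative point.
    """
    cells = defaultdict(list)
    for traj_id, lat, lon in points:
        cells[geohash_encode(lat, lon, precision)].append(traj_id)
    return dict(cells)
```

Records that fall in the same cell share a key and could be routed to the same worker. This sketch does not handle trajectories that lie close together but on opposite sides of a cell border.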