Hierarchical Semantics Matching For Heterogeneous Spatio-temporal Sources

2021 
Spatio-temporal data are semantically valuable information used in various analytical tasks to identify spatially relevant and temporally limited correlations within a domain. The growing availability of such data from multiple, typically highly heterogeneous sources is attracting increasing attention. However, these sources often lack shared keys that interconnect them, which makes their integration a challenging problem. For example, publicly available parking data consisting of point data on parking facilities with fluctuating occupancy cannot be directly correlated with static location data on parking spaces. The two data sets come from distinct sources and describe different aspects, yet parking spaces and fluctuating occupancy are part of the same semantic model object. Especially for ad hoc analytical tasks on integrated models, these missing relationships cannot be handled with the join operations customary in relational databases. The reasons are the lack of equijoin relationships, the reliance on string equality comparisons, and the additional overhead of loading the data before processing. This paper addresses the optimization problem of finding suitable join partners in the absence of equijoin relations for heterogeneous spatio-temporal data, in a way that is applicable to ad hoc analytics. We propose a graph-based approach that achieves good recall and performance scaling by hierarchically separating the semantics along spatial, temporal, and domain-specific dimensions. We evaluate our approach on public data, showing that it is suitable for many standard join scenarios and highlighting its limitations.
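To illustrate the general idea of matching records without a shared key, the following minimal Python sketch filters candidate pairs level by level: first spatially, then temporally, then by a domain label. It is not the graph-based method of the paper. The `Record` fields, the grid cell size, the `match` function, and the sample parking data are all hypothetical and were chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import floor


@dataclass(frozen=True)
class Record:
    """Hypothetical record layout; not taken from the paper."""
    name: str
    lat: float
    lon: float
    start: int      # start of validity interval (hour of day, inclusive)
    end: int        # end of validity interval (hour of day, exclusive)
    category: str   # domain-specific label


def cell(rec, size):
    """Map a record to a coarse grid cell of `size` degrees."""
    return (floor(rec.lat / size), floor(rec.lon / size))


def match(left, right, size=0.01):
    """Return (left.name, right.name) pairs that pass all three levels."""
    # Level 1 (spatial): index the right-hand records by grid cell.
    buckets = defaultdict(list)
    for r in right:
        buckets[cell(r, size)].append(r)

    pairs = []
    for l in left:
        cx, cy = cell(l, size)
        # Only the record's own cell and its 8 neighbours are candidates.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for r in buckets.get((cx + dx, cy + dy), []):
                    # Level 2 (temporal): validity intervals must overlap.
                    if not (l.start < r.end and r.start < l.end):
                        continue
                    # Level 3 (domain): labels must agree.
                    if l.category == r.category:
                        pairs.append((l.name, r.name))
    return pairs


# Assumed sample data: one facility with occupancy readings, three spaces.
facilities = [Record("garage_A", 48.7758, 9.1829, 8, 20, "parking")]
spaces = [
    Record("space_1", 48.7760, 9.1831, 0, 24, "parking"),   # near, overlaps
    Record("space_2", 48.9000, 9.3000, 0, 24, "parking"),   # too far away
    Record("space_3", 48.7759, 9.1830, 22, 24, "parking"),  # no time overlap
]

print(match(facilities, spaces))
```

Because the spatial level discards most candidates before the temporal and domain checks run, the more expensive comparisons are applied only to nearby records. The grid lookup is coarse: it admits any record in a neighbouring cell and does not compute an exact distance.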