GAIDR: An Efficient Time Series Subsets Retrieval Method for Geo-Distributed Astronomical Data

2018 
Time series subsets retrieval is an important problem in astronomical image data analysis, and the original astronomical files are usually stored in geo-distributed data centers. The current data retrieval approach consists of three manual steps: retrieving the original astronomical files, ordering the filtered files, and obtaining the target sub-files. In addition, before the first step astronomers must download or copy entire datasets from the different data centers to their local machines. This approach works only for small amounts of data. Astronomical images, however, are big data, so processing the massive geo-distributed data is time-consuming and requires a significant amount of manual work. In this paper, we propose Geo-Distributed Astronomy Image Data Retrieval (GAIDR), an automatic and efficient subsets retrieval method for astronomical time series data in a geo-distributed environment, and implement a data retrieval system based on it. The retrieval system automatically queries and returns the target astronomical sub-files by using indexes of astronomical files, namely an original files index and a replica index. Moreover, to further accelerate the retrieval of astronomical time series subsets, a replica strategy is designed in the GAIDR method. The replica strategy reduces the size of replica files and merges related files via a replica coordinate-mapping algorithm. It recognizes hot replica files on the basis of history requests and optimizes the data layout of replica files according to the temporal and spatial characteristics of data access in time-domain astronomy research. Experimental results show that the GAIDR method achieves the highest replica hit ratio among all tested methods and reduces the average response time by at least 14.07% compared with the other methods.
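The abstract does not specify how hot replica files are recognized or how the replica hit ratio is computed. As a minimal sketch, assuming that each history request can be reduced to a hypothetical region key (for example, a sky-region identifier) and that "hot" simply means most frequently requested, the two ideas could be illustrated as follows. The function names `hot_regions` and `replica_hit_ratio` and the sample keys are assumptions for illustration, not part of GAIDR.

```python
from collections import Counter


def hot_regions(history, top_k):
    """Return the top_k most frequently requested region keys.

    Assumption: a plain frequency count stands in for the paper's
    (unspecified) hot-replica recognition based on history requests.
    """
    counts = Counter(history)
    # Sort by count descending, then by key, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [key for key, _ in ranked[:top_k]]


def replica_hit_ratio(requests, replicated):
    """Fraction of requests whose region key is covered by a replica."""
    if not requests:
        return 0.0
    hits = sum(1 for key in requests if key in replicated)
    return hits / len(requests)


# Hypothetical request history, keyed by sky-region identifier.
history = ["r1", "r2", "r1", "r3", "r1", "r2"]
hot = hot_regions(history, top_k=2)

# Evaluate a later batch of requests against the chosen replicas.
ratio = replica_hit_ratio(["r1", "r2", "r4", "r1"], set(hot))
```

Here the hit ratio is the share of incoming requests that can be served from a replica rather than from the original files. A real system would also have to account for replica size and the spatial and temporal overlap between requests, which this sketch ignores.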