K-Means Parallel Algorithm of Big Data Clustering Based on Mapreduce PCAM Method

2021 
With the increase in the offshore industry in the Beibu Gulf, data clustering has become an important task of intelligent ocean monitoring. However, the traditional K-means algorithm is not suitable for large-scale marine data. Aiming at the characteristics of marine big data, a parallel K-means algorithm based on MapReduce big data clustering is proposed. First, according to the characteristics of the MapReduce framework, a partition, communication, combination and mapping model is established. A parallel K-means algorithm based on MapReduce big data clustering is then designed, and the execution process of the algorithm is analyzed. Finally, through data and experimental analysis, it is demonstrated that the MR K-means parallel algorithm reduces the time and space complexity and the data point missing rate compared with the traditional algorithm.

Keywords: Clustering, K-means, Parallel, MapReduce, PCAM

Cite As: Y. Li, Z. Yang, K. Han, "K-Means Parallel Algorithm of Big Data Clustering Based on Mapreduce PCAM Method", Engineering Intelligent Systems, vol. 29, no. 6, pp. 411-418, 2021.
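As a rough illustration of how one K-means iteration might decompose into map, combine and reduce steps, the sketch below simulates the phases sequentially in plain Python. It is not the authors' implementation: the function names (`nearest`, `map_combine`, `reduce_centroids`, `kmeans_iteration`) are hypothetical, the partitions are ordinary lists rather than distributed input splits, and a cluster that receives no points is assumed to keep its previous centroid.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_combine(partition, centroids):
    """Map + combine on one partition: per-cluster partial sums and counts."""
    partial = {}
    for point in partition:
        k = nearest(point, centroids)
        s, n = partial.get(k, ([0.0] * len(point), 0))
        partial[k] = ([a + b for a, b in zip(s, point)], n + 1)
    return partial


def reduce_centroids(partials, centroids):
    """Reduce: merge the partial sums of all partitions into new centroids."""
    totals = {}
    for partial in partials:
        for k, (s, n) in partial.items():
            ts, tn = totals.get(k, ([0.0] * len(s), 0))
            totals[k] = ([a + b for a, b in zip(ts, s)], tn + n)
    # Assumption: an empty cluster keeps its previous centroid.
    return [
        [v / totals[k][1] for v in totals[k][0]] if k in totals else list(c)
        for k, c in enumerate(centroids)
    ]


def kmeans_iteration(partitions, centroids):
    """One iteration: map/combine each partition, then reduce globally."""
    partials = [map_combine(p, centroids) for p in partitions]
    return reduce_centroids(partials, centroids)
```

Because each partition only emits one sum and one count per cluster, the data exchanged between the phases is proportional to the number of clusters rather than the number of points, which is the usual motivation for adding a combine step.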