A Fine-grained Privacy-preserving k-means Clustering Algorithm Upon Negative Databases

2019 
Privacy protection has become an important issue in data mining. The k-means algorithm is one of the most classical data mining algorithms, and it has been widely studied in the past decade. The negative database (NDB) is a new type of data representation that protects privacy while still supporting distance estimation, which makes NDBs promising for privacy-preserving k-means clustering. Existing privacy-preserving k-means clustering algorithms based on NDBs can effectively protect data privacy, but their clustering performance degrades non-negligibly. In this paper, we propose a new NDB generation algorithm (named the QK-hidden algorithm) and, based on it, a privacy-preserving k-means algorithm. The proposed algorithm can control the accuracy of distance estimation in a fine-grained manner, and thus it can control the clustering results at a fine granularity. Experimental results demonstrate that the proposed algorithm has better clustering performance than existing privacy-preserving k-means algorithms based on NDBs.