An improved anonymity model for big data security based on clustering algorithm

2017 
Summary The accumulation of massive data has given rise to the concept of big data. The relationships hidden in big data can bring great benefits and have attracted public attention. Meanwhile, the challenges of big data security are more serious than ever. Privacy disclosure is one of the most pressing concerns, and protecting the privacy of big data is more difficult than traditional information protection. Anonymization techniques for data publishing provide privacy protection when data are released. K-anonymity and L-diversity are two such anonymity models. Their main idea is to generalize the values of quasi-identifiers so that the data conform to the model. In this paper, we propose an improved model that integrates K-anonymity with L-diversity and addresses the problem of imbalanced sensitive attribute distribution. The K-member clustering algorithm translates the anonymity problem into a clustering problem and finds a set of equivalence classes in which the records are generalized to the same value. We use the K-member clustering algorithm to implement the improved anonymity model, which reduces algorithm execution time and information loss. The integration of the anonymity model and the clustering algorithm makes the generalization process more efficient, which is particularly important for big data. Copyright © 2016 John Wiley & Sons, Ltd.
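To make the two base models concrete, the following is a minimal sketch of how one might check whether an already generalized table satisfies both K-anonymity and the simplest (distinct) form of L-diversity. It is not the improved model or the K-member clustering procedure proposed in the paper; the function names, the toy table, and the attribute names (`age`, `zip`, `disease`) are assumptions made purely for illustration.

```python
from collections import defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share identical (already generalized) quasi-identifier values."""
    classes = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        classes[key].append(rec)
    return classes


def satisfies_k_and_l(records, quasi_identifiers, sensitive, k, l):
    """Return True if every equivalence class has at least k records
    (K-anonymity) and at least l distinct sensitive values (distinct L-diversity)."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(
        len(group) >= k and len({r[sensitive] for r in group}) >= l
        for group in classes.values()
    )


# Hypothetical toy table; quasi-identifiers are assumed to be generalized already.
table = [
    {"age": "20-29", "zip": "130**", "disease": "flu"},
    {"age": "20-29", "zip": "130**", "disease": "cancer"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
]

k2_l1 = satisfies_k_and_l(table, ["age", "zip"], "disease", k=2, l=1)
k2_l2 = satisfies_k_and_l(table, ["age", "zip"], "disease", k=2, l=2)
```

In this toy table both equivalence classes contain two records, so K-anonymity holds for k = 2, but the second class contains only one distinct sensitive value, so the check fails for l = 2. This is the kind of imbalance in sensitive attribute distribution that a combined model has to account for.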