Achieving Optimal K-Anonymity Parameters for Big Data
2018
Datasets containing private and sensitive information are useful for data analytics. Data owners cautiously release such sensitive data using privacy-preserving publishing techniques. Personal re-identification possibility is much larger than ever before. For instance, social media has dramatically increased the exposure to privacy violation. One well-known technique of k-anonymity proposes a protection approach against privacy exposure. K-anonymity tends to find k equivalent number of data records. The chosen attributes are known as Quasi-identifiers. This approach may reduce the personal re-identification. However, this may lessen the usefulness of information gained. The value of k should be carefully determined, to compromise both security and information gained. Unfortunately, there is no any standard procedure to define the value of k. The problem of the optimal k-anonymization is NP-hard. In this paper, we propose a greedy-based heuristic approach that provides an optimal value for k. The approach evaluates the empirical risk concerning our Sensitivity-Based Anonymization method. Our approach is derived from the fine-grained access and business role anonymization for big data, which forms our framework.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
0
References
0
Citations
NaN
KQI