Steered Microaggregation as a Unified Primitive to Anonymize Data Sets and Data Streams

2019 
As data grow in quantity and complexity, data anonymization is becoming increasingly challenging. On one side, a great diversity of masking methods, synthetic data generation methods, and privacy models exists, and this diversity is often perceived as unsettling by practitioners. On the other side, most of the anonymization methodology was designed for static, structured, and small data, whereas the current landscape includes big data and, in particular, data streams. We explore here a unified and conceptually simple anonymization approach, by presenting a primitive called steered microaggregation that can be tailored to enforce various privacy models on static data sets and also on data streams. Steered microaggregation is based on adding artificial attributes that are properly initialized and weighted in order to guide the microaggregation process into meeting certain desired constraints. To demonstrate the potential of this type of microaggregation, we show how it can be used to achieve $k$-anonymity, $t$-closeness, $l$-diversity, and $\epsilon$-differential privacy in the context of static data sets; furthermore, we discuss how it can be used to achieve $k$-anonymity of data streams while controlling tuple reordering. Beyond its flexibility and theoretical appeal, steered microaggregation can drastically reduce information loss, as shown by our experimental evaluation.
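The idea of steering through a weighted artificial attribute can be illustrated with a minimal sketch. It is not the paper's algorithm: the grouping heuristic is a simplified greedy fixed-size microaggregation, and the names `microaggregate`, `steer`, and `anonymize`, the choice of stream position as the artificial attribute, and the weight values are all assumptions made for illustration. With weight 0 the groups follow the data values only; with a large weight the groups follow stream position, which limits how far tuples would need to be reordered.

```python
from math import dist


def microaggregate(points, k):
    """Greedy fixed-size microaggregation (simplified, illustrative).

    Returns a list of clusters (lists of record indices), each of size >= k.
    """
    remaining = list(range(len(points)))
    dims = len(points[0])
    clusters = []
    while len(remaining) >= 2 * k:
        # Centroid of the records not yet assigned to a cluster.
        centroid = tuple(
            sum(points[i][d] for i in remaining) / len(remaining)
            for d in range(dims)
        )
        # Seed: the record farthest from that centroid.
        seed = max(remaining, key=lambda i: dist(points[i], centroid))
        # Cluster: the seed plus its k-1 nearest remaining neighbours.
        group = sorted(remaining, key=lambda i: dist(points[i], points[seed]))[:k]
        clusters.append(group)
        remaining = [i for i in remaining if i not in group]
    if remaining:
        # Between k and 2k-1 leftover records form the last cluster.
        clusters.append(remaining)
    return clusters


def steer(values, artificial, weight):
    """Append a weighted artificial attribute to each one-dimensional value.

    Assumption: a single artificial attribute, scaled by `weight`.
    """
    return [(v, weight * a) for v, a in zip(values, artificial)]


def anonymize(values, clusters):
    """Replace each value by the mean of its cluster."""
    out = list(values)
    for cluster in clusters:
        mean = sum(values[i] for i in cluster) / len(cluster)
        for i in cluster:
            out[i] = mean
    return out


def as_partition(clusters):
    """Order-independent view of a clustering, for comparison."""
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy stream: values alternate between two levels.
values = [1.0, 5.0, 1.1, 5.1, 1.2, 5.2]
positions = list(range(len(values)))  # artificial attribute: arrival order

plain = microaggregate(steer(values, positions, weight=0.0), k=3)
steered = microaggregate(steer(values, positions, weight=100.0), k=3)

print(as_partition(plain))    # groups by value similarity
print(as_partition(steered))  # groups by arrival order
print(anonymize(values, steered))
```

In this toy run the unsteered clustering mixes early and late tuples, whereas the heavily weighted position attribute keeps each cluster contiguous in arrival order, at the cost of more heterogeneous values inside each cluster. Intermediate weights would trade one property against the other.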