Privacy-utility trade-off under continual observation

2015 
In the online setting, a user continuously releases to a service provider a time series that is correlated with the user's private data, in order to derive some utility. Due to these correlations, continual observation of the time series puts the user at risk of inference attacks against the private data. To protect the user's privacy, the time series is randomized prior to its release according to a probabilistic privacy mapping. This mapping should be designed to balance privacy and utility requirements over time. First, we formalize the framework for the design of utility-aware privacy mappings for time series, under both online and batch models. We introduce two threat models, for which we show that, under the log-loss cost function, the information leakage can be modeled by the mutual information or the directed information, respectively, between the randomized time series and the private data. Second, we prove that the design of the privacy mapping can be cast as a convex optimization. We provide a sequential online scheme that makes it possible to design privacy mappings at scale, accounting for the privacy risk from both the history of released data and future releases. Third, we prove the equivalence of the optimal mappings under the batch and online models in the case of a hidden Markov model. Evaluations on real-world time-series data show that smart-meter data can be randomized to prevent disaggregation of per-device energy consumption, while maintaining the utility of the randomized series.
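The log-loss leakage measure in the abstract can be illustrated on a toy example: for a memoryless privacy mapping that releases a randomized bit Y derived from a private bit S, the leakage is the mutual information I(S; Y). The following minimal sketch is illustrative only (the binary alphabet, the flip probability, and all names are assumptions, not taken from the paper):

```python
from math import log2

def mutual_information(p_joint):
    """I(S; Y) in bits for a joint distribution p_joint[s][y] = Pr(S=s, Y=y).

    Under the log-loss cost function, this quantifies the information an
    adversary gains about the private variable S from the release Y.
    """
    n_s, n_y = len(p_joint), len(p_joint[0])
    p_s = [sum(row) for row in p_joint]                       # marginal of S
    p_y = [sum(p_joint[s][y] for s in range(n_s)) for y in range(n_y)]  # marginal of Y
    return sum(
        p_joint[s][y] * log2(p_joint[s][y] / (p_s[s] * p_y[y]))
        for s in range(n_s)
        for y in range(n_y)
        if p_joint[s][y] > 0
    )

# Hypothetical privacy mapping: release Y by flipping the private bit S
# (uniform prior) with probability `flip`. Larger flip -> less leakage.
flip = 0.3
p_joint = [[0.5 * (flip if y != s else 1 - flip) for y in (0, 1)] for s in (0, 1)]
leak = mutual_information(p_joint)  # about 0.12 bits; 0 bits when flip = 0.5
```

Choosing the flip probability (more generally, the conditional distribution of Y given S) to minimize a utility-distortion cost subject to a leakage constraint is the kind of convex program the abstract refers to.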