Differential Privacy for Data and Model Publishing of Medical Data

2019 
Combining medical data with machine learning has unlocked much of the value of medical data. However, medical data contain a large amount of sensitive information, and inappropriate handling can leak personal privacy. Thus, both publishing medical data and using it to train machine learning models may reveal patients' private information. To address this issue, we propose two effective approaches. The first combines differential privacy with a decision tree (DPDT) to provide strong privacy guarantees for published data: it establishes a weight calculation system based on the classification and regression tree (CART) method and takes the weights as a new element of differential privacy, so that they participate in privacy protection while reducing the negative impact of differential privacy on data availability. The second uses a differentially private mini-batch gradient descent algorithm (DPMB) to provide strong protection for training data: it tracks the privacy loss and makes the model satisfy differential privacy during gradient descent, preventing attackers from recovering personal information from the training data. Notably, in this paper we adopt the data processed by DPDT as the training data for DPMB to further strengthen data privacy.
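
The abstract only outlines the two mechanisms, so the following Python sketches illustrate one plausible reading of each; the function names, parameters, and budget-allocation scheme are illustrative assumptions, not the paper's actual implementation.

For DPDT, a natural interpretation is to derive per-attribute weights from a CART's feature importances, allocate the total privacy budget across attributes in proportion to those weights, and perturb each attribute with Laplace noise scaled to its share, so that the attributes most important for classification receive the least noise:

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def dpdt_publish(X, y, epsilon=1.0):
        # Hypothetical sketch: X is a numeric NumPy array, y the class labels.
        cart = DecisionTreeClassifier().fit(X, y)   # CART in scikit-learn
        weights = cart.feature_importances_ + 1e-6  # avoid zero budgets
        budgets = epsilon * weights / weights.sum() # per-attribute epsilon_j
        X_pub = X.astype(float)
        for j, eps_j in enumerate(budgets):
            # Use the attribute's observed range as its sensitivity (an assumption).
            sensitivity = X[:, j].max() - X[:, j].min()
            X_pub[:, j] += np.random.laplace(0.0, sensitivity / eps_j, size=X.shape[0])
        return X_pub

By sequential composition, the per-attribute budgets sum to epsilon, so under the assumed sensitivities the published table satisfies epsilon-differential privacy (ignoring the budget needed to fit the CART itself, which a full treatment would also account for).

For DPMB, the description (tracking privacy loss while making gradient descent differentially private) matches the standard clip-and-noise recipe: bound each example's gradient norm, add Gaussian noise to the summed batch gradient, and account for the privacy loss of every step. A minimal logistic-regression version, again with assumed names and parameters:

    import numpy as np

    def dpmb_train(X, y, epochs=10, batch_size=32, lr=0.1,
                   clip_norm=1.0, noise_multiplier=1.1, rng=None):
        # Hypothetical sketch of differentially private mini-batch
        # gradient descent for logistic regression.
        rng = rng or np.random.default_rng()
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(epochs):
            order = rng.permutation(n)
            for start in range(0, n, batch_size):
                batch = order[start:start + batch_size]
                grads = []
                for i in batch:
                    p = 1.0 / (1.0 + np.exp(-X[i] @ w))
                    g = (p - y[i]) * X[i]  # per-example gradient
                    # Clip to bound any one patient's influence on the update.
                    g *= min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
                    grads.append(g)
                g_sum = np.sum(grads, axis=0)
                # Gaussian noise calibrated to the clipping norm.
                g_sum += rng.normal(0.0, noise_multiplier * clip_norm, size=d)
                w -= lr * g_sum / len(batch)
        return w

Each noisy step consumes a quantifiable amount of privacy budget, and a full implementation would track the cumulative loss across all steps with an accountant (e.g., the moments accountant); this sketch omits that bookkeeping. Chaining the two methods as the paper proposes then amounts to training on the perturbed data, e.g. w = dpmb_train(dpdt_publish(X, y), y).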