Towards Privacy-Preserving Domain Adaptation

2020 
This study proposes a new domain adaptation paradigm that addresses potential data-privacy issues. Despite the promising results of existing domain adaptation methods, they share a strong constraint: both source and target samples must be accessible during training. However, direct use of source samples can raise data-privacy issues, especially when the source labels identify individuals, as with biometric information. To address the data-privacy problems of conventional domain adaptation, we propose privacy-preserving domain adaptation (PPDA). Our main hypothesis is that if we train a target model, initialized from a pre-trained source model, in a self-learning manner, we can successfully transfer knowledge from a labeled source domain to an unlabeled target domain. In a preliminary study, we observe that target samples with low self-entropy, measured by the pre-trained source model, achieve sufficiently high accuracy. Building on this observation, we first select reliable samples based on self-entropy and define them as class prototypes. We then assign pseudo labels to the target samples according to their similarity to the class prototypes. To further reduce the uncertainty of the pseudo-labeling process, we also introduce a sample-level reweighting scheme. Surprisingly, our PPDA model outperforms conventional domain adaptation methods on public datasets even though it never directly accesses any source data.
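The abstract does not give the exact losses or selection rules, but the described pipeline (self-entropy-based selection, prototype construction, similarity-based pseudo-labeling, and sample reweighting) can be illustrated with a minimal PyTorch sketch. All function names, the `keep_ratio` parameter, and the specific entropy-based weight below are illustrative assumptions, not the paper's definitive implementation.

```python
import math
import torch
import torch.nn.functional as F

def self_entropy(logits):
    # Per-sample self-entropy H(p) = -sum_c p_c log p_c of the softmax output.
    p = F.softmax(logits, dim=1)
    return -(p * p.log().clamp(min=-1e4)).sum(dim=1)  # clamp avoids 0 * -inf

@torch.no_grad()
def build_prototypes(features, logits, num_classes, keep_ratio=0.1):
    """Keep the lowest-entropy target samples per predicted class (assumed
    reliable) and average their features into one prototype per class."""
    ent = self_entropy(logits)
    preds = logits.argmax(dim=1)
    protos = torch.zeros(num_classes, features.size(1))
    for c in range(num_classes):
        idx = (preds == c).nonzero(as_tuple=True)[0]
        if idx.numel() == 0:
            continue  # no target sample predicted as class c
        k = max(1, int(keep_ratio * idx.numel()))
        reliable = idx[ent[idx].topk(k, largest=False).indices]
        protos[c] = features[reliable].mean(dim=0)
    return F.normalize(protos, dim=1)

@torch.no_grad()
def pseudo_labels_and_weights(features, protos):
    """Pseudo-label each target sample by its most similar prototype (cosine
    similarity) and down-weight uncertain samples: the weight shrinks as the
    entropy of the similarity distribution approaches its maximum, log C."""
    sim = F.normalize(features, dim=1) @ protos.t()   # (N, C) cosine scores
    labels = sim.argmax(dim=1)
    ent = self_entropy(sim)
    weights = (1.0 - ent / math.log(protos.size(0))).clamp(min=0.0)
    return labels, weights

# One weighted self-training step on the target model (no source data used):
# labels, w = pseudo_labels_and_weights(target_features, protos)
# loss = (w * F.cross_entropy(target_logits, labels, reduction="none")).mean()
```

Note that every quantity above is computed from the pre-trained source model's outputs on target data alone, which is the point of the setting: adaptation proceeds without ever touching the source samples.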