Interactive Knowledge Distillation for Image Classification

2021 
Abstract

Knowledge distillation (KD) is a standard teacher-student learning framework for training a lightweight student network under the guidance of a well-trained, large teacher network. Interactive teaching is an effective strategy widely employed in schools to motivate students: teachers not only impart knowledge but also give students constructive feedback on their responses, improving their learning performance. In this work, we propose Interactive Knowledge Distillation (IAKD) to leverage this interactive teaching strategy for efficient knowledge distillation. During distillation, the interaction between the teacher network and the student network is implemented by swapping-in operations: randomly replacing blocks in the student network with the corresponding blocks in the teacher network. In this way, the teacher's powerful feature transformation ability is directly involved in boosting the performance of the student network. Experiments with typical teacher-student network settings demonstrate that student networks trained with IAKD achieve better performance than those trained with conventional knowledge distillation methods on diverse image classification datasets.
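To make the swapping-in operation concrete, below is a minimal PyTorch-style sketch of the idea as described in the abstract, not the authors' released implementation. It assumes the teacher and student are split into the same number of blocks with matching input/output shapes (e.g., aligned ResNet stages); the names `SwapInForward`, `teacher_blocks`, `student_blocks`, and `swap_prob` are illustrative placeholders.

```python
import random
import torch.nn as nn

class SwapInForward(nn.Module):
    """Sketch of IAKD-style swapping-in: during training, each student
    block is randomly replaced by its teacher counterpart. Assumes
    block-aligned teacher/student architectures; names are illustrative."""

    def __init__(self, teacher_blocks, student_blocks, swap_prob=0.5):
        super().__init__()
        assert len(teacher_blocks) == len(student_blocks)
        self.teacher_blocks = nn.ModuleList(teacher_blocks)
        self.student_blocks = nn.ModuleList(student_blocks)
        self.swap_prob = swap_prob
        # The teacher is already well trained; freeze its parameters.
        for p in self.teacher_blocks.parameters():
            p.requires_grad_(False)

    def forward(self, x):
        for t_blk, s_blk in zip(self.teacher_blocks, self.student_blocks):
            # During training, randomly route features through the
            # frozen teacher block instead of the student block.
            if self.training and random.random() < self.swap_prob:
                x = t_blk(x)
            else:
                x = s_blk(x)
        return x
```

In this sketch, calling `model.eval()` disables swapping, so at inference the features pass through student blocks only; during training, a frozen teacher block still propagates gradients to the student blocks preceding it, which is one plausible reading of how the teacher's feature transformations guide the student.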