A Fast Knowledge Distillation Framework for Visual Recognition
2021
While Knowledge Distillation (KD) has been recognized as a useful tool in
many visual tasks, such as supervised classification and self-supervised
representation learning, the main drawback of a vanilla KD framework lies in
its mechanism: most of the computational overhead is spent on forwarding
through the giant teacher networks, which makes the entire learning procedure
inefficient and costly. ReLabel, a recently proposed solution, suggests
creating a label map for the entire image. During training, it obtains the
cropped region-level labels by RoI align on the pre-generated label map,
allowing efficient supervision generation without passing through the
teachers many times. However, because the KD teachers come from conventional
multi-crop training, this technique suffers from various mismatches between
the global label map and the region-level labels, resulting in performance
degradation.
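
A minimal sketch of the ReLabel-style label retrieval described above, using
torchvision's RoI align. The label-map size, crop box, and the final softmax
step are illustrative assumptions, not the authors' exact pipeline:

```python
import torch
from torchvision.ops import roi_align

# Pre-generated global label map for one image: (1, C, H, W), where C is the
# number of classes (e.g. 1000 for ImageNet-1K) and the spatial grid stores
# dense teacher predictions over the image plane (size here is illustrative).
label_map = torch.randn(1, 1000, 15, 15)

# One random crop in (x1, y1, x2, y2) label-map coordinates, prefixed with
# the batch index that roi_align expects in its box tensor.
crop_box = torch.tensor([[0.0, 2.0, 3.0, 10.0, 12.0]])

# Pool the crop's region of the label map into a single vector, then apply
# softmax -- exactly the kind of post-processing that FKD avoids at train time.
region_logits = roi_align(label_map, crop_box, output_size=(1, 1))
soft_label = region_logits.flatten(1).softmax(dim=1)  # shape: (1, 1000)
```
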
In this study, we present a Fast Knowledge Distillation (FKD) framework that
replicates the distillation training phase and generates soft labels
following the multi-crop KD approach, while training faster than ReLabel
since no post-processing such as RoI align or softmax is required. When
multiple crops are sampled from the same image during data loading, our FKD
is even more efficient than the traditional image classification framework.
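
A minimal sketch of the FKD idea under stated assumptions: soft labels are
pre-computed once per crop by the teacher and stored, so training only loads
crops together with their ready-to-use soft labels and never forwards the
teacher. The helper names (`generate_soft_labels`, `crop_fn`, `labels.pt`)
are hypothetical, not the released implementation:

```python
import torch

@torch.no_grad()
def generate_soft_labels(teacher, image, num_crops, crop_fn):
    """Phase 1 (offline): run the teacher once per crop and record each
    crop's parameters together with its soft label."""
    teacher.eval()
    records = []
    for _ in range(num_crops):
        crop, coords = crop_fn(image)  # random resized crop and its coordinates
        soft = teacher(crop.unsqueeze(0)).softmax(dim=1).squeeze(0)
        records.append((coords, soft))
    return records  # persisted to disk, e.g. torch.save(records, "labels.pt")

def train_step(student, crops, soft_labels, optimizer):
    """Phase 2 (training): the data loader supplies crops with their stored
    soft labels, so no teacher forward pass is needed."""
    logits = student(crops)
    # cross_entropy accepts soft (probability) targets in PyTorch >= 1.10
    loss = torch.nn.functional.cross_entropy(logits, soft_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss
```

The design choice this illustrates is a trade of storage for compute: the
teacher's predictions are paid for once and reused, which is where the
claimed speedup over both vanilla KD and ReLabel comes from.
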
On ImageNet-1K, we obtain 79.8% top-1 accuracy with ResNet-50, outperforming
ReLabel by ~1.0% while being faster. We also show that FKD has an efficiency
advantage on the self-supervised learning task. Our project page:
http://zhiqiangshen.com/projects/FKD/index.html; source code and models are
available at: https://github.com/szq0214/FKD.