Contrastive Out-of-Distribution Detection for Pretrained Transformers.

2021 
Pretrained transformers achieve remarkable performance when the test data follows the same distribution as the training data. In real-world NLU tasks, however, the model often encounters out-of-distribution (OoD) instances. Such instances introduce a severe semantic shift at inference time, so the model should identify and reject them. In this paper, we study the OoD detection problem for pretrained transformers using only in-distribution data during training. We observe that OoD instances can be detected using the Mahalanobis distance computed on penultimate-layer representations. We further propose a contrastive loss that improves the compactness of these representations, so that OoD instances can be better separated from in-distribution ones. Experiments on the GLUE benchmark demonstrate the effectiveness of the proposed methods.
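Below is a minimal sketch of the Mahalanobis-distance scoring idea described in the abstract: fit per-class means and a shared covariance on in-distribution penultimate-layer features, then score a new input by its minimum squared Mahalanobis distance to any class mean. The feature arrays and dimensions here are placeholder stand-ins, and the contrastive training objective is not shown; this is not the authors' implementation.

```python
# Sketch (assumed, not the paper's code): Mahalanobis-distance OoD scoring
# on penultimate-layer features of a pretrained transformer.
import numpy as np


def fit_mahalanobis(features: np.ndarray, labels: np.ndarray):
    """Fit per-class means and a shared (tied) covariance on ID features."""
    classes = np.unique(labels)
    means = {c: features[labels == c].mean(axis=0) for c in classes}
    centered = np.concatenate(
        [features[labels == c] - means[c] for c in classes], axis=0
    )
    cov = centered.T @ centered / len(features)
    precision = np.linalg.pinv(cov)  # pseudo-inverse for numerical stability
    return means, precision


def ood_score(x: np.ndarray, means, precision) -> float:
    """Minimum squared Mahalanobis distance over class means.

    Larger scores mean the input is farther from every in-distribution
    class and is therefore more likely to be OoD.
    """
    dists = [float((x - mu) @ precision @ (x - mu)) for mu in means.values()]
    return min(dists)


# Toy usage with random stand-in features (hidden size 8, 3 classes).
rng = np.random.default_rng(0)
id_feats = rng.normal(size=(300, 8))
id_labels = rng.integers(0, 3, size=300)
means, precision = fit_mahalanobis(id_feats, id_labels)
print(ood_score(rng.normal(size=8), means, precision))
```

In practice the features would come from the transformer's penultimate layer on the in-distribution training set, and a threshold on the score would decide whether to reject an input.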