Loss-based Active Learning for Named Entity Recognition

Le Thai Linh, Minh-Tien Nguyen, Guido Zuccon, Gianluca Demartini

2021

This paper addresses the practical problem of scarce training data when building named entity recognition (NER) systems. To this end, we introduce a new active learning method that reduces the number of training samples required by the underlying NER system. Unlike prior work, which focuses only on the training data, we define a new loss function that, when estimating the loss and uncertainty scores of training samples for selection, also takes into account the uncertainty of the $K$ unlabelled test instances most similar to the unlabelled training instances. Experimental results on both general-domain and clinical benchmark datasets show that the proposed active learning method trains the NER system with 5% to 7% less training data than state-of-the-art uncertainty sampling methods, while retaining high NER effectiveness.
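The selection idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's loss function, which the abstract does not specify. It assumes that each instance already has an embedding vector and a scalar uncertainty score, that similarity is cosine similarity, and that a sample's own uncertainty is blended linearly with the mean uncertainty of its K nearest test instances. The function names and the `weight` parameter are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_for_labelling(train_vecs, train_unc, test_vecs, test_unc,
                         k=2, budget=1, weight=0.5):
    """Rank unlabelled training samples for annotation.

    Assumption (not from the paper): the score of a training sample is
    (1 - weight) * its own uncertainty
    + weight * mean uncertainty of its k most similar test instances.
    Returns the indices of the `budget` highest-scoring samples and all scores.
    """
    scores = []
    for vec, own_unc in zip(train_vecs, train_unc):
        # Indices of the k test instances most similar to this training sample.
        nearest = sorted(range(len(test_vecs)),
                         key=lambda j: cosine(vec, test_vecs[j]),
                         reverse=True)[:k]
        neighbour_unc = (sum(test_unc[j] for j in nearest) / len(nearest)
                         if nearest else 0.0)
        scores.append((1 - weight) * own_unc + weight * neighbour_unc)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget], scores


# Toy data: two training samples with equal uncertainty of their own.
train_vecs = [[1.0, 0.0], [0.0, 1.0]]
train_unc = [0.5, 0.5]
# The test instance closest to sample 0 is the one the model is unsure about.
test_vecs = [[1.0, 0.1], [0.1, 1.0]]
test_unc = [0.9, 0.1]

picked, scores = select_for_labelling(train_vecs, train_unc,
                                      test_vecs, test_unc, k=1, budget=1)
print(picked)
```

In this toy setup both training samples are equally uncertain on their own, so the ranking is decided entirely by the uncertainty of their nearest test instances. That is the effect the abstract attributes to looking beyond the training data.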
