Word sense disambiguation by learning from unlabeled data

Seong-Bae Park, Byoung-Tak Zhang, Yung Taek Kim

2000

Most corpus-based approaches to natural language processing suffer from a lack of training data, because acquiring a large amount of labeled data is expensive. This paper describes a learning method that exploits unlabeled data to tackle the data sparseness problem. The method uses committee learning to predict the labels of unlabeled data, and the newly labeled examples then augment the existing training data. Our experiments on word sense disambiguation show that predictive accuracy is significantly improved by using additional unlabeled data.
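The idea of letting a committee label unlabeled examples can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the base learner, the committee size, or the acceptance rule, so the word-count classifier, the bootstrap-sampled committee, the `min_agree` threshold, and all function names below are assumptions made only for illustration.

```python
import random
from collections import Counter, defaultdict

def train(examples):
    """Count how often each context word co-occurs with each sense."""
    counts = defaultdict(Counter)
    for words, sense in examples:
        for w in words:
            counts[w][sense] += 1
    return counts

def predict(model, words):
    """Return the sense with the highest summed count, or None if no word is known."""
    score = Counter()
    for w in words:
        score.update(model.get(w, {}))
    return score.most_common(1)[0][0] if score else None

def committee_label(labeled, unlabeled, n_members=5, min_agree=1.0, seed=0):
    """Augment `labeled` with unlabeled contexts the committee agrees on.

    Assumption: each member is trained on a bootstrap sample of the labeled
    data, and an example is accepted only if at least `min_agree` of the
    members vote for the same sense.
    """
    rng = random.Random(seed)
    members = [train([rng.choice(labeled) for _ in labeled])
               for _ in range(n_members)]
    augmented = list(labeled)
    for words in unlabeled:
        predictions = (predict(m, words) for m in members)
        votes = Counter(p for p in predictions if p is not None)
        if not votes:
            continue  # no member recognizes any context word
        sense, n = votes.most_common(1)[0]
        if n / n_members >= min_agree:
            augmented.append((words, sense))
    return augmented

# Toy data for the ambiguous word "bank" (hypothetical contexts).
labeled = [
    (["river", "water"], "shore"),
    (["water", "boat"], "shore"),
    (["money", "loan"], "finance"),
    (["loan", "account"], "finance"),
]
unlabeled = [["water", "fishing"], ["loan", "interest"], ["zebra"]]
augmented = committee_label(labeled, unlabeled)
```

Requiring full agreement (`min_agree=1.0`) keeps the added labels conservative; lowering it trades label quality for more added examples. A final classifier would then be retrained on `augmented`.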

Keywords:

Semi-supervised learning
Word-sense disambiguation
Training set
Artificial intelligence
Machine learning
Natural language processing
Computer science
Labeled data
Unsupervised learning
Augment
SemEval
Pattern recognition
Learning methods
Exploit
