XED: A Multilingual Dataset for Sentiment Analysis and Emotion Detection.

Emily Öhman, Marc Pàmies, Kaisla Kajava, Jörg Tiedemann

2020

We introduce XED, a multilingual fine-grained emotion dataset. The dataset consists of human-annotated Finnish (25k) and English (30k) sentences, as well as projected annotations for 30 additional languages, providing new resources for many low-resource languages. The sentences are annotated with Plutchik's core emotions plus a neutral class, yielding a multilabel multiclass dataset. We evaluate the dataset with language-specific BERT models and SVMs and show that XED performs on par with similar datasets, making it a useful resource for sentiment analysis and emotion detection.
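To make the multilabel multiclass setup concrete, the following is a minimal sketch of how a sentence carrying several emotion labels could be encoded as a multi-hot vector. It assumes Plutchik's eight core emotions plus neutral as the label set; the label order, the `to_multi_hot` helper, and the example sentences are illustrative assumptions and do not reflect the actual file format or label indices of XED.

```python
# Plutchik's eight core emotions plus the added neutral class.
# The ordering here is an assumption made for illustration only.
LABELS = [
    "anger", "anticipation", "disgust", "fear",
    "joy", "sadness", "surprise", "trust", "neutral",
]


def to_multi_hot(emotions):
    """Encode a collection of emotion names as a 0/1 vector over LABELS."""
    emotions = set(emotions)
    unknown = emotions - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if label in emotions else 0 for label in LABELS]


# Hypothetical annotated sentences (not taken from the dataset).
examples = [
    ("I can't believe we won!", {"joy", "surprise"}),
    ("The train leaves at noon.", {"neutral"}),
]

for text, labels in examples:
    print(text, to_multi_hot(labels))
```

Because a sentence may carry more than one emotion, each position in the vector is an independent yes/no decision, which is what distinguishes this setup from single-label classification.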

Keywords:

Natural language processing
Support vector machine
Artificial intelligence
Emotion detection
Sentiment analysis
Computer science
