Efficient Creation of Japanese Tweet Emotion Dataset Using Sentence-Final Expressions

2021 
Emotion recognition in natural language text is a critical technology for human-computer interfaces in a wide range of fields, including health and well-being, and labeled data plays a significant role in developing such technology. This paper presents a method for efficiently collecting Japanese tweets that express the writer's own (first-person) emotion, using emotional expressions and sentence-final expressions. By exploiting sentence-final expressions, we can identify the targeted tweets even though, in Japanese, the subjects of sentences are often omitted and first-person pronouns are often not stated explicitly. By applying the method to Japanese tweet data, we constructed a Japanese tweet dataset comprising 2,234 tweets labeled with emotion types and intensities for two types of emotions: joy and sadness. The evaluation results show that the proposed method can improve both the collection efficiency of targeted tweets and the reliability of data labels. We developed classifiers from the dataset that recognize emotion intensities. We show that a classifier using a deep learning-based language model outperforms conventional baseline methods using a Bag of Words model, and that the Japanese tweet emotion dataset constructed by our method is useful for emotion intensity recognition.
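The collection idea described above, pairing an emotional expression with a sentence-final expression, can be illustrated with a minimal sketch. The word lists, the sentence-final particles, and the names `EMOTION_WORDS`, `SENTENCE_FINAL`, and `label_tweet` are all hypothetical choices made for illustration; the abstract does not specify the expressions or matching rules actually used.

```python
import re

# Hypothetical lexicons for illustration only (not the paper's actual lists).
EMOTION_WORDS = {
    "joy": ["嬉しい", "楽しい"],
    "sadness": ["悲しい", "寂しい"],
}
# Hypothetical sentence-final expressions that tend to signal the writer's own feeling.
SENTENCE_FINAL = ["なあ", "な", "よ", "わ"]


def build_pattern(words, endings):
    """Match an emotion word directly followed by a sentence-final
    expression at the end of the tweet (optional trailing punctuation)."""
    word_alt = "|".join(re.escape(w) for w in words)
    # Try longer endings first so that "なあ" is preferred over "な".
    end_alt = "|".join(
        re.escape(e) for e in sorted(endings, key=len, reverse=True)
    )
    return re.compile(rf"(?:{word_alt})(?:{end_alt})[。!!\s]*$")


PATTERNS = {
    emotion: build_pattern(words, SENTENCE_FINAL)
    for emotion, words in EMOTION_WORDS.items()
}


def label_tweet(text):
    """Return the emotion whose pattern matches the tweet, else None."""
    for emotion, pattern in PATTERNS.items():
        if pattern.search(text):
            return emotion
    return None


def collect(tweets):
    """Keep only tweets that match, paired with the matched emotion."""
    labeled = ((t, label_tweet(t)) for t in tweets)
    return [(t, e) for t, e in labeled if e is not None]
```

Under these assumptions, a tweet such as "今日は本当に嬉しいなあ" would be kept as a joy candidate, while "彼は嬉しいらしい" (a report about someone else's feeling) would be skipped because the emotion word is not followed by one of the listed sentence-final expressions. Intensity labels are outside the scope of this sketch.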