Hate Speech Detection in Code-switched Text Messages
2019
Hate speech is not confined to America; it occurs in Asia, in Africa, and all over the world. The exponential growth of user-generated social media content bordering on hate speech is increasingly alarming. Social media companies and the research community are making ongoing efforts to monitor this phenomenon, with varying degrees of success. One gap in previous studies that this study addresses is the identification of hate speech in code-switched text messages. Code-switching, the alternation of words from different languages within a single message, is common among multilingual persons and communities. The study explored the performance of different features across various machine learning algorithms and established that character-level Term Frequency-Inverse Document Frequency (TF-IDF) features performed best on a code-switched dataset of 25k annotated tweets when used with a support vector machine, compared with six other conventional and two deep learning algorithms.
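To illustrate why character-level TF-IDF can suit code-switched text, the following is a minimal sketch of the feature extraction step only, written in plain Python. It is not the study's implementation: the n-gram range (2 to 4), the smoothed IDF formula, the helper names (`char_ngrams`, `tfidf_vectors`, `cosine`), and the toy messages are all assumptions made for illustration, and no classifier is shown.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=2, n_max=4):
    """Return every character n-gram of length n_min..n_max (lowercased)."""
    text = text.lower()
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


def tfidf_vectors(docs, n_min=2, n_max=4):
    """Map each document to an L2-normalised {ngram: tf-idf weight} dict."""
    counts = [Counter(char_ngrams(d, n_min, n_max)) for d in docs]
    # Document frequency: in how many documents each n-gram appears.
    doc_freq = Counter(gram for c in counts for gram in c)
    n_docs = len(docs)
    vectors = []
    for c in counts:
        # Assumed smoothed IDF: log((1 + N) / (1 + df)) + 1.
        vec = {gram: tf * (math.log((1 + n_docs) / (1 + doc_freq[gram])) + 1)
               for gram, tf in c.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({gram: w / norm for gram, w in vec.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two L2-normalised sparse vectors."""
    return sum(w * v.get(gram, 0.0) for gram, w in u.items())


# Invented toy messages (not from the study's dataset). The first two mix
# languages and differ in spelling; the third is unrelated.
messages = [
    "hawa watu are so annoying",
    "hawa watu ni annoyin sana",
    "nice weather today",
]
vectors = tfidf_vectors(messages)
print("similar pair:  ", round(cosine(vectors[0], vectors[1]), 3))
print("unrelated pair:", round(cosine(vectors[0], vectors[2]), 3))
```

Because the features are substrings rather than whole words, the two mixed-language messages still share many n-grams despite the spelling variation, which is one plausible reason such features are robust on informal code-switched text. In practice these vectors would be passed to a classifier such as a support vector machine.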