Understanding Multilingual Communities through Analysis of Code-switching Behaviors in Social Media Discussions

2019 
The enormous scale of social media usage, while providing valuable resources for the analysis of linguistic behavior, makes tracking and understanding multilingual discussions a challenging task. We have undertaken a comprehensive, multidisciplinary study of multilingual discussions by developing specialized data collection techniques that discover and track multilingual social media users, and their associated discussions, within a defined geographical region. To facilitate automatic analysis of large numbers of discussions, we built a machine learning model based on ground truth data obtained from Amazon Mechanical Turk. Our approach goes beyond analyzing social media posts in isolation by analyzing them in the context of the discussion in which they appear. We present a selection of example discussions found using our approach that reveal a number of interesting sociolinguistic interactions in the communities we sampled, in support of our approach as a general methodology for multilingual community analysis.