Using K-Means Clustering Method with Doc2Vec to Understand the Twitter Users’ Opinions on COVID-19 Vaccination

2021 
In this paper, document feature representations coupled with K-Means-based clustering methods are investigated to extract public perceptions of the COVID-19 vaccine from the social media platform Twitter. First, the Paragraph Vector Distributed Bag of Words (PV-DBOW) model was adopted to learn an effective fixed-length vector representation of each downloaded tweet. Second, the resulting document embeddings were fed into several K-Means-based clustering methods, which partition the tweets into topics by measuring each embedding's distance to the cluster centroids. Several internal validation measures were then used to evaluate the quality of the identified clustering structure and to select the optimal number of clusters. Our analysis identified four topics in the downloaded tweets: (1) national vaccine rollout, (2) vaccine and its relation to death, (3) vaccine approval and (4) vaccine hesitancy.
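The clustering and validation steps described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the authors' implementation: the hypothetical helper `toy_embeddings` generates 2-D points that stand in for PV-DBOW document vectors, the silhouette coefficient is used as one common example of an internal validation measure (the abstract does not say which measures were used), and farthest-first seeding is swapped in purely to make the toy run reproducible.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-Means on tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: farthest-first seeding, chosen only for reproducibility.
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(iters):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return centroids, labels


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (range -1 to 1)."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)  # singleton cluster or a single cluster overall
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


def toy_embeddings(seed=1):
    """Hypothetical stand-in for document vectors: four separated 2-D blobs."""
    rng = random.Random(seed)
    centers = [(0, 0), (10, 0), (0, 10), (10, 10)]
    return [
        (cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
        for cx, cy in centers
        for _ in range(15)
    ]


def best_k(points, candidates=range(2, 7)):
    """Score each candidate k and return (k with highest silhouette, scores)."""
    scores = {k: silhouette(points, kmeans(points, k)[1]) for k in candidates}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    k, scores = best_k(toy_embeddings())
    for cand, score in scores.items():
        print(f"k={cand}: silhouette={score:.3f}")
    print("selected k:", k)
```

With real data, the toy points would be replaced by the learned tweet embeddings, and the same loop over candidate values of k would be repeated for each validation measure of interest. Because the toy blobs are well separated, the silhouette score here favours four clusters; real tweet embeddings are unlikely to separate this cleanly, which is why comparing several measures is useful.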