Twitter as research data: Tools, costs, skillsets, and lessons learned

2021 
Scholars increasingly use Twitter data to study life sciences and politics. However, Twitter data collection tools often pose challenges for scholars who are unfamiliar with their operation. Equally important, although many tools indicate that they offer representative samples of the full Twitter archive, little is known about whether these samples are in fact representative of the targeted population of tweets. This article evaluates such tools in terms of costs, training, and data quality as a means of introducing Twitter data as a research tool. Further, using an analysis of COVID-19 and Moral Foundations Theory as an example, we compare the distributions of moral discussions obtained from two commonly used tools for accessing Twitter data (i.e., Twitter's standard APIs and third-party access) to the ground truth, the Twitter full archive. Our results highlight the importance of assessing the comparability of different data sources to improve confidence in findings based on Twitter data. We also briefly describe the major new features introduced with Twitter's move to API version 2. © 2021 Lippincott Williams and Wilkins. All rights reserved.