Forecasting the Chilean Electoral Year: Using Twitter to Predict the Presidential Elections of 2017

2018 
Failures of traditional survey methods in measuring political climate and forecasting high-impact events such as elections offer an opportunity to seek alternative methods. The analysis of social networks with computational linguistic methods has proved useful as an alternative, but several studies in this area were conducted after the event (post hoc). In Chile, 2017 was the election year for the 2018–2022 period, and three elections were held during that year. This makes it a good setting for a case study on forecasting these elections with social media as the main source of data. This paper describes the implementation of multiple supervised machine learning algorithms for political sentiment analysis, used to predict the outcome of each election from Twitter data. These algorithms are Decision Trees, AdaBoost, Random Forest, Linear Support Vector Machines and ensemble voting classifiers. To train the algorithms, experts manually annotate a training set, labeling the pragmatic sentiment of tweets that mention a candidate's account or name. A predictive set is then collected days before the election and classified automatically. Finally, the distribution of votes for each candidate is obtained from the positive-sentiment tweets in this classified set. Ultimately, an accurate prediction was achieved using an ensemble voting classifier, with a Mean Absolute Error of \(0.51\%\) for the second round.
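The last two steps of the pipeline described above (turning classified tweets into a vote distribution, then scoring it with Mean Absolute Error) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each candidate's predicted share is simply their fraction of all positive tweets, and the function names, candidate labels "A" and "B", tweet counts and "actual" percentages are hypothetical toy values, not data from the study.

```python
from collections import Counter


def vote_shares(classified_tweets):
    """Return each candidate's percentage share of the positive tweets.

    classified_tweets: iterable of (candidate, label) pairs, where label is
    the sentiment assigned by a classifier (assumed here to be a string).
    """
    positives = Counter(
        candidate for candidate, label in classified_tweets if label == "positive"
    )
    total = sum(positives.values())
    if total == 0:
        raise ValueError("no positive tweets to build a distribution from")
    return {candidate: 100.0 * n / total for candidate, n in positives.items()}


def mean_absolute_error(predicted, actual):
    """Average absolute gap, in percentage points, over the actual candidates."""
    return sum(abs(predicted.get(c, 0.0) - actual[c]) for c in actual) / len(actual)


# Hypothetical toy data: two candidates, already-classified tweets.
tweets = (
    [("A", "positive")] * 6
    + [("B", "positive")] * 4
    + [("A", "negative")] * 3
    + [("B", "neutral")] * 2
)
predicted = vote_shares(tweets)       # only positive tweets count toward shares
actual = {"A": 55.0, "B": 45.0}       # made-up "official" result for illustration
mae = mean_absolute_error(predicted, actual)
print(predicted, mae)
```

Negative and neutral tweets are ignored when building the distribution, which mirrors the abstract's statement that vote shares come from positive sentiment; other normalizations are possible and would change the forecast.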