Hindcasting violent events in Colombia using Internet data

2021 
Colombia experienced a decades-long civil war between the government and multiple left-wing guerrilla groups, marked by violence, kidnappings, and large-scale human displacement. Monitoring and forecasting civil wars is important for mitigating their impact, but it requires access to ground truth data. We examine the use of Internet data streams (Google search queries and politics-related tweets) alongside traditional news sources to retrospectively forecast (i.e., hindcast) state-based armed violence in Colombia. We compare statistical models built on three combinations of these features to evaluate the predictive capability of each data stream. Our results show that models combining Internet and traditional news data perform most consistently, though Internet-only models are surprisingly promising. Overall, we are able to produce high-quality models that hindcast the presence or absence of state-based armed violence in Colombia up to 6 months in advance. These results support the use of exogenous data streams to forecast evolving situations around the globe.
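The hindcasting setup described above can be illustrated with a minimal sketch of the data alignment step only: features observed in month t are paired with the violence label from month t + h, for a horizon h of up to 6 months. This is not the authors' implementation. The column names (`google_searches`, `political_tweets`, `news_articles`), the helper functions, and the exact makeup of the three feature sets (in particular the news-only set) are assumptions made for illustration, and no particular statistical model is implied.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical column names; the abstract does not specify them.
COLUMNS = ["google_searches", "political_tweets", "news_articles"]

# Assumed feature sets. The abstract names a combined and an Internet-only
# model; the news-only set is a guess at the third combination.
FEATURE_SETS: Dict[str, List[str]] = {
    "internet_only": ["google_searches", "political_tweets"],
    "news_only": ["news_articles"],
    "combined": ["google_searches", "political_tweets", "news_articles"],
}


def select_columns(rows: Sequence[Sequence[float]],
                   names: Sequence[str],
                   wanted: Sequence[str]) -> List[List[float]]:
    """Keep only the wanted columns from each monthly feature row."""
    idx = [names.index(w) for w in wanted]
    return [[row[i] for i in idx] for row in rows]


def make_hindcast_pairs(features: Sequence[Sequence[float]],
                        labels: Sequence[int],
                        horizon: int) -> List[Tuple[Sequence[float], int]]:
    """Pair features from month t with the label from month t + horizon.

    labels[t] is 1 if state-based armed violence was present in month t,
    else 0. The last `horizon` months have no target and are dropped.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1 month")
    if len(features) != len(labels):
        raise ValueError("features and labels must cover the same months")
    return [(features[t], labels[t + horizon])
            for t in range(len(labels) - horizon)]


if __name__ == "__main__":
    # Toy monthly data, invented purely to show the shapes involved.
    monthly = [[10, 4, 7], [12, 5, 6], [30, 9, 15], [8, 2, 3], [25, 8, 11]]
    violence = [0, 0, 1, 0, 1]
    for name, cols in FEATURE_SETS.items():
        subset = select_columns(monthly, COLUMNS, cols)
        pairs = make_hindcast_pairs(subset, violence, horizon=2)
        print(name, pairs)
```

Each resulting list of (features, label) pairs could then be passed to any binary classifier and scored on held-out months, which allows the feature sets to be compared at each horizon.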