Too Much Data? Opportunities and Challenges of Large Datasets and Cybercrime

2021 
Never before have criminologists had such rich data about the communications of a wide variety of individuals involved at various stages of crime. We now have records of discussions held between cybercrime offenders going back 20 years. Indeed, given that we now have over 70 million posts by almost two million users, we are encountering a different type of problem: we have too much data. Although the datasets potentially allow us to answer questions we never before thought it possible to answer, we also face unique challenges such as the categorization of large datasets and temporal shifts in users, topics, ideas, and ways of communicating. One answer to this problem may lie in automation: using machine learning to classify and label posts and interactions at scale. In this chapter, we outline some of the opportunities and challenges associated with using such large datasets, some of the ways we are currently addressing these challenges, and potential ways forward.