Man versus Machine: AutoML and Human Experts' Role in Phishing Detection.

2021 
Machine learning (ML) has developed rapidly in the past few years and has successfully been utilized for a broad range of tasks, including phishing detection. However, building an effective ML-based detection system is not a trivial task, and requires data scientists with knowledge of the relevant domain. Automated Machine Learning (AutoML) frameworks have received a lot of attention in recent years, enabling non-ML experts in building a machine learning model. This brings to an intriguing question of whether AutoML can outperform the results achieved by human data scientists. Our paper compares the performances of six well-known, state-of-the-art AutoML frameworks on ten different phishing datasets to see whether AutoML-based models can outperform manually crafted machine learning models. Our results indicate that AutoML-based models are able to outperform manually developed machine learning models in complex classification tasks, specifically in datasets where the features are not quite discriminative, and datasets with overlapping classes or relatively high degrees of non-linearity. Challenges also remain in building a real-world phishing detection system using AutoML frameworks due to the current support only on supervised classification problems, leading to the need for labeled data, and the inability to update the AutoML-based models incrementally. This indicates that experts with knowledge in the domain of phishing and cybersecurity are still essential in the loop of the phishing detection pipeline.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    50
    References
    0
    Citations
    NaN
    KQI
    []