An Explainable Multi-Modal Hierarchical Attention Model for Developing Phishing Threat Intelligence

2021 
Phishing website attacks are among the most persistent cyber threats and continue to evolve. Various detection methods (e.g., lookup systems, fraud cue-based methods) have been proposed to identify phishing websites. The limitations of lookup systems (e.g., failing to address newly created attacks) and of fraud cue-based methods (e.g., relying on feature engineering) motivated the development of deep representation-based methods capable of learning deep fraud cues for enhanced anti-phishing capacity. Because these methods focus mostly on URLs, they fail to analyze the two other important modalities of website content: textual information and visual design. Moreover, the interpretability of these deep learning-based methods is limited, which reduces model trustworthiness and prevents them from yielding relevant and actionable intelligence. We propose a multi-modal hierarchical attention model (MMHAM) that jointly learns deep fraud cues from the three major modalities of website content for phishing website detection. Specifically, MMHAM features an innovative shared dictionary learning approach for aligning representations from different modalities in the attention mechanism. In evaluation experiments, the proposed MMHAM not only learned improved deep cues for enhanced phishing detection but also provided a hierarchical interpretability system from which we could develop phishing threat intelligence to inform phishing website detection at different levels.