UTANSA: Static Approach for Multi-Language Malicious Web Scripts Detection

2021 
To detect malicious web scripts automatically, many detection methods based on static features and machine learning have been proposed. However, existing methods can only detect web scripts written in one specific programming language. This paper proposes the unified text features and abstract syntax tree (AST) node sequence features algorithm (UTANSA), which combines a text feature classification method and an AST node classification method with corresponding unification methods to improve the generalization ability of the model. With this algorithm, two unified approaches are proposed, one based on text features and one based on AST node features, so that a single detection model can detect web scripts in multiple languages. We evaluate the approach on scripts written in JavaScript (JS) and PHP. The results show that a detection model trained with the proposed method achieves detection performance similar to that of models trained with only JS samples or only PHP samples.
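
The abstract does not spell out how the AST node features are unified, so the following is only a minimal sketch of one way such a step could look, not the authors' implementation. It assumes that node type names from each language's parser are mapped onto a shared label set before classification. The JS names follow ESTree conventions and the PHP names follow nikic/PHP-Parser conventions; the unified labels, the `UNIFIED_NODE_TYPES` table, and the `unify_node_sequence` helper are hypothetical.

```python
# Hypothetical mapping from language-specific AST node types to shared labels.
# Node names follow ESTree (JS) and nikic/PHP-Parser (PHP) conventions;
# the unified labels on the right are invented for this illustration.
UNIFIED_NODE_TYPES = {
    "js": {
        "FunctionDeclaration": "FUNC_DECL",
        "CallExpression": "CALL",
        "Literal": "LITERAL",
        "IfStatement": "IF",
    },
    "php": {
        "Stmt_Function": "FUNC_DECL",
        "Expr_FuncCall": "CALL",
        "Scalar_String": "LITERAL",
        "Stmt_If": "IF",
    },
}


def unify_node_sequence(language, node_types):
    """Translate a sequence of parser-specific node types into shared labels.

    Node types without an entry in the mapping fall back to "OTHER".
    """
    mapping = UNIFIED_NODE_TYPES[language]
    return [mapping.get(node_type, "OTHER") for node_type in node_types]


# Structurally similar JS and PHP scripts yield the same unified sequence,
# which a single classifier could then consume.
js_seq = unify_node_sequence(
    "js", ["FunctionDeclaration", "IfStatement", "CallExpression"]
)
php_seq = unify_node_sequence(
    "php", ["Stmt_Function", "Stmt_If", "Expr_FuncCall"]
)
```

Under this assumption, the classifier only ever sees the shared labels, so training samples from both languages contribute to the same feature space.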