Malware Detection Method based on Control Flow Analysis

2019 
With the rapid development of computer technology and the continual evolution of malware, numerous malware detection methods have been proposed, among which detection based on machine learning has become an important research direction. Opcode sequences are often used as features for training and testing machine learning models. However, opcode sequences extracted in linear order from raw disassembly text may not adequately reflect the behavior of executables. To address this problem, this paper proposes a new malware detection method based on the control flow of executables. The method analyzes the control flow of an executable and extracts execution traces at the granularity of individual functions. The execution traces are used to represent the behavior of the executable and are themselves represented as opcode sequences. To improve training efficiency and reduce feature dimensionality, this paper proposes an x86 intermediate code. The opcode sequences are converted into the corresponding intermediate code sequences and vectorized using the Vector Space Model (VSM) for training and testing. Naive Bayes, Support Vector Machine and Random Forest classifiers are implemented for classification. The experimental results show that the Random Forest algorithm performs best, with an accuracy of 90.8% and a model building time of 22 s. Compared with traditional methods based on disassembly text, accuracy is improved by 1.4%; compared with methods using x86 opcodes, efficiency is increased by 62%.
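The conversion and vectorization steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not define the intermediate code or the VSM terms, so everything here is an assumption made for illustration: the opcode-to-category table `INTERMEDIATE`, the helper names `to_intermediate`, `build_vocabulary` and `vectorize`, and the use of n-gram counts as vector components are all hypothetical and are not the authors' implementation.

```python
from collections import Counter

# Hypothetical mapping from x86 opcodes to coarse intermediate-code symbols.
# The paper's actual intermediate code is not given in the abstract.
INTERMEDIATE = {
    "mov": "DATA", "push": "DATA", "pop": "DATA", "lea": "DATA",
    "add": "ARITH", "sub": "ARITH", "imul": "ARITH", "inc": "ARITH",
    "xor": "LOGIC", "and": "LOGIC", "or": "LOGIC",
    "cmp": "CMP", "test": "CMP",
    "jmp": "JUMP", "je": "JUMP", "jne": "JUMP",
    "call": "CALL", "ret": "RET",
}


def to_intermediate(opcodes):
    """Map an opcode sequence to intermediate symbols; unknown opcodes become OTHER."""
    return [INTERMEDIATE.get(op.lower(), "OTHER") for op in opcodes]


def ngrams(seq, n):
    """Return all contiguous n-grams of seq as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def build_vocabulary(sequences, n=1):
    """Assign a vector index to every n-gram seen in the given sequences."""
    grams = sorted({g for seq in sequences for g in ngrams(seq, n)})
    return {g: i for i, g in enumerate(grams)}


def vectorize(seq, vocab, n=1):
    """Vector Space Model: one n-gram count per vocabulary entry."""
    vec = [0] * len(vocab)
    for gram, count in Counter(ngrams(seq, n)).items():
        if gram in vocab:  # n-grams outside the vocabulary are ignored
            vec[vocab[gram]] = count
    return vec


if __name__ == "__main__":
    # Two made-up execution traces, one per function.
    traces = [["push", "mov", "call", "ret"], ["xor", "cmp", "jne", "cpuid"]]
    seqs = [to_intermediate(t) for t in traces]
    vocab = build_vocabulary(seqs, n=1)
    for s in seqs:
        print(s, vectorize(s, vocab, n=1))
```

Collapsing many opcodes into a few symbols shrinks the vocabulary and therefore the vector length, which is the kind of dimensionality reduction the abstract attributes to the intermediate code. The resulting vectors could then be passed to any of the classifiers named above.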