Detecting Compiler Bugs Via a Deep Learning-Based Framework

2022 
Compiler testing is the most widely used way to assure compiler quality. However, because compilers require large numbers of sophisticated test programs as inputs, existing compiler-testing approaches remain limited in their ability to generate test programs that are both syntactically valid and diverse. In this paper, we propose DeepGen, a deep learning-based approach that supports compiler testing by inferring a generative model for compiler inputs. First, DeepGen trains a Transformer-XL model on a large corpus of seed programs and uses the trained model to generate syntactically valid programs. Then, DeepGen adopts a sampling strategy in the inference phase to generate diverse test programs. Finally, DeepGen applies differential testing to the generated programs to discover compiler bugs. We evaluated DeepGen on two popular C++ compilers, GCC and LLVM, and the results confirm the effectiveness of our approach. DeepGen detects 35.29%, 53.33%, and 187.50% more bugs than three existing approaches, namely DeepSmith, DeepFuzz, and Csmith, respectively. In addition, 30.43% of the bugs detected by DeepGen are missed by the other approaches. Furthermore, DeepGen has detected 38 bugs in the latest development versions of GCC and LLVM; 21 of them have been confirmed or fixed by the developers.
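The abstract does not specify which decoding method DeepGen's inference-phase sampling strategy uses, but temperature and top-k sampling over a language model's next-token distribution are common ways to trade syntactic validity against diversity when generating test programs. The sketch below is a hypothetical illustration of that idea; the function name, parameters, and logits are assumptions, not DeepGen's actual implementation.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=None, rng=None):
    """Sample a token index from raw logits with temperature and optional top-k.

    Hypothetical sketch: temperature > 1 flattens the distribution
    (more diverse generated programs), temperature < 1 sharpens it
    (more conservative, likelier-valid programs); top_k restricts
    sampling to the k most probable tokens.
    """
    rng = rng or random.Random(0)
    scaled = [l / temperature for l in logits]
    indices = list(range(len(scaled)))
    if top_k is not None:
        # Keep only the k highest-scoring token indices.
        indices = sorted(indices, key=lambda i: scaled[i], reverse=True)[:top_k]
    # Numerically stable softmax over the kept logits.
    m = max(scaled[i] for i in indices)
    exps = [math.exp(scaled[i] - m) for i in indices]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one index according to the resulting probabilities.
    r = rng.random()
    acc = 0.0
    for idx, p in zip(indices, probs):
        acc += p
        if r <= acc:
            return idx
    return indices[-1]
```

In a generator like the one described, this step would be applied repeatedly, feeding each sampled token back into the model until a complete candidate program is produced for differential testing.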