Cross-project defect prediction based on G-LSTM model

2022 
Cross-project defect prediction (CPDP) is an active research direction in software reliability. Traditional CPDP methods rely on handcrafted features, which cannot capture the semantic and contextual information of programs, and this limits prediction performance. In this paper, we apply techniques from natural language processing (NLP) to address this problem. We first extract token vectors from the abstract syntax trees (ASTs) of the source and target code files, then convert them into numerical vectors with the continuous bag-of-words (CBOW) word embedding algorithm. These vectors are the input to the proposed deep learning model, Generative Adversarial Long-Short Term Memory Neural Networks (G-LSTM). The model integrates a generative adversarial network (GAN) and a bidirectional long short-term memory network (BiLSTM) with an attention mechanism to automatically learn the semantic and contextual features of programs. Specifically, the GAN is used to eliminate the differences in data distribution between source and target projects, and the BiLSTM serves as the feature extraction encoder. We combine five projects from the PROMISE dataset into 20 source-target project pairs and conduct comparison experiments on them. The experimental results demonstrate that our method outperforms several traditional and state-of-the-art CPDP methods in terms of AUC and accuracy (Acc).
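The preprocessing steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the parser or which AST node types are kept, so Python's standard `ast` module and a small, arbitrary set of node types stand in here. The function names (`extract_tokens`, `build_vocab`, `source_target_pairs`) are hypothetical. A plain integer vocabulary replaces the CBOW embedding step, which would need an external library. The sketch also shows why five projects yield 20 source-target pairs: the pairs are ordered, so there are 5 × 4 of them.

```python
import ast
from itertools import permutations

# Assumption: an illustrative choice of control-flow nodes to keep as tokens.
CONTROL_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.Return)


def extract_tokens(source: str) -> list[str]:
    """Collect a token sequence from the AST in depth-first order."""
    tokens: list[str] = []

    def visit(node: ast.AST) -> None:
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            tokens.append(node.name)            # declaration: keep its name
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            tokens.append(node.func.id)         # plain call: keep callee name
        elif isinstance(node, CONTROL_NODES):
            tokens.append(type(node).__name__)  # control flow: keep node type
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return tokens


def build_vocab(token_lists: list[list[str]]) -> dict[str, int]:
    """Map each distinct token to an integer id; 0 is reserved for padding."""
    vocab: dict[str, int] = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab


def source_target_pairs(projects: list[str]) -> list[tuple[str, str]]:
    """Every ordered (source, target) pair of distinct projects."""
    return list(permutations(projects, 2))


# A tiny stand-in for one code file.
SAMPLE = """
def safe_div(a, b):
    if b == 0:
        return 0
    return a / b
"""
```

In a full pipeline, the integer ids would be replaced by CBOW embeddings trained on the token sequences of both the source and the target project, and the resulting vectors would be fed to the BiLSTM encoder.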