Improving Software Bug-Specific Named Entity Recognition with Deep Neural Network

2020 
Abstract. Bug repositories hold a large volume of bug data that contains rich information. Existing studies on bug data mining mainly rely on information retrieval (IR) techniques to search for relevant historical bug reports. These studies essentially treat a bug report as a closed unit and ignore the semantic and structural information within it. Named-entity recognition (NER) is an important task in information extraction (IE). With NER, fine-grained factual information can be extracted comprehensively and organized into structured data, which offers a new way to improve the accessibility of bug information. However, bug NER differs from general NER tasks: bug reports are free-form text in a mixed-language environment studded with code, abbreviations, and software-specific vocabulary. In this paper, we propose DBNER, a deep neural network approach for bug-specific entity recognition that uses a bidirectional long short-term memory (LSTM) network with a Conditional Random Field (CRF) decoding layer. DBNER extracts multiple features from the massive bug data and uses an attention mechanism to improve the consistency of entity tags within bug reports. Experimental results show that the F1-score reaches an average of 91.19%. In addition, in cross-project experiments, DBNER's F1-score reaches an average of 84%.
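To make the CRF decoding step mentioned in the abstract more concrete, the following is a minimal sketch of Viterbi decoding over per-token tag scores. It is not the authors' implementation: the tag set (`O`, `B-API`, `I-API`), the emission scores (standing in for BiLSTM outputs), and the transition scores are all hypothetical values chosen for illustration, and the function name `viterbi_decode` is an assumption.

```python
from typing import Dict, List, Tuple

def viterbi_decode(emissions: List[Dict[str, float]],
                   transitions: Dict[Tuple[str, str], float],
                   tags: List[str]) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain CRF score.

    emissions[t][tag] is the per-token score for `tag` at position t;
    transitions[(prev, cur)] is the score for moving from prev to cur
    (missing pairs default to 0.0).
    """
    if not emissions:
        return []
    # best[t][tag]: best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t-1][tag]: previous tag on that best path
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            scores[cur] = best[-1][prev] + transitions.get((prev, cur), 0.0) + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

# Hypothetical tag set and scores (assumptions, not taken from the paper)
TAGS = ["O", "B-API", "I-API"]
TRANSITIONS = {
    ("O", "I-API"): -10.0,     # an I- tag should not follow O in a BIO scheme
    ("B-API", "I-API"): 1.0,
    ("I-API", "I-API"): 0.5,
}
EMISSIONS = [
    {"O": 2.0, "B-API": 0.1, "I-API": 0.0},
    {"O": 0.5, "B-API": 1.0, "I-API": 1.2},  # I-API is the local maximum here
    {"O": 0.3, "B-API": 0.2, "I-API": 1.5},
]

print(viterbi_decode(EMISSIONS, TRANSITIONS, TAGS))
```

In this toy example, picking the best tag per token independently would label the second token `I-API` directly after `O`, which is an invalid BIO sequence. Scoring whole sequences with transition terms instead yields `O`, `B-API`, `I-API`, which is the kind of tag consistency a CRF layer is meant to enforce.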