Refining Traceability Links between Code and Software Documents

2017 
Recovering traceability links between source code and software documents can be very helpful for software maintenance and software reuse. Existing work has already achieved good results in extracting code elements (classes, methods, etc.) from software documents. However, linking a document to every code element that appears in it produces many noise links. In this paper, we propose an approach that identifies the contextual code elements and the salient code elements in a software document, and then weights the traceability links between source code and the document so that noise links can be filtered effectively. We measure the saliency of each code element in a document with four kinds of document-related features and three kinds of code-related features, and we adopt TransR-based code embedding to evaluate the distance between code elements. In our experiments, we achieve a precision of 70.7% in recognizing salient code elements in StackOverflow answer documents, an improvement of more than 12% over Rigby's work. We also filter about 56.5% to 69.3% of the noise traceability links compared with the RecoDoc approach. This improves the quality of traceability links between source code and related software documents.
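As a rough illustration of how a TransR-style embedding can yield a distance between two code elements, the sketch below projects both element vectors into a relation-specific space and measures how far the translated head lands from the tail. The abstract does not describe the actual embedding setup, so the relation name, dimensions, and vectors here are hypothetical toy values, and the unsquared L2 norm is used for readability.

```python
import math

def project(matrix, vec):
    """Multiply a (k x d) relation matrix by a d-dimensional entity vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def transr_distance(head, tail, relation, matrix):
    """TransR-style distance || M_r h + r - M_r t ||_2 (lower = closer)."""
    h_r = project(matrix, head)   # head entity in relation space
    t_r = project(matrix, tail)   # tail entity in relation space
    return math.sqrt(sum((h + r - t) ** 2
                         for h, r, t in zip(h_r, relation, t_r)))

# Hypothetical toy data: 3-d entity space, 2-d space for a "calls" relation.
M_calls = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0]]
r_calls = [1.0, 0.0]

method_a = [0.0, 0.0, 5.0]
method_b = [1.0, 0.0, -2.0]   # fits "a calls b" exactly in relation space
method_c = [5.0, 3.0, 0.0]    # fits poorly

near = transr_distance(method_a, method_b, r_calls, M_calls)
far = transr_distance(method_a, method_c, r_calls, M_calls)
print(near, far)
```

In a real setting the matrices and vectors would be learned from a graph of code elements and their relations; a small distance would then suggest that two code elements mentioned in the same document are closely related.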