Software ingredients: detection of third-party component reuse in Java software release

2016 
A software product is often dependent on a large number of third-party components.To assess potential risks, such as security vulnerabilities and license violations, a list of components and their versions in a product is important for release engineers and security analysts.Since such a list is not always available, a code comparison technique named Software Bertillonage has been proposed to test whether a product likely includes a copy of a particular component or not.Although the technique can extract candidates of reused components, a user still has to manually identify the original components among the candidates.In this paper, we propose a method to automatically select the most likely origin of components reused in a product, based on an assumption that a product tends to include an entire copy of a component rather than a partial copy.More concretely, given a Java product and a repository of jar files of existing components, our method selects jar files that can provide Java classes to the product in a greedy manner.To compare the method with the existing technique, we have conducted an evaluation using randomly created jar files including up to 1,000 components.The Software Bertillonage technique reports many candidates; the precision and recall are 0.357 and 0.993, respectively.Our method reports a list of original components whose precision and recall are 0.998 and 0.997.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    31
    References
    14
    Citations
    NaN
    KQI
    []