A Study of C/C++ Code Weaknesses on Stack Overflow

2021 
Stack Overflow hosts millions of solutions that aim to solve developers' programming issues. It has become a code hosting website where developers actively share their code. However, code snippets on Stack Overflow may contain security vulnerabilities, and if reused carelessly, such snippets can introduce security problems into software systems. In this paper, we empirically study the prevalence of Common Weakness Enumeration (CWE) instances in the code snippets of C/C++ related answers. We explore the characteristics of Codew, i.e., code snippets that have CWE instances, in terms of the types of weaknesses, the evolution of Codew, and who contributed such code snippets. We find that: 1) 36% (i.e., 32 out of 89) of CWE types occurred in Codew on Stack Overflow. In particular, CWE-119, i.e., improper restriction of operations within the bounds of a memory buffer, is common in both answer code snippets and real-world software systems. Furthermore, the proportion of Codew doubled from 2008 to 2018 after normalizing by the total number of C/C++ snippets in each year. 2) In general, code revisions are associated with a reduction in the number of code weaknesses. However, the majority of Codew had weaknesses introduced in the first version of the code, and these Codew have never been revised since. 3) Only 7.5% of users who contributed C/C++ code snippets posted or edited code with weaknesses. Users contributed less code with CWE weaknesses when they were more active, i.e., when they had revised more code snippets or had a higher reputation. We also find that some users tended to repeat the same CWE type across their various code snippets. Our empirical study provides insights to users who share code snippets on Stack Overflow so that they are aware of the potential security issues. To understand the community's feedback on improving code weaknesses through answer revisions, we also conducted a pilot user study, in which 62.5% of our suggested revisions were adopted by the community.
Stack Overflow can perform CWE scanning for all the code that is hosted on its platform. Further research is needed to improve the quality of the crowdsourced knowledge on Stack Overflow.
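To make the CWE-119 category concrete, the following is a minimal illustrative sketch in C. It is not taken from the studied snippets or from the paper's tooling. The function names `copy_unchecked` and `copy_bounded` are hypothetical and chosen only for this example. The first function shows the kind of unbounded copy that falls under CWE-119; the second shows one possible bounded alternative that reports truncation instead of writing past the buffer.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical example of a CWE-119 pattern: strcpy writes past the end
 * of dst whenever src (including its terminator) is larger than dst. */
void copy_unchecked(char *dst, const char *src) {
    strcpy(dst, src); /* no bounds check on dst */
}

/* Hypothetical bounded alternative: the caller passes the destination
 * size. Returns 0 on success and -1 if the arguments are invalid or the
 * source did not fit (dst then holds a truncated, NUL-terminated copy). */
int copy_bounded(char *dst, size_t dst_size, const char *src) {
    if (dst == NULL || src == NULL || dst_size == 0) {
        return -1;
    }
    /* snprintf never writes more than dst_size bytes, terminator included,
     * and returns the length the full string would have needed. */
    int needed = snprintf(dst, dst_size, "%s", src);
    if (needed < 0 || (size_t)needed >= dst_size) {
        return -1; /* encoding error or truncation */
    }
    return 0;
}
```

With an 8-byte buffer, `copy_bounded(buf, sizeof buf, "hello")` succeeds, while a longer source string returns -1 and leaves a truncated but still terminated string in `buf`.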