Do student programmers all tend to write the same software tests?

2014 
While many educators have added software testing practices to their programming assignments, assessing the effectiveness of student-written tests using statement coverage or branch coverage has limitations. Researchers have begun investigating alternative approaches to assessing student-written tests, and this paper reports on an investigation of the quality of student-written tests in terms of the number of authentic, human-written defects those tests can detect. An experiment was conducted using 101 programs written for a CS2 data structures assignment in which students implemented a queue in two ways, using both an array-based and a link-based representation. Students were required to write their own software tests and were graded in part on the branch coverage they achieved. Using techniques from prior work, we approximated the number of bugs present in the collection of student solutions and identified which of these were detected by each student-written test suite. The results indicate that, while students achieved an average branch coverage of 95.4% on their own solutions, their test suites detected an average of only 13.6% of the faults present in the entire program population. Further, there was a high degree of similarity among 90% of the student test suites. Analysis of the suites suggests that students were following naive, "happy path" testing, writing basic test cases that cover mainstream expected behavior rather than tests designed to detect hidden bugs. These results suggest that educators should strive to reinforce test design techniques intended to find bugs, rather than simply confirming that features work as expected.
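The contrast the paper draws between "happy path" tests and defect-revealing tests is easy to illustrate on the queue assignment itself. The sketch below is not taken from the study; the ArrayQueue class, its fixed-capacity behavior, and the test names are assumptions introduced purely for illustration. It pairs a basic FIFO test of the kind that drives up branch coverage with a boundary-focused test that forces the circular buffer to wrap around, the sort of condition where hidden indexing defects tend to surface.

```java
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class QueueTestStyles {

    /** Minimal circular array-backed queue, standing in for a student solution. */
    static class ArrayQueue<T> {
        private final Object[] data;
        private int front = 0;
        private int size = 0;

        ArrayQueue(int capacity) {
            data = new Object[capacity];
        }

        void enqueue(T item) {
            if (size == data.length) {
                throw new IllegalStateException("queue is full");
            }
            data[(front + size) % data.length] = item;
            size++;
        }

        @SuppressWarnings("unchecked")
        T dequeue() {
            if (size == 0) {
                throw new IllegalStateException("queue is empty");
            }
            T item = (T) data[front];
            data[front] = null;
            front = (front + 1) % data.length;
            size--;
            return item;
        }

        boolean isEmpty() {
            return size == 0;
        }
    }

    // "Happy path" test: enqueue a few items and dequeue them in FIFO order.
    // Tests like this raise branch coverage but rarely expose subtle defects.
    @Test
    public void enqueueThenDequeueReturnsItemsInOrder() {
        ArrayQueue<String> q = new ArrayQueue<>(4);
        q.enqueue("a");
        q.enqueue("b");
        assertEquals("a", q.dequeue());
        assertEquals("b", q.dequeue());
        assertTrue(q.isEmpty());
    }

    // Defect-targeting test: interleave operations so the front and rear
    // indices wrap past the end of the backing array, a boundary condition
    // where off-by-one indexing bugs typically hide.
    @Test
    public void interleavedOperationsWrapAroundCorrectly() {
        ArrayQueue<Integer> q = new ArrayQueue<>(4);
        for (int i = 0; i < 4; i++) {
            q.enqueue(i);
        }
        assertEquals(Integer.valueOf(0), q.dequeue());
        q.enqueue(4); // rear index must wrap back to slot 0
        assertEquals(Integer.valueOf(1), q.dequeue());
        assertEquals(Integer.valueOf(2), q.dequeue());
        assertEquals(Integer.valueOf(3), q.dequeue());
        assertEquals(Integer.valueOf(4), q.dequeue());
        assertTrue(q.isEmpty());
    }
}
```

Both tests can exercise largely the same branches of a straightforward implementation, which mirrors the paper's observation that high branch coverage did not translate into a high defect-detection rate.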