Empirical Investigation of Hyperparameter Optimization for Software Defect Count Prediction

2021 
Abstract

Early identification of defects in software modules can help testers allocate limited resources efficiently. Defect prediction techniques are valuable here because they allow testers to identify and focus on defect-prone parts of the software. Regression is a machine learning approach used to predict the defect count in software segments, and regression techniques are more effective when their hyperparameters are tuned. However, few studies address hyperparameter optimization of regression techniques. In this paper, we investigate the impact of hyperparameter optimization on defect count prediction. In an empirical analysis of 15 software defect datasets, we find that hyperparameter optimization of learning techniques: (1) improves the prediction performance of MLPR, Lasso, DTR, Huber, and SVR by 16.96%, 8.31%, 8.16%, 6.01%, and 5.22%, respectively; (2) does not help linear regression, which is not optimization-sensitive; (3) improves overall prediction performance by 4.42% with grid search and by 3.36% with random search; (4) substantially changes the rankings even of learners whose improvements were not significant; and (5) ranks logistic regression highest with respect to hyperparameter optimization. While both random and grid search performed well, grid search always obtained better outcomes than the default parameters, whereas random search did not always do so. This emphasizes the importance of exploring the parameter space when using parameter-sensitive regression techniques.
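Since the abstract contrasts grid search and random search for tuning regression learners such as SVR, the following minimal sketch shows how the two strategies differ in practice. It assumes scikit-learn and SciPy; the synthetic dataset, parameter ranges, iteration budget, and scoring metric are illustrative assumptions, not the paper's experimental setup.

    # Sketch of grid vs. random hyperparameter search for a regressor.
    # Assumptions: scikit-learn/SciPy available; ranges and data are illustrative.
    from scipy.stats import loguniform
    from sklearn.datasets import make_regression
    from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
    from sklearn.svm import SVR

    X, y = make_regression(n_samples=200, n_features=10, noise=0.5, random_state=0)

    # Grid search: exhaustively evaluates every combination in the grid.
    param_grid = {"C": [0.1, 1, 10, 100], "epsilon": [0.01, 0.1, 1]}
    grid = GridSearchCV(SVR(), param_grid,
                        scoring="neg_mean_absolute_error", cv=5)
    grid.fit(X, y)

    # Random search: samples a fixed number of candidates from distributions,
    # so it covers the same space with fewer (here 12) evaluations.
    param_dist = {"C": loguniform(1e-1, 1e2), "epsilon": loguniform(1e-2, 1e0)}
    rand = RandomizedSearchCV(SVR(), param_dist, n_iter=12,
                              scoring="neg_mean_absolute_error", cv=5,
                              random_state=0)
    rand.fit(X, y)

    print("grid search best:", grid.best_params_, grid.best_score_)
    print("random search best:", rand.best_params_, rand.best_score_)

Comparing each tuned model's cross-validated score against the estimator's default parameters reproduces, in miniature, the comparison the paper draws: the exhaustive grid can never score worse than a default configuration it contains, while a random sample of the space carries no such guarantee.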