A variable selection method for multiclass classification problems using two-class ROC analysis

2018 
Abstract: Modern procedures in analytical chemistry generate enormous amounts of data, which must be processed and interpreted. Treating such high-dimensional datasets often requires first selecting a reduced number of variables, both to extract knowledge about the system under study and to maximize the predictive ability of the models built. This article describes a variable selection method for multiclass classification problems that uses two-class ROC analysis, with the associated area under the ROC curve (AUC) as the selection criterion. The method was successfully applied to two datasets. For comparison, two other variable selection methods, ReliefF and mRMR, were also used, and double cross-validation PLS-DA was applied using (1) all variables and (2) the variables selected by each of the three methods. The results show that appropriate variable selection can substantially reduce the dimensionality of the datasets while maximizing the predictive ability of the models.
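The abstract does not give the details of how the two-class ROC analysis is extended to several classes, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes a one-vs-rest decomposition: for each class, each variable is scored by the AUC obtained when that variable alone is used to separate the class from all others, and variables are then ranked by their best score over the classes. The function names `two_class_auc` and `rank_variables_by_auc`, the one-vs-rest scheme, and the use of the maximum over classes are all assumptions made for illustration.

```python
import numpy as np


def two_class_auc(scores, is_positive):
    """AUC of a single variable for a two-class problem.

    Computed as the probability that a positive sample has a larger
    value than a negative sample, counting ties as one half
    (the Mann-Whitney formulation of the area under the ROC curve).
    """
    pos = scores[is_positive]
    neg = scores[~is_positive]
    diff = pos[:, None] - neg[None, :]  # all positive/negative pairs
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size


def rank_variables_by_auc(X, y):
    """Rank the columns of X by their best one-vs-rest AUC (assumed scheme).

    X: array of shape (n_samples, n_variables)
    y: array of shape (n_samples,) holding class labels
    Returns (order, best), where `order` lists variable indices from most
    to least discriminating and `best` holds each variable's best AUC.
    """
    classes = np.unique(y)
    n_vars = X.shape[1]
    auc = np.empty((len(classes), n_vars))
    for i, c in enumerate(classes):
        mask = y == c  # class c against all remaining classes
        for j in range(n_vars):
            a = two_class_auc(X[:, j], mask)
            # A variable that is lower in the class is as useful as one
            # that is higher, so fold the AUC onto the range [0.5, 1].
            auc[i, j] = max(a, 1.0 - a)
    best = auc.max(axis=0)
    order = np.argsort(-best, kind="stable")
    return order, best
```

In such a scheme, the leading entries of `order` (for example, those whose `best` value exceeds a chosen threshold) would be passed on to the classifier, which in the article is PLS-DA evaluated by double cross-validation. The threshold and the number of variables kept are choices the abstract does not specify.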