Whale optimized mixed kernel function of support vector machine for colorectal cancer diagnosis

2019 
Abstract Microarray analysis is a widely used technique for the classification and prediction of colorectal cancer (CRC). However, microarray data suffer from the curse of dimensionality, and selecting feature genes from imbalanced samples leads to low prediction accuracy. It is therefore important to build models that avoid these problems and predict CRC more accurately. In this paper, we use an ensemble model to classify samples into healthy and CRC groups and to improve prediction performance. The proposed model is composed of three functional modules. The first module removes redundant genes: the main feature genes are selected with the minimum redundancy maximum relevance (mRMR) method, which reduces the dimensionality of the feature space and thereby improves prediction performance. The second module addresses the class imbalance in the data using the hybrid sampling algorithm RUSBoost. The third module optimizes the classification algorithm. We use a support vector machine (SVM) with a mixed kernel function (MKF) to classify an unknown sample as a healthy individual or a CRC patient, and then apply the Whale Optimization Algorithm (WOA) to find the optimal parameters of the proposed MKF-SVM. The results show that the proposed model achieves a higher G-mean than other comparable models. We conclude that the RUSBoost-wrapped WOA + MKF-SVM model can improve the predictive performance for colorectal cancer on imbalanced data.
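
As a rough illustration of two ideas named in the abstract, the sketch below shows one common way to form a mixed kernel and how the G-mean is computed. The abstract does not say which kernels are mixed or how they are weighted, so the convex combination of an RBF and a polynomial kernel used here, together with the names `mixed_kernel`, `g_mean`, `lam`, `gamma`, `degree`, and `coef0`, are assumptions made for illustration only. In a setup like the one described, an optimizer such as WOA would search over parameters of this kind.

```python
import math
from typing import Sequence


def mixed_kernel(
    x: Sequence[float],
    y: Sequence[float],
    lam: float = 0.5,    # hypothetical mixing weight in [0, 1]
    gamma: float = 0.1,  # hypothetical RBF width parameter
    degree: int = 2,     # hypothetical polynomial degree
    coef0: float = 1.0,  # hypothetical polynomial offset
) -> float:
    """Convex combination of an RBF kernel and a polynomial kernel.

    Assumed form (not taken from the paper):
        K(x, y) = lam * exp(-gamma * ||x - y||^2)
                  + (1 - lam) * (x . y + coef0) ** degree
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    dot = sum(a * b for a, b in zip(x, y))
    rbf = math.exp(-gamma * sq_dist)
    poly = (dot + coef0) ** degree
    return lam * rbf + (1.0 - lam) * poly


def g_mean(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Geometric mean of sensitivity and specificity for binary labels."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)
```

Because the G-mean multiplies the per-class recall values, a classifier that labels every sample as the majority class scores 0, which is why it is a common choice of metric for imbalanced data.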