SMOTE-SMO-based expert system for type II diabetes detection using PIMA dataset

2021 
Medical data can be used to identify people at risk of a specific complication or disease by applying appropriate data mining (DM) techniques. DM is applied to extract information for the diagnosis, prediction, prevention, and treatment of various diseases. According to the International Diabetes Federation (IDF) 2019 atlas report, diabetes caused 4.2 million deaths worldwide, and hence it is critical to diagnose diabetes at an early stage. Although many techniques are available to diagnose diabetes, existing methods do not find hidden patterns with the accuracy needed for correct decision-making. This paper therefore presents an integrated approach that combines the synthetic minority oversampling technique (SMOTE) with the sequential minimal optimization (SMO) algorithm for predicting diabetes. The proposed classification model has two phases: the data are first pre-processed with the SMOTE algorithm, and the pre-processed output is then passed to the SMO classifier to improve its performance. The proposed model achieved an accuracy of 99.07% on the PIMA Indian diabetes dataset (PIDD). The PIDD was obtained from the UCI repository for this work; the dataset originates from the National Institute of Diabetes and Digestive and Kidney Diseases. It contains records of 768 female patients, each with eight numeric attributes and one decision class attribute. The results of the study confirm that the proposed integrated DM approach could be used as an expert system for diagnosing diabetes in patients at an early stage. The features extracted in this study will be used to develop a prognostic tool in the form of a mobile application for early diabetes detection.
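The oversampling phase described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `smote`, its parameters, and the two-feature toy records are assumptions made for illustration only. It shows the core SMOTE idea, in which each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours.

```python
import random
from math import dist


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative sketch only; names and defaults are assumptions.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical two-feature records (made-up values, not taken from the PIDD).
minority = [(148.0, 33.6), (183.0, 23.3), (137.0, 43.1), (168.0, 38.0)]
majority_count = 7  # assumed size of the majority class in this toy example

new_points = smote(minority, n_new=majority_count - len(minority), k=3)
balanced_minority = minority + new_points
```

In the two-phase model, the balanced data produced by a step like this would then be used to train the SMO classifier; that second phase is not shown here.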