Comparative study of Ensemble learning Algorithms on Early Stage Diabetes Risk Prediction

2021 
Diabetes mellitus is among the most rapidly spreading diseases and accounts for a considerable number of deaths worldwide. It is characterized by the level of glucose, a sugar, in the blood. Many techniques have been proposed for predicting the risk of this disease. Predicting diabetes at an early stage requires adequate and concise data on diabetic patients. In this paper, 520 records from a hospital in Bangladesh are used for prediction. The dataset is publicly available from the UCI Machine Learning Repository. After feature selection, we apply the XGBoost, Random Forest, Gradient Boosting, and Bagging algorithms. Random Forest achieves the best test accuracy, 99.03%, and an accuracy of 96.88% under 10-fold cross-validation.
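The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses scikit-learn, replaces the UCI records with synthetic data of the same size (520 rows, with an assumed 16 features), skips the feature-selection step, and omits XGBoost because it requires a separate package. The function name `compare_ensembles`, the 80/20 split, and all hyperparameters are hypothetical choices, so the scores it produces say nothing about the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import (
    StratifiedKFold,
    cross_val_score,
    train_test_split,
)


def compare_ensembles(random_state=0):
    """Return {model name: (held-out test accuracy, mean 10-fold CV accuracy)}."""
    # Synthetic stand-in for the 520-record dataset (assumption, not the real data).
    X, y = make_classification(
        n_samples=520, n_features=16, n_informative=8, random_state=random_state
    )

    # Hypothetical 80/20 stratified split for the held-out test accuracy.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=random_state
    )

    models = {
        "Random Forest": RandomForestClassifier(
            n_estimators=100, random_state=random_state
        ),
        "Gradient Boosting": GradientBoostingClassifier(random_state=random_state),
        "Bagging": BaggingClassifier(n_estimators=50, random_state=random_state),
    }

    # 10-fold cross-validation, stratified so each fold keeps the class balance.
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=random_state)

    results = {}
    for name, model in models.items():
        # cross_val_score clones the model, so the later fit is independent.
        cv_accuracy = cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        test_accuracy = model.fit(X_train, y_train).score(X_test, y_test)
        results[name] = (test_accuracy, cv_accuracy)
    return results


if __name__ == "__main__":
    for name, (test_acc, cv_acc) in compare_ensembles().items():
        print(f"{name}: test={test_acc:.4f}, 10-fold CV={cv_acc:.4f}")
```

Reporting both numbers, as the abstract does, is useful because a single held-out split can be optimistic or pessimistic by chance, while the cross-validated mean averages over ten different splits.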