Comparison among random forest, logistic regression, and existing clinical risk scores for predicting outcomes in patients with atrial fibrillation: A report from the J-RHYTHM registry.

2021 
BACKGROUND Machine learning (ML) has emerged as a promising tool for risk stratification. However, few studies have applied ML to risk assessment of patients with atrial fibrillation (AF).

HYPOTHESIS We aimed to compare the performance of random forest (RF), logistic regression (LR), and conventional risk schemes in predicting the outcomes of AF.

METHODS We analyzed data from 7406 nonvalvular AF patients (median age 71 years, female 29.2%) who were enrolled in a nationwide AF registry (J-RHYTHM Registry) and followed for 2 years. The endpoints were thromboembolism, major bleeding, and all-cause mortality. Models were generated from potential predictors using an RF model and a stepwise LR model, and were compared with the thromboembolism (CHADS2 and CHA2DS2-VASc) and major bleeding (HAS-BLED, ORBIT, and ATRIA) scores.

RESULTS For thromboembolism, the C-statistic of the RF model was significantly higher than that of the LR model (0.66 vs. 0.59, p = .03) or the CHA2DS2-VASc score (0.61, p < .01). For major bleeding, the C-statistic of the RF model was comparable to that of the LR model (0.69 vs. 0.66, p = .07); the RF model outperformed the HAS-BLED (0.61, p < .01) and ATRIA (0.62, p < .01) scores but not the ORBIT score (0.67, p = .07). For all-cause mortality, the C-statistic of the RF model was comparable to that of the LR model (0.78 vs. 0.79, p = .21). The calibration plot for the RF model was more closely aligned with the observed events for major bleeding and all-cause mortality.

CONCLUSIONS The RF model performed as well as or better than the LR model or existing clinical risk scores for predicting clinical outcomes of AF.
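
The C-statistic used to compare the models can be illustrated with a minimal sketch. For a binary outcome it is the proportion of (event, non-event) patient pairs in which the patient who had the event was assigned the higher predicted risk, with ties counted as one half. The function name `c_statistic` and the toy data are hypothetical and for illustration only; they are not registry data, and this is not the authors' analysis code, which the abstract does not describe.

```python
from itertools import product


def c_statistic(risks, events):
    """Concordance for a binary outcome: the share of (event, non-event)
    pairs where the event case has the higher predicted risk.
    Tied risks contribute 0.5 to the count."""
    with_event = [r for r, e in zip(risks, events) if e == 1]
    without_event = [r for r, e in zip(risks, events) if e == 0]
    if not with_event or not without_event:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for r_event, r_no_event in product(with_event, without_event):
        if r_event > r_no_event:
            concordant += 1.0
        elif r_event == r_no_event:
            concordant += 0.5
    return concordant / (len(with_event) * len(without_event))


# Hypothetical toy data (assumption, not from the J-RHYTHM Registry):
# 1 = patient had the endpoint during follow-up, 0 = did not.
events = [1, 0, 1, 0, 0, 1]
# Predicted probabilities from some model, e.g. a random forest.
model_risk = [0.9, 0.2, 0.6, 0.4, 0.7, 0.8]
# Integer points from some additive clinical score.
score_points = [2, 1, 1, 1, 3, 2]

print("model C-statistic:", round(c_statistic(model_risk, events), 3))
print("score C-statistic:", round(c_statistic(score_points, events), 3))
```

Because the calculation depends only on how patients are ranked, a probability-producing model and an integer point score can be compared on the same scale. The sketch does not cover the significance tests behind the reported p-values or the calibration assessment.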