Characterizing and Modeling for Proactive Disk Failure Prediction to Improve Reliability of Data Centers

2020 
In modern data centers, hard disk drives have the highest failure rate. Current storage systems include data protection features to avoid data loss caused by disk failure. However, the data reconstruction process always slows down or even suspends system services. If disk failures can be predicted accurately, data protection mechanisms can be triggered before the failures actually happen. Disk failure prediction can therefore dramatically improve the reliability and availability of storage systems. This paper analyzes disk SMART data features in detail. Based on the analysis results, we design an effective feature extraction and preprocessing method and optimize the XGBoost hyperparameters. Finally, ensemble learning is applied to further improve prediction accuracy. Experimental results on the Alibaba data set show that our system predicts disk failures within 30 days, with an F-score of 39.98.
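To make the reported metric concrete, here is a minimal sketch of how an F-score can be computed from binary failure predictions. It assumes the standard F1 definition (harmonic mean of precision and recall); the abstract does not state which variant or scale the authors use. The function name `f_score` and the toy labels are hypothetical and are not taken from the paper.

```python
def f_score(y_true, y_pred, beta=1.0):
    """F-beta score for binary labels, where 1 marks a failing disk."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # failures caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed failures
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical toy labels for eight disks, not data from the paper.
y_true = [1, 0, 0, 1, 1, 0, 0, 0]
y_pred = [1, 0, 1, 0, 1, 0, 0, 0]
score = f_score(y_true, y_pred)
print(f"F-score: {score:.4f}")
```

Because failed disks are rare compared with healthy ones, a metric of this kind is more informative than plain accuracy: a model that never predicts a failure would score 0 here despite being right about most disks.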