Prediction of non-metallic inclusions in steel wires for tire reinforcement by means of machine learning algorithms

2019 
This study was aimed at developing a reliable Machine Learning algorithm to classify castings of steel for tire reinforcement depending on the number and properties of inclusions, experimentally determined. 855 castings were available for training, validation and testing. 140 parameters are monitored during fabrication, which are the features of the analysis; the output is 1 or 0 depending on whether the casting is rejected or not. The following algorithms have been employed: Logistic Regression, K-Nearest Neighbors, Support Vector Classifier, Random Forests, AdaBoost, Gradient Boosting and Artificial Neural Networks. The reduced value of the rejection rate implies that classification must be carried out on an imbalanced dataset. Resampling methods and specific scores for imbalanced datasets (Recall, Precision and AUC rather than Accuracy) were used. Random Forest was the most successful method providing an area under the curve in the test set of 0.85. No significant improvements were detected after resampling. It has been proved that this tool allows the samples with a higher probability of being rejected to be selected, improving the effectiveness of the quality control. In addition, the optimized Random Forest has enabled to identify the most important features, which have been satisfactorily interpreted on a metallurgical basis.This study was aimed at developing a reliable Machine Learning algorithm to classify castings of steel for tire reinforcement depending on the number and properties of inclusions, experimentally determined. 855 castings were available for training, validation and testing. 140 parameters are monitored during fabrication, which are the features of the analysis; the output is 1 or 0 depending on whether the casting is rejected or not. The following algorithms have been employed: Logistic Regression, K-Nearest Neighbors, Support Vector Classifier, Random Forests, AdaBoost, Gradient Boosting and Artificial Neural Networks. The reduced value of the rejection rate implies that classification must be carried out on an imbalanced dataset. Resampling methods and specific scores for imbalanced datasets (Recall, Precision and AUC rather than Accuracy) were used. Random Forest was the most successful method providing an area under the curve in the test set of 0.85. No significant improvements were detected after resam...
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    1
    References
    0
    Citations
    NaN
    KQI
    []