Full model selection using regression trees for numeric predictions of biomarkers for metabolic challenges in dairy cows.

2021 
Abstract Dairy cows suffer from poor metabolic adaptation syndrome (PMAS) during the early post-calving period, caused by negative energy balance. Measurement of blood beta-hydroxybutyric acid (BHBA) and blood non-esterified fatty acids (NEFA) allows early and accurate detection of negative energy balance. Machine learning prediction of blood BHBA and blood NEFA from milk testing samples represents an opportunity to identify at-risk animals with less labor than direct blood testing. Routine milk testing on modern dairies and computerized record keeping provide an immense amount of data that can be used in machine learning models. Previous research on predicting blood metabolites from Fourier-transform infrared spectroscopy (FTIR) milk data has focused mainly on individual models rather than on a comparison among models. Full model selection is the process of comparing different combinations of pre-processing methods, variable selection, and statistical learning algorithms to determine which model yields the lowest prediction error for a given dataset. For this project we used a full model selection approach with regression trees (rtFMS). rtFMS feeds the cross-validated performance of different model configurations into a regression tree, which is then used to select a final model. A total of 384 possible model configurations (algorithms, predictors, and data pre-processing options) were considered for each outcome (blood BHBA and blood NEFA). rtFMS allows direct comparison of multiple modeling approaches, reducing bias due to empirical knowledge, modeling habits, or preferences, and identifies the model with the minimal root mean squared prediction error (RMSE). An elastic net regression model was selected as the best performing model for both biomarkers. For blood BHBA, the input data were FTIR milk spectra with second derivative pre-processing and a filter with 212 wavenumbers, giving RMSE = 0.354 (0.328−0.392). For blood NEFA, the input data were FTIR milk spectra with second derivative pre-processing and a filter with 212 wavenumbers, along with the time of milking, giving RMSE = 0.601 (0.564−0.654). The comparison of multiple modeling strategies conducted by rtFMS presents an option for improving FTIR prediction models of blood BHBA and blood NEFA by reducing error due to human bias. Using rtFMS to design future prediction models can guide the choice of model inputs and features. Our prediction models have the potential to increase early detection of metabolic disorders in dairy cows during the transition period.
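The idea of feeding cross-validated errors into a regression tree can be illustrated with a minimal sketch. It is not the authors' implementation. The factor levels, the 18-configuration grid, the simulated RMSE values, and the function names (`all_configurations`, `simulated_cv_rmse`, `best_split`, `select_model`) are all assumptions made for illustration. The sketch assumes that each configuration has already been scored by cross-validation, that a small tree is grown on the configuration factors, and that the final model is taken from the leaf with the lowest mean error.

```python
import itertools
import random
import statistics

# Hypothetical factors and levels, chosen for illustration only
# (the study considered 384 configurations; this toy grid has 18).
FACTORS = {
    "algorithm": ["elastic_net", "pls", "random_forest"],
    "preprocessing": ["raw", "first_derivative", "second_derivative"],
    "predictors": ["spectra", "spectra_plus_milking_time"],
}


def all_configurations():
    """Enumerate every combination of factor levels."""
    names = list(FACTORS)
    return [dict(zip(names, levels))
            for levels in itertools.product(*FACTORS.values())]


def simulated_cv_rmse(config, rng):
    """Placeholder for real k-fold cross-validation of one configuration.

    The effects below are invented so the example has structure to find.
    """
    error = 0.60
    if config["algorithm"] == "elastic_net":
        error -= 0.10
    if config["preprocessing"] == "second_derivative":
        error -= 0.05
    return error + rng.uniform(-0.002, 0.002)


def sse(values):
    """Sum of squared deviations from the mean (regression tree impurity)."""
    mean = statistics.fmean(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows):
    """Return the binary split (factor == level) with the lowest total SSE."""
    best = None
    for factor, levels in FACTORS.items():
        for level in levels:
            left = [r for r in rows if r["config"][factor] == level]
            right = [r for r in rows if r["config"][factor] != level]
            if not left or not right:
                continue  # split does not separate anything
            score = (sse([r["rmse"] for r in left])
                     + sse([r["rmse"] for r in right]))
            if best is None or score < best[0]:
                best = (score, factor, level, left, right)
    return best


def select_model(rows, max_depth=2):
    """Descend the tree toward the lower-error branch, then pick the best row."""
    path = []
    for _ in range(max_depth):
        split = best_split(rows)
        if split is None:
            break
        _, factor, level, left, right = split
        left_mean = statistics.fmean([r["rmse"] for r in left])
        right_mean = statistics.fmean([r["rmse"] for r in right])
        if left_mean <= right_mean:
            rows, step = left, (factor, level, "==")
        else:
            rows, step = right, (factor, level, "!=")
        path.append(step)
    return min(rows, key=lambda r: r["rmse"]), path


rng = random.Random(0)
results = [{"config": c, "rmse": simulated_cv_rmse(c, rng)}
           for c in all_configurations()]
selected, path = select_model(results)
print("tree path:", path)
print("selected configuration:", selected["config"])
```

In a real analysis, `simulated_cv_rmse` would be replaced by fitting each configuration on training folds and computing RMSE on held-out folds. The tree path then shows which configuration choices matter most for prediction error, which is what makes the selection less dependent on the analyst's habits.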