Evaluation of Machine Learning predictions of a highly resolved long time series of Chlorophyll-a concentration

2021 
Pelagic chlorophyll-a concentrations are key to evaluating the environmental status and productivity of marine systems. In this study, chlorophyll-a concentrations for the Helgoland Roads Time Series were modeled using a number of measured water and environmental parameters. We chose three common Machine Learning algorithms from the literature: Support Vector Machine Regressor, Neural Network Multi-layer Perceptron Regressor, and Random Forest Regressor. The Support Vector Machine Regressor slightly outperformed the other models. Evaluation with a test dataset and verification with an independent validation dataset for chlorophyll-a concentrations showed good generalization capacity, with root mean squared errors below 1 µg L⁻¹. Feature selection and engineering were important and improved the models significantly, raising the adjusted R² by at least 48%. We also tested SARIMA for comparison and found that its univariate nature does not allow for better results than the Machine Learning models. Additionally, the computing time required for SARIMA was much higher, to the point of being prohibitive.
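The two performance measures named in the abstract, root mean squared error and adjusted R², can be sketched in a few lines of Python. This is only an illustrative sketch using the standard textbook definitions. The function names, the sample values, and the feature count below are hypothetical and are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target (e.g. µg/L)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def adjusted_r2(y_true, y_pred, n_features):
    """Adjusted R²: R² penalized for the number of predictors in the model."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)


# Made-up values for illustration only; not data from the Helgoland Roads series.
observed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
predicted = [1.5, 2.0, 2.5, 4.0, 5.5, 6.0]

print(rmse(observed, predicted))
print(adjusted_r2(observed, predicted, n_features=2))
```

Because adjusted R² penalizes every added predictor, it is a reasonable way to judge whether feature selection and engineering actually help, rather than merely adding inputs.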