Generating high-resolution daily soil moisture by using spatial downscaling techniques: a comparison of six machine learning algorithms

2020 
Abstract Considerable effort has gone into obtaining surface soil moisture (SM) at high spatial resolution from microwave-based products via spatial downscaling. In recent years, machine learning has become one of the most advanced techniques for SM spatial downscaling. The performance of a machine learning technique in SM spatial downscaling varies with the algorithm and the underlying surface; however, despite the importance of machine learning for SM downscaling, there are still only a few intercomparisons, particularly over different surfaces. In this study, the performance of multiple machine learning algorithms in downscaling the Essential Climate Variable (ECV) SM dataset, produced under a program initiated by the European Space Agency, was validated over different underlying surfaces. Six machine learning algorithms, namely artificial neural network (ANN), Bayesian (BAYE), classification and regression trees (CART), k-nearest neighbor (KNN), random forest (RF), and support vector machine (SVM), were implemented to establish the spatial downscaling models with reliable continuous in-situ SM observations over four case study areas: the Oklahoma Mesonet (OKM) in North America, the Naqu network (NAN) on the Tibetan Plateau, the REMEDHUS (REM) network in northwest Spain, and OZNET (OZN) in southeast Australia. The land surface temperature (LST), normalized difference vegetation index (NDVI), albedo, digital elevation model (DEM), and geographic coordinates were the explanatory variables, and their contributions to the downscaling models over different surfaces were quantified. The conclusions of the experiments can be summarized as follows: (1) The RF achieved excellent performance, with a high correlation coefficient and a low regression error. The BAYE and KNN also demonstrated favorable capabilities for SM downscaling; however, the robustness of these algorithms needs further improvement. Numerous abnormal values were produced during downscaling by the ANN, CART, and SVM methods, suggesting that they are comparatively inadequate for SM downscaling. (2) Downscaled 1-km resolution SM in REM generally showed a close correlation with the in-situ measurements, but its bias was larger than that in the other three regions. By comparison, the smallest bias, together with the second-highest correlation, was found in the OZN region. It was inferred that regions located in a single climate zone, with mild topographic variation and medium vegetation coverage, tend to yield high-accuracy results. (3) The feature importance index (FII) calculated by the RF model revealed that the DEM, daytime LST, and NDVI were dominant during reconstruction, particularly the DEM in a study region with a large elevation range. The specific FII of each independent variable varied remarkably across the case study areas, probably owing to complex hydrothermal and physical geography conditions. The results of this study demonstrate that the RF model outperforms the other models considered herein and show how the FII of the variables differs over different underlying surfaces.
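The abstract judges each downscaling model by its correlation coefficient, regression error, and bias against in-situ measurements, without giving the formulas. A minimal sketch of such a validation step follows, assuming Pearson correlation, mean bias (downscaled minus in-situ), and root-mean-square error as the metrics. The function name `validation_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def validation_metrics(predicted, observed):
    """Return (r, bias, rmse) for paired downscaled and in-situ SM values.

    Assumed definitions (not stated in the abstract):
      r    : Pearson correlation coefficient
      bias : mean(predicted) - mean(observed)
      rmse : root-mean-square error
    """
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two equal-length series with at least 2 samples")

    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n

    # Sums of squared deviations and cross-deviations for Pearson r.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant series")

    r = cov / math.sqrt(var_p * var_o)
    bias = mean_p - mean_o
    rmse = math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)
    return r, bias, rmse


if __name__ == "__main__":
    # Made-up volumetric SM values (m^3/m^3), for illustration only.
    downscaled = [0.21, 0.25, 0.30, 0.28, 0.35]
    in_situ = [0.20, 0.24, 0.27, 0.29, 0.33]
    r, bias, rmse = validation_metrics(downscaled, in_situ)
    print(f"r={r:.3f} bias={bias:.3f} rmse={rmse:.3f}")
```

Reporting bias alongside correlation matters for a comparison like the one summarized in conclusion (2): a model can track the temporal pattern of the in-situ series closely (high r) while still being systematically too wet or too dry (large bias).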