Digital mapping of soil properties using multiple machine learning in a semi-arid region, central Iran

2019 
Abstract Knowledge of the distribution of soil properties over the landscape is required for a variety of land and resource management applications and for modeling and monitoring practices. The main aim of this research was to spatially predict topsoil properties such as soil organic carbon (SOC), calcium carbonate equivalent (CCE), and clay content using digital soil mapping (DSM) approaches in the Borujen region, Chaharmahal-Va-Bakhtiari province, central Iran. To achieve this goal, a total of 334 soil samples were collected from the 0 to 30 cm depth. Three non-linear models, Cubist (Cu), Random Forest (RF), and Regression Tree (RT), and a Multiple Linear Regression (MLR) model were used to link environmental covariates to the studied soil properties. The environmental covariates were obtained from a digital elevation model (DEM) and satellite imagery (Landsat Enhanced Thematic Mapper; ETM). The models were calibrated and validated using 10-fold cross-validation. Root mean square error (RMSE) and the coefficient of determination (R²) were used to assess model performance, and relative RMSE (RMSE%) was used to define prediction accuracy. According to RMSE and R², Cu and RF gave the most accurate predictions for CCE (R² = 0.30, RMSE = 9.52) and clay content (R² = 0.15, RMSE = 7.86), respectively, while the RF and Cu models both showed the highest performance in predicting SOC content (R² = 0.55). Remote sensing covariates (Ratio Vegetation Index and band 4) were the most important variables for explaining the variability of SOC and CCE content, whereas only topographic attributes were responsible for the variation in clay content. The RMSE% results indicate that the best-performing model does not necessarily produce the most accurate estimates. This study recommends more observations and denser sampling across the entire study area. Alternatively, stratified sampling by elevation in homogeneous sub-areas is recommended, which would probably improve model performance.
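The evaluation procedure named in the abstract (10-fold cross-validation scored with RMSE, R², and RMSE%) can be illustrated with a minimal Python sketch. Several things here are assumptions rather than details from the study: the data are synthetic, a one-covariate least-squares line stands in for the MLR model, RMSE% is taken to be RMSE divided by the mean of the observed values times 100 (the abstract does not give its formula), and all function names are hypothetical.

```python
import math
import random


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, computed as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def relative_rmse(observed, predicted):
    """RMSE as a percentage of the observed mean (assumed definition)."""
    return 100.0 * rmse(observed, predicted) / (sum(observed) / len(observed))


def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(x, y):
    """Least squares for y = a + b*x (one-covariate stand-in for MLR)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b


def cross_validate(x, y, k=10, seed=0):
    """Return out-of-fold predictions: each sample is predicted by a
    model fitted on the other k-1 folds."""
    preds = [None] * len(y)
    for fold in kfold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in fold:
            preds[i] = a + b * x[i]
    return preds


# Synthetic data only; 334 matches the sample count in the abstract,
# but the values themselves are invented for illustration.
rng = random.Random(42)
x = [rng.uniform(0.0, 1.0) for _ in range(334)]
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.5) for xi in x]

preds = cross_validate(x, y, k=10)
print(f"RMSE  = {rmse(y, preds):.2f}")
print(f"R^2   = {r_squared(y, preds):.2f}")
print(f"RMSE% = {relative_rmse(y, preds):.1f}")
```

Because RMSE is expressed in the units of each soil property, it cannot be compared directly between SOC, CCE, and clay content. A relative measure such as the one assumed here rescales the error by the typical magnitude of the property, which is one way a ranking by RMSE% can differ from a ranking by RMSE and R².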