Machine learning-based source identification and spatial prediction of heavy metals in soil in a rapid urbanization area, eastern China

2020 
Abstract Accelerated urbanization has resulted in the accumulation of considerable amounts of heavy metals (HMs) in urban soils. To develop strategies for urban soil management, it is important to identify correlations between the urbanization process and HM accumulation in the soil and to predict the spatial distribution of soil HMs from variables related to urban expansion. However, accurate predictions of urban soil HMs based on predictors associated with urbanization are still lacking. In this study, 251 topsoil samples (0-20 cm) were collected using the grid-sampling method (2 km × 2 km) in a rapid urbanization area (Hefei City, China). The concentrations of As, Zn, Pb, Hg, Ni, Cu, Cr, and Cd in the soil, as well as selected soil attributes affected by urbanization, were determined. The areas of different land use types in each grid, the urbanization history, and the soil properties of the site were used as predictors. The overall distribution of soil HMs was then predicted using random forest (RF), artificial neural network (ANN), and support vector machine (SVM) models. The results showed that the concentrations of As, Zn, Pb, Hg, Cu, and Cd increased significantly with increasing urbanization history. However, the highest concentrations of Ni and Cr were observed in soils between the 2nd and 3rd ring roads. According to the RF model, soil CaO, organic matter (OM), sulfur, phosphorus, and the surrounding built-up area were the most important factors for soil Zn, Pb, Cu, and Cd, indicating predominantly anthropogenic control of these HMs. The level of Hg in the soil was also likely related to human emissions, given the importance of urbanization history and the surrounding construction area (CA) in governing the spatial distribution of Hg. The influence of Fe2O3, Al2O3, and SiO2 on soil As, Ni, and Cr indicates that these HMs originate primarily from natural processes.
In comparison, the SVM and RF models yielded higher R² and lower error indices than the ANN model, suggesting that SVM and RF can predict urban soil HMs satisfactorily. When independent predictors were used for soil HM prediction, ANN, RF, and SVM also produced significant predictions. Furthermore, the performance of the ANN, RF, and SVM models is expected to improve with the introduction of variables that reflect the sources, transport, and retention of HMs in urban soils.
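
The model comparison above rests on R² and error indices. As a minimal sketch of how such a comparison can be computed, the Python below scores two sets of predictions against the same observations. The abstract does not name its error indices, so RMSE and MAE are assumed here as common choices. The function names and all numbers are hypothetical illustrations, not data or results from the study.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error, in the units of the observations."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


# Made-up concentrations (e.g. mg/kg) for four held-out samples.
observed = [10.0, 20.0, 30.0, 40.0]

# Made-up predictions from two hypothetical models.
predictions = {
    "model_a": [12.0, 18.0, 33.0, 39.0],
    "model_b": [20.0, 20.0, 30.0, 30.0],
}

# Collect the three metrics per model.
scores = {
    name: {
        "r2": r_squared(observed, pred),
        "rmse": rmse(observed, pred),
        "mae": mae(observed, pred),
    }
    for name, pred in predictions.items()
}

# Higher R² together with lower RMSE and MAE indicates the better fit.
best = max(scores, key=lambda name: scores[name]["r2"])

for name, s in scores.items():
    print(f"{name}: R2={s['r2']:.3f} RMSE={s['rmse']:.3f} MAE={s['mae']:.3f}")
print("best by R2:", best)
```

In this toy case model_a has R² = 0.964 and model_b has R² = 0.600, and model_a also has the lower RMSE and MAE, so the metrics agree on the ranking. For a fair comparison, such metrics should be computed on samples that were held out from model fitting.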