Machine-Learning Predictions of High Arsenic and High Manganese at Drinking Water Depths of the Glacial Aquifer System, Northern Continental United States.

2021 
Globally, over 200 million people are chronically exposed to arsenic (As) and/or manganese (Mn) from drinking water. We used machine-learning (ML) boosted regression tree (BRT) models to predict high As (>10 μg/L) and high Mn (>300 μg/L) in groundwater from the glacial aquifer system (GLAC), which spans 25 states in the northern United States and provides drinking water to 30 million people. Our BRT models' predictor variables (PVs) included recently developed three-dimensional estimates of a suite of groundwater age metrics, redox condition, and pH. We also demonstrated a successful approach to significantly improve ML prediction sensitivity for imbalanced data sets (those with a small percentage of high values). We present predictions of the probability of high As and high Mn concentrations in groundwater, along with their uncertainty, at two nonuniform depth surfaces that represent moving median depths of GLAC domestic and public supply wells within the three-dimensional model domain. Predicted high likelihood of anoxic condition (high iron or low dissolved oxygen), predicted pH, relative well depth, several modeled groundwater age metrics, and hydrologic position were all PVs retained in both models; however, PV importance and influence differed between the models. High-As and high-Mn groundwater was predicted with high likelihood over large portions of the central part of the GLAC.
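
The abstract does not say which approach was used to improve sensitivity on imbalanced data. As a minimal illustration of the quantity involved only, the Python sketch below computes sensitivity (the fraction of truly high samples that are flagged) and specificity at two probability cutoffs. The toy labels, the predicted probabilities, and the idea of lowering the cutoff are all hypothetical assumptions for illustration, not the authors' method or data.

```python
def sensitivity_specificity(y_true, p_pred, threshold):
    """Return (sensitivity, specificity) for binary labels and predicted probabilities."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, p_pred):
        predicted_high = p >= threshold
        if y == 1 and predicted_high:
            tp += 1          # high sample correctly flagged
        elif y == 1:
            fn += 1          # high sample missed
        elif predicted_high:
            fp += 1          # low sample wrongly flagged
        else:
            tn += 1          # low sample correctly passed
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical toy data: only 2 of 10 samples are "high" (an imbalanced set).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
p_pred = [0.60, 0.30, 0.40, 0.20, 0.10, 0.10, 0.05, 0.05, 0.02, 0.01]

default = sensitivity_specificity(y_true, p_pred, 0.5)    # conventional cutoff
lowered = sensitivity_specificity(y_true, p_pred, 0.25)   # assumed lower cutoff
print(default, lowered)
```

With few high values, a model's predicted probabilities for the rare class tend to sit below 0.5, so a conventional cutoff can miss many of them; any remedy trades some specificity for the gain in sensitivity.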