Geochemical Prospectivity Mapping Through a Feature Extraction–Selection Classification Scheme

2018 
Machine learning (ML) schemes can enhance success in geochemical prospectivity mapping. This study examines the effectiveness of several feature extraction and selection approaches, combined with a variety of ML algorithms and applied to multielement soil and lithogeochemical data, in identifying new prospective Pb–Zn mineralisation in the Irankuh area. Singular value decomposition (SVD) was used as a dimensionality reduction technique to remove noise from the geochemical data. This was followed by the application of feature selection techniques, including filter-based methods such as principal component analysis (PCA), Pearson’s correlation coefficient (PCC), correlation-based feature selection (CFS) and information gain ratio (IGR), as well as wrapper models, in combination with support vector machines, logistic regression and random forests. The performance of the ML algorithms, assisted by the feature extraction and selection methods, was then assessed using 10-fold cross-validation with separate training and testing data subsets. SVD boosted the performance of support vector machines, logistic regression and random forests. The ML algorithms were particularly effective when using two transformed principal components that are linked to a suite of elements associated with the sulphide mineralisation and with variations in the host lithologies. PCA and PCC were generally the most effective feature selection methods for support vector machines. Logistic regression provided a better classification with PCA, IGR and a wrapper model, whereas random forests delivered more accurate outcomes with PCA and PCC. A geochemical prospectivity map of the study area was derived from support vector machines trained with two principal components, the best-performing ML scheme, and delineates three new and distinct targets for more detailed exploration.
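To make two of the steps named in the abstract concrete, the following is a minimal sketch of a PCC filter (ranking element concentrations by the absolute value of their Pearson correlation with a mineralised/barren label) and of a 10-fold split into training and testing indices. It is an illustration under stated assumptions, not the study's implementation: the function names, the three-element toy table and the labels are all hypothetical, and the contiguous, unshuffled fold assignment is only one simple way to build the folds.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_features_by_pcc(samples, labels, names):
    """Return (name, |r|) pairs sorted from most to least correlated with the label."""
    scores = []
    for j, name in enumerate(names):
        column = [row[j] for row in samples]
        scores.append((name, abs(pearson(column, labels))))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous (train, test) index pairs."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds


# Hypothetical toy table (NOT the study's data): concentrations per sample,
# with label 1 for a mineralised sample and 0 for a barren one.
ELEMENTS = ["Pb", "Zn", "Ba"]
SAMPLES = [
    [120.0, 300.0, 50.0],
    [15.0, 40.0, 55.0],
    [200.0, 450.0, 45.0],
    [10.0, 35.0, 60.0],
    [180.0, 390.0, 52.0],
    [12.0, 30.0, 48.0],
]
LABELS = [1, 0, 1, 0, 1, 0]

RANKING = rank_features_by_pcc(SAMPLES, LABELS, ELEMENTS)
TOP_TWO = [name for name, _ in RANKING[:2]]  # features a classifier would keep
FOLDS = k_fold_indices(25, k=10)             # 25 is an arbitrary sample count
```

In a fuller workflow, the features kept by the filter would be passed to a classifier such as a support vector machine, which would be fitted on each fold's training indices and scored on the matching test indices.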