One-step spectral rotation clustering for imbalanced high-dimensional data

2021 
Abstract In practical applications, the class distribution of imbalanced data sets is skewed. Because traditional clustering methods are mainly designed to improve overall learning performance, they tend to cluster the majority class well, while the minority class, which is more valuable, may be ignored. Moreover, existing clustering methods can perform poorly in imbalanced and high-dimensional domains. In this paper, we present one-step spectral rotation clustering for imbalanced high-dimensional data (OSRCIH), which integrates self-paced learning and spectral rotation clustering in a unified learning framework where sample selection and dimensionality reduction are considered simultaneously and updated mutually and iteratively. Specifically, the imbalance problem is addressed by selecting the same number of training samples from each intrinsic group of the training data, where the sample-weight vector is obtained by self-paced learning. Dimensionality reduction is conducted by combining subspace learning and feature selection. Experimental analysis on synthetic and real datasets showed that OSRCIH could recognize and increase the weight of important samples and features, thereby preventing the clustering method from favoring the majority class and effectively improving clustering performance.
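The balanced sample-selection idea can be illustrated with a minimal sketch. It is not the OSRCIH optimization itself: it assumes a hard (0/1) self-paced weighting in which the lowest-loss samples count as "easy", and it takes the group assignments and per-sample losses as given. The function name `balanced_self_paced_weights` and the parameter `per_group` are hypothetical, introduced only for illustration.

```python
def balanced_self_paced_weights(losses, groups, per_group):
    """Return a 0/1 weight per sample (illustrative assumption, not the paper's solver).

    Within each group, the `per_group` samples with the smallest loss get
    weight 1, so every group contributes the same number of samples whenever
    it has at least `per_group` members.
    """
    weights = [0] * len(losses)

    # Collect sample indices by group label.
    by_group = {}
    for idx, g in enumerate(groups):
        by_group.setdefault(g, []).append(idx)

    # Keep the easiest (lowest-loss) samples of each group.
    for idxs in by_group.values():
        easiest = sorted(idxs, key=lambda i: losses[i])[:per_group]
        for i in easiest:
            weights[i] = 1
    return weights


# Toy data: group 0 is the majority (6 samples), group 1 the minority (2 samples).
losses = [0.1, 0.9, 0.3, 0.2, 0.8, 0.4, 0.05, 0.7]
groups = [0, 0, 0, 0, 0, 0, 1, 1]
weights = balanced_self_paced_weights(losses, groups, per_group=2)
print(weights)
```

In this toy example both groups end up with two selected samples, so the majority group no longer dominates the weighted objective. In an iterative scheme of the kind the abstract describes, the losses and group assignments would be recomputed after each clustering update.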