Hierarchical multilabel classification by exploiting label correlations

2021 
Hierarchical multilabel classification (HMC) aims to classify complex data, such as text with multiple topics and images with multiple semantics, in which the labels are organized in hierarchical structures such as trees and directed acyclic graphs (DAGs). To reduce computational complexity, HMC methods generally assume that the class labels of different branches in a hierarchical structure are conditionally independent. However, these class-independent HMC methods neglect the correlations between labels, which degrades classification precision. To tackle this problem, we propose a hierarchical multilabel classification method with class label correlation (HMC-CLC), which exploits the label correlations of different branches to improve the discrimination of HMC. Specifically, in the training stage, for each label in the hierarchy, we use feature incremental learning to encode the labels of different branches into the input space. The label correlations of different branches are then reflected by the weights of the classification model in the corresponding dimensions. In the test stage, because different samples have different label distributions, we propose a greedy label selection method that dynamically decides, for each label, the correlated labels of different branches. Consequently, for the same label in the hierarchy, the correlated labels can differ from sample to sample. Experimental results on a number of real-world data sets show that the proposed method outperforms state-of-the-art HMC methods.
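The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the classifier or the selection criterion, so the sketch assumes a logistic model for a single target label, hand-set (not learned) weights, and a greedy rule that adds an other-branch label only while it increases the prediction confidence |p - 0.5|. All names (`augment`, `greedy_select`, `WEIGHTS`, `BIAS`) are hypothetical.

```python
import math

# Hypothetical, hand-set parameters for ONE target label (not learned here).
# The first weight belongs to the original feature; the remaining three belong
# to the appended dimensions that encode labels from other branches.
WEIGHTS = [1.0, 2.0, -1.5, 0.0]
BIAS = 0.0


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def augment(features, other_labels, selected):
    """Append other-branch label values to the feature vector.

    Labels not in `selected` are masked to 0 so the input size stays fixed.
    """
    masked = [v if i in selected else 0.0 for i, v in enumerate(other_labels)]
    return features + masked


def predict(z):
    """Probability of the target label for an augmented input z."""
    return sigmoid(sum(w * v for w, v in zip(WEIGHTS, z)) + BIAS)


def confidence(features, other_labels, selected):
    """Assumed selection criterion: distance of the prediction from 0.5."""
    return abs(predict(augment(features, other_labels, selected)) - 0.5)


def greedy_select(features, other_labels):
    """Greedily pick, per sample, the other-branch labels that raise confidence."""
    selected = set()
    best = confidence(features, other_labels, selected)
    while True:
        candidates = [i for i in range(len(other_labels)) if i not in selected]
        if not candidates:
            break
        score, idx = max(
            (confidence(features, other_labels, selected | {i}), i)
            for i in candidates
        )
        if score <= best:  # no candidate improves confidence: stop
            break
        selected.add(idx)
        best = score
    return selected


# Two samples with the same feature value but different other-branch labels.
sample_a = ([0.2], [1.0, 0.0, 1.0])
sample_b = ([0.2], [0.0, 1.0, 1.0])
print(sorted(greedy_select(*sample_a)))  # [0]
print(sorted(greedy_select(*sample_b)))  # [1]
```

In this toy setting the same target label ends up with different correlated labels for the two samples, which mirrors the per-sample behaviour described above. The third other-branch label is never selected because its assumed weight is zero, i.e. the model treats it as uncorrelated with the target.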