Semi-supervised Learning with Label Proportion

2021 
Label scarcity is a common and significant challenge in traditional supervised learning. Semi-supervised learning (SSL) leverages unlabeled samples to compensate for the lack of label information. Like annotations, label proportion is another type of prior information and plays a significant role in classification tasks. Label proportions are also easier to obtain than labels. For example, only a small number of patients in a hospital database have a confirmed positive or negative cancer diagnosis, whereas the proportion of patients with cancer can generally be estimated from historical records. How to incorporate such prior information about label proportion is crucial but rarely studied in the literature. Traditional SSL methods often ignore this prior information and consequently suffer performance degradation. To solve this problem, we propose a novel method, SSL with Label Proportion (SSLLP). Our approach preserves label consistency and label proportion by imposing cardinality bound constraints. The resulting problem is equivalent to a mixed-integer constrained submodular minimization, which is difficult to solve directly. We therefore transform the original problem into a convex one via the Lovász extension and design an efficient algorithm to solve it. Extensive experiments show that our method improves on several state-of-the-art methods.
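To make the convex relaxation step more concrete, the following is a minimal Python sketch of the Lovász extension of a submodular set function. It is not the SSLLP algorithm: the abstract does not specify the objective, so the graph-cut function `cut_value`, the weight matrix `W`, and the helper name `lovasz_extension` are all assumptions chosen for illustration. The sketch only shows how a set function defined on subsets is extended to real-valued vectors.

```python
import numpy as np

def cut_value(weights, subset):
    """Assumed example of a submodular set function: the graph cut F(S),
    i.e. the total weight of edges between S and its complement."""
    mask = np.zeros(weights.shape[0], dtype=bool)
    for i in subset:
        mask[i] = True
    return float(weights[mask][:, ~mask].sum())

def lovasz_extension(set_fn, x):
    """Evaluate the Lovasz extension of set_fn at a real vector x.

    Sort the coordinates in decreasing order, grow the set one element
    at a time, and weight each marginal gain by that coordinate of x.
    Assumes set_fn(empty set) == 0.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(-x)          # indices from largest to smallest entry
    chosen = set()
    previous = set_fn(chosen)
    value = 0.0
    for idx in order:
        chosen.add(int(idx))
        current = set_fn(chosen)
        value += x[idx] * (current - previous)
        previous = current
    return value

# Hypothetical symmetric similarity graph on three samples.
W = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 3.0],
              [2.0, 3.0, 0.0]])

def F(subset):
    return cut_value(W, subset)

indicator = np.array([1.0, 0.0, 0.0])   # encodes the set {0}
soft_labels = np.array([0.2, 0.9, 0.5])  # a relaxed, non-binary labeling

value_on_indicator = lovasz_extension(F, indicator)
value_on_soft = lovasz_extension(F, soft_labels)
```

On a 0/1 indicator vector the extension agrees with the original set function, and for a graph cut it reduces to the weighted sum of |x_i - x_j| over edges, which is convex in x. A cardinality bound on the number of positive labels would enter as a constraint on the sum of the entries of x; that part is not shown here.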