Multilabel Deep Visual-Semantic Embedding

2019 
Inspired by the great success of deep convolutional neural networks (CNNs) for single-label visual-semantic embedding, we explore extending these models to multilabel images. We propose a new learning paradigm for multilabel image classification in which labels are ranked according to their relevance to the input image. In contrast to conventional CNN models that learn a latent vector representation (i.e., the image embedding vector), the proposed visual model learns a mapping (i.e., a transformation matrix) from an image in an attempt to differentiate between its relevant and irrelevant labels. Despite its conceptual simplicity, the proposed model achieves state-of-the-art results on three public benchmark datasets.
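
A minimal sketch of the idea described above: a CNN backbone predicts a per-image transformation matrix that is applied to fixed label word embeddings, and a pairwise ranking loss pushes relevant labels above irrelevant ones. All names, the ResNet-50 backbone, the norm-based scoring, and the hinge ranking loss are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class MultilabelVSE(nn.Module):
    """Sketch: a CNN that outputs an image-conditioned transformation matrix.

    Label word embeddings (embed_dim-dim) are projected by the predicted
    matrix into a per-label score; relevant labels should outrank
    irrelevant ones.
    """
    def __init__(self, num_rows, embed_dim):
        super().__init__()
        backbone = models.resnet50(weights=None)  # assumed backbone choice
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()
        self.backbone = backbone
        # Predict a (num_rows x embed_dim) transformation matrix per image.
        self.to_matrix = nn.Linear(feat_dim, num_rows * embed_dim)
        self.num_rows, self.embed_dim = num_rows, embed_dim

    def forward(self, images, label_embeddings):
        # images: (B, 3, H, W); label_embeddings: (L, embed_dim)
        feats = self.backbone(images)                            # (B, feat_dim)
        T = self.to_matrix(feats).view(-1, self.num_rows, self.embed_dim)
        # Transform each label embedding with the per-image matrix, then
        # score it by the norm of its projection (one plausible choice;
        # the paper may score differently).
        proj = torch.einsum("bre,le->blr", T, label_embeddings)  # (B, L, rows)
        return proj.norm(dim=-1)                                 # (B, L) scores

def pairwise_ranking_loss(scores, targets, margin=1.0):
    """Hinge loss pushing each relevant label above every irrelevant one."""
    pos = targets.bool()                                # (B, L) multilabel GT
    # diff[b, i, j] = score of label i minus score of label j
    diff = scores.unsqueeze(2) - scores.unsqueeze(1)    # (B, L, L)
    # Keep only pairs where i is irrelevant and j is relevant.
    pair_mask = (~pos).unsqueeze(2) & pos.unsqueeze(1)
    losses = torch.clamp(margin + diff, min=0.0) * pair_mask
    return losses.sum() / pair_mask.sum().clamp(min=1)
```

At inference time, the (B, L) score matrix is simply sorted per image to produce the label ranking; the design differs from a conventional embedding model in that the network's output is a matrix acting on label vectors rather than a single image vector compared against them.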