Do We Really Reduce Bias for Scene Graph Generation

2021 
For a given image, the corresponding scene graph is a structured representation that benefits high-level tasks. To generate a meaningful and useful scene graph, existing models focus mainly on reducing the bias introduced by the long-tailed distribution of the dataset. However, they overlook the unimodal bias and evaluation bias that stem from the models themselves. In this paper, we construct an unbiased solution called Balanced Label and Vision for Multilabel Classification (BLVMC). BLVMC consists of two modules: a label-vision grounding module (LVGM) and a no graph constraint (NGC) module. Specifically, the LVGM balances the label and vision modalities by introducing visual information into the label branch; this reduces the unimodal bias of previous models and makes them more stable. The NGC module treats Scene Graph Generation (SGG) as a multilabel rather than a multiclass classification task and evaluates models with the corresponding NGC mR@K metric. It allows each subject-object pair to retain multiple predicates, which relieves the evaluation bias. Quantitative and qualitative experiments on the Visual Genome (VG) dataset demonstrate that the proposed BLVMC effectively eliminates both biases and outperforms previous state-of-the-art models.
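To make the multilabel formulation concrete, the sketch below shows one plausible reading of the no graph constraint idea: predicates for a subject-object pair are scored independently with sigmoids (so several predicates may survive per pair), and a mean recall@K is computed over a single ranked list of (pair, predicate) candidates. The function names, data layout, and scoring details here are illustrative assumptions, not the paper's actual implementation.

import numpy as np

def multilabel_predicate_scores(logits):
    # Sigmoid per predicate: each predicate is scored independently,
    # so one subject-object pair can keep several predicates
    # (multilabel), unlike a softmax that forces a single winner.
    return 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))

def ngc_mean_recall_at_k(pair_scores, gt_triplets, k, num_predicates):
    # Illustrative mean recall@K without the graph constraint:
    # all (pair, predicate) scores compete in one ranked list, so a
    # pair may contribute multiple predicted triplets to the top K.
    # pair_scores: {(subj_idx, obj_idx): array of predicate scores}
    # gt_triplets: set of (subj_idx, obj_idx, predicate_idx)
    candidates = []
    for (s, o), scores in pair_scores.items():
        for p, score in enumerate(scores):
            candidates.append((score, (s, o, p)))
    candidates.sort(key=lambda x: -x[0])
    top_k = {triplet for _, triplet in candidates[:k]}

    # Recall is averaged per predicate class, so rare predicates
    # weigh as much as frequent ones (the "mean" in mR@K).
    recalls = []
    for p in range(num_predicates):
        gt_p = {t for t in gt_triplets if t[2] == p}
        if gt_p:
            recalls.append(len(gt_p & top_k) / len(gt_p))
    return float(np.mean(recalls)) if recalls else 0.0

# Toy usage: two pairs, three predicate classes, K = 3.
scores = {(0, 1): multilabel_predicate_scores([2.0, 1.5, -1.0]),
          (1, 2): multilabel_predicate_scores([0.5, -2.0, 1.0])}
gt = {(0, 1, 0), (0, 1, 1), (1, 2, 2)}
print(ngc_mean_recall_at_k(scores, gt, k=3, num_predicates=3))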