Finding Images by Dialoguing with Image

2019 
Image retrieval in complicated scene is a challenging task that requires the comprehensive understanding of an image. In this paper, we propose a scene graph based image retrieval framework that combines the scene graph generation with image retrieval and fine tuning the searching results via a dialogue mechanism. Specifically, we proposed an image retrieval oriented scene graph generation model that takes an image and a text describing the image as inputs. The additional text input is used to control the generated scene graph. It provides information for a newly introduced attributes head to better predict the attributes and helps constructing an adjacency matrix at the same time. Graph Convolutional Network is further used to gather information among nodes for precise relation estimation. Moreover, modification on the scene graph can be done by changing the text. Our proposed approach achieves the state-of-the-art performances in both scene graph based image retrieval and scene graph generation in the Visual Genome dataset.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    73
    References
    0
    Citations
    NaN
    KQI
    []