Attention LSTM for Scene Graph Generation

2021 
A scene graph is a structured representation of an image that consists of the objects detected in the image and the relationships between them. Scene graphs can play a significant role in image understanding. To date, however, research on scene graph generation has met with limited success, and the performance of existing methods remains unsatisfactory. Focusing on how the features of detected objects are fused and how messages pass between the objects, we have developed a novel deep learning network, dubbed the Attention Long Short-term Memory Network (ALNet), that generates scene graphs more effectively. We have designed a simple feature fusion module that combines the spatial features of the detected objects with their visual features and semantics. To represent the connections between all related objects, we introduce a message passing module that transfers the features of the detected objects through the bidirectional ALNet. Through various ablation experiments, we evaluated ALNet and confirmed that the scene graph generation metrics improved significantly.
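As a rough illustration of the two ideas named in the abstract, the sketch below fuses per-object features and then computes an attention-weighted context over the fused vectors. It is a minimal sketch under stated assumptions, not the ALNet implementation: fusion by concatenation, the five-element box encoding, dot-product attention, and all function names (`box_features`, `fuse`, `attend`) are hypothetical choices made for the example, and the bidirectional LSTM itself is omitted.

```python
import math


def box_features(box, img_w, img_h):
    """Encode a pixel box (x1, y1, x2, y2) as normalized corners plus relative area."""
    x1, y1, x2, y2 = box
    rel_w = (x2 - x1) / img_w
    rel_h = (y2 - y1) / img_h
    return [x1 / img_w, y1 / img_h, x2 / img_w, y2 / img_h, rel_w * rel_h]


def fuse(visual, semantic, box, img_w, img_h):
    """Fuse visual, semantic, and spatial features by concatenation.

    Concatenation is an assumption; the abstract does not specify the operator.
    """
    return list(visual) + list(semantic) + box_features(box, img_w, img_h)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, contexts):
    """Dot-product attention: return the weighted average of the context vectors."""
    scores = [sum(q * c for q, c in zip(query, ctx)) for ctx in contexts]
    weights = softmax(scores)
    dim = len(contexts[0])
    return [sum(w * ctx[i] for w, ctx in zip(weights, contexts)) for i in range(dim)]


# Toy usage with made-up 2-d visual and semantic vectors for two objects.
person = fuse([0.2, 0.7], [1.0, 0.0], (10, 20, 110, 220), 200, 400)
horse = fuse([0.9, 0.1], [0.0, 1.0], (60, 120, 190, 380), 200, 400)
context = attend(person, [person, horse])
```

In a full model, fused vectors like `person` and `horse` would be fed in sequence to a recurrent network in both directions, with the attention step supplying context from the other objects; the toy vectors here only show the shapes involved.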