Complementary or Substitutive? A Novel Deep Learning Method to Leverage Text-image Interactions for Multimodal Review Helpfulness Prediction

2022 
With the flourishing of the mobile Internet, multimodal reviews (i.e., reviews with both text and images) are becoming prevalent and play an important role in customers' decision making. However, multimodal review helpfulness prediction (MRHP) is difficult because of the information interactions between text and images. The information in review text (images) can be either complementary or substitutive to the visual (textual) review information. Moreover, the text (images) alone may contribute most of a review's diagnostic value in some cases, whereas the two modalities may be jointly perceived as useful by customers in others. In this study, we conduct MRHP by modeling these text-image interactions. We propose a novel multimodal deep learning method that exploits the complementation and substitution effects between text and images and further coordinates them for MRHP. Empirical evaluation on a large-scale online review dataset shows that our proposed method outperforms the benchmarks, indicating its strong capability to predict the helpfulness of multimodal reviews. Exploratory analysis yields insights into the complementary-substitutive interaction patterns between review text and images.
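
The abstract does not spell out the architecture, but a minimal sketch can illustrate what modeling complementary and substitutive text-image interactions might look like in practice. The Python/PyTorch code below assumes pre-extracted text-token and image-region embeddings, uses cross-modal attention as a stand-in for the complementation effect and a learned gate for the substitution effect, and fuses both signals into a helpfulness score; the class name, dimensions, and fusion choices are illustrative assumptions, not the authors' exact design.

# Hypothetical sketch of a text-image interaction model for MRHP.
# Assumes pre-extracted review-text and review-image embeddings; the
# complementation/substitution modules below are illustrative choices
# (cross-attention and a learned gate), not the paper's exact method.
import torch
import torch.nn as nn


class TextImageHelpfulnessModel(nn.Module):
    def __init__(self, text_dim=768, image_dim=2048, hidden_dim=256):
        super().__init__()
        # Project both modalities into a shared space.
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        self.image_proj = nn.Linear(image_dim, hidden_dim)
        # "Complementation": cross-modal attention lets the text pull in
        # information that only the images carry.
        self.cross_attn = nn.MultiheadAttention(hidden_dim, num_heads=4,
                                                batch_first=True)
        # "Substitution": a gate decides how much one modality can stand
        # in for the other when their content overlaps.
        self.gate = nn.Sequential(nn.Linear(2 * hidden_dim, hidden_dim),
                                  nn.Sigmoid())
        # Coordination: fuse both interaction effects into a helpfulness score.
        self.scorer = nn.Sequential(nn.Linear(2 * hidden_dim, hidden_dim),
                                    nn.ReLU(),
                                    nn.Linear(hidden_dim, 1))

    def forward(self, text_tokens, image_regions):
        # text_tokens:   (batch, n_tokens, text_dim)
        # image_regions: (batch, n_regions, image_dim)
        t = self.text_proj(text_tokens)
        v = self.image_proj(image_regions)

        # Complementary signal: text tokens attend to image regions.
        comp, _ = self.cross_attn(query=t, key=v, value=v)
        comp = comp.mean(dim=1)                      # pool over tokens

        # Substitutive signal: gate the pooled modalities against each other.
        t_pooled, v_pooled = t.mean(dim=1), v.mean(dim=1)
        g = self.gate(torch.cat([t_pooled, v_pooled], dim=-1))
        subst = g * t_pooled + (1.0 - g) * v_pooled

        # Coordinate both effects and predict the helpfulness score.
        return self.scorer(torch.cat([comp, subst], dim=-1)).squeeze(-1)


# Usage with random stand-in features (8 reviews, 32 tokens, 36 image regions).
model = TextImageHelpfulnessModel()
scores = model(torch.randn(8, 32, 768), torch.randn(8, 36, 2048))
print(scores.shape)  # torch.Size([8])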