Quantifying Human-Perceived Answer Utility in Non-factoid Question Answering

Taking a user-centric approach, we study the features that make an answer to a non-factoid question useful in the eyes of the person who asked it. An editorial study, in which participants assess the usefulness of the answers they received to their questions, as well as 12 different aspects associated with those answers, indicates considerable correlation between certain aspects, such as relevance, correctness, and completeness, and the user-perceived usefulness of answers. Moreover, we investigate the effectiveness of some commonly used answer quality measures, such as ROUGE, BLEU, METEOR, and BERTScore, demonstrating that these measures are limited in their ability to capture the aspects of usefulness and have room for improvement. The question answering dataset created in our work has been made publicly available.