How to Evaluate Your Dialogue Models: A Review of Approaches.

Xinmeng Li, Wansen Wu, Long Qin, Quan-Jun Yin

2021

Evaluating the quality of a dialogue system is an understudied problem. The recent evolution of evaluation methods motivated this survey, which seeks an explicit and comprehensive analysis of the existing methods. We are the first to divide the evaluation methods into three classes: automatic evaluation, human-involved evaluation, and user-simulator-based evaluation. Each class is then covered with its main features and related evaluation metrics. Benchmarks suitable for evaluating dialogue techniques are also discussed in detail. Finally, some open issues are pointed out to bring evaluation methods to a new frontier.

Keywords:

Machine learning
Computer science
Artificial intelligence
evaluation methods
quality
Class (computer programming)
