Assessing Reference-Free Peer Evaluation for Machine Translation

Sweta Agrawal, George Foster, Markus Freitag, Colin Cherry

2021

Reference-free evaluation has the potential to make machine translation evaluation substantially more scalable, allowing us to pivot easily to new languages or domains. It has recently been shown that the probabilities given by a large multilingual model can achieve state-of-the-art results when used as a reference-free metric. We experiment with several modifications to this model and demonstrate that, by scaling it up, we can match the performance of BLEU. We also analyze potential weaknesses of the approach and find that it is surprisingly robust and likely to offer reasonable performance across a broad spectrum of domains and system qualities.
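
As a rough illustration of how model probabilities can serve as a reference-free metric, the sketch below scores a translation by its length-normalized log-probability and averages segment scores into a system score. This is only a minimal sketch under stated assumptions: the abstract does not specify the scoring formula, the per-token probabilities are assumed to come from some translation model conditioned on the source sentence, and the names `segment_score` and `system_score` are hypothetical.

```python
import math
from typing import Sequence


def segment_score(token_probs: Sequence[float]) -> float:
    """Mean log-probability of one translation's tokens.

    token_probs holds the probability the model assigned to each output
    token given the source (assumed input; no model is called here).
    Higher (closer to 0) means the model finds the translation more likely.
    """
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    return sum(math.log(p) for p in token_probs) / len(token_probs)


def system_score(segments: Sequence[Sequence[float]]) -> float:
    """Average of segment scores over all translations from one system."""
    if not segments:
        raise ValueError("segments must not be empty")
    return sum(segment_score(s) for s in segments) / len(segments)


if __name__ == "__main__":
    # Made-up token probabilities for two hypothetical systems.
    system_a = [[0.9, 0.8, 0.95], [0.7, 0.85]]
    system_b = [[0.4, 0.3, 0.5], [0.2, 0.6]]
    print(system_score(system_a) > system_score(system_b))
```

No reference translation appears anywhere in the computation, which is what makes a metric of this kind easy to move to new languages or domains.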

Keywords:

BLEU
reference-free
State (computer science)
Theoretical computer science
Machine translation
Scalability
Computer science
Metric (mathematics)
Scaling
peer evaluation
