To facilitate effective translation modeling and translation studies, one crucial question to address is how to assess translation quality. In terms of accuracy, reliability, repeatability and cost, translation quality assessment (TQA) is itself a rich and challenging task. In this work, we present a concise, high-level survey of TQA methods, covering both manual judgement criteria and automated evaluation metrics, which we classify into more detailed sub-categories. We hope that this work will be an asset for both translation model researchers and quality assessment researchers. In addition, we hope that it will enable practitioners to quickly develop a better understanding of the conventional TQA field and to find evaluation solutions closely relevant to their own needs. This work may also inspire further development of quality assessment and evaluation methodologies for other natural language processing (NLP) tasks in addition to machine translation (MT), such as automatic text summarization (ATS), natural language understanding (NLU) and natural language generation (NLG).