We present FRMT, a new dataset and evaluation benchmark for Few-shot Region-aware Machine Translation, a type of style-targeted translation. The dataset consists of professional translations from English into two regional variants each of Portuguese and Mandarin Chinese. Source documents are selected to enable detailed analysis of phenomena of interest, including lexically distinct terms and distractor terms. We explore automatic evaluation metrics for FRMT and validate their correlation with expert human evaluation across both region-matched and mismatched rating scenarios. Finally, we present a number of baseline models for this task, and offer guidelines for how researchers can train, evaluate, and compare their own models. Our dataset and evaluation code are publicly available: https://bit.ly/frmt-task