We present the first real-world application of methods for improving neural machine translation (NMT) with human reinforcement, based on explicit and implicit user feedback collected on the eBay e-commerce platform. Previous work has been confined to simulation experiments, whereas in this paper we work with real logged feedback for offline bandit learning of NMT parameters. We conduct a thorough analysis of the available explicit user judgments---five-star ratings of translation quality---and show that they are not reliable enough to yield significant improvements in bandit learning. In contrast, we successfully utilize implicit task-based feedback collected in a cross-lingual search task to improve task-specific and machine translation quality metrics.