Machine translation (MT) requires a wide range of linguistic capabilities, which current end-to-end models are expected to learn implicitly by observing aligned sentences in bilingual corpora. In this work, we ask: \emph{How well do MT models learn coreference resolution from implicit signal?} To answer this question, we develop an evaluation methodology that derives coreference clusters from MT output and evaluates them without requiring annotations in the target language. We further evaluate several prominent open-source and commercial MT systems, translating from English to six target languages, and compare them to state-of-the-art coreference resolvers on three challenging benchmarks. Our results show that the monolingual resolvers greatly outperform MT models. Motivated by this result, we experiment with different methods for incorporating the output of coreference resolution models in MT, showing improvement over strong baselines.