Current approaches to incorporating terminology constraints in machine translation (MT) typically assume that the constraint terms are provided in their correct morphological forms. This limits their applicability in real-world scenarios, where constraint terms are provided as lemmas. In this paper, we introduce a modular framework for incorporating lemma constraints in neural MT (NMT), in which linguistic knowledge and diverse types of NMT models can be flexibly applied. The framework is built around a novel cross-lingual inflection module that inflects the target lemma constraints based on the source context. We explore linguistically motivated rule-based and data-driven neural inflection modules, and we design English-German health and English-Lithuanian news test suites to evaluate them in domain adaptation and low-resource MT settings. Results show that our rule-based inflection module helps NMT models incorporate lemma constraints more accurately than a neural module, and that it outperforms the existing end-to-end approach at a lower training cost.
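The modular idea described above (inflect the target lemma from the source context first, then hand the inflected form to any constraint-aware NMT model) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's implementation: the names `LemmaConstraint`, `TOY_RULES`, `guess_number`, `inflect_constraints`, and `translate_with_constraints` are hypothetical, the rule table covers a single German noun, and only grammatical number is modelled (a real German module would also need case, among other features).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class LemmaConstraint:
    """Hypothetical container for one terminology constraint."""
    source_term: str   # lemma of the term on the source side
    target_lemma: str  # dictionary form of the required target term


# Hypothetical toy rule table: (target lemma, number) -> inflected form.
# Only number is modelled here; this is an illustrative simplification.
TOY_RULES: Dict[Tuple[str, str], str] = {
    ("Impfstoff", "sg"): "Impfstoff",
    ("Impfstoff", "pl"): "Impfstoffe",
}


def guess_number(source_sentence: str, source_term: str) -> str:
    """Crude source-side cue: the term followed by 's' is treated as plural."""
    term = source_term.lower()
    for token in source_sentence.lower().replace(".", " ").split():
        if token == term + "s":
            return "pl"
        if token == term:
            return "sg"
    return "sg"  # fall back to singular when the term is not found


def inflect_constraints(source_sentence: str,
                        constraints: List[LemmaConstraint]) -> List[str]:
    """Cross-lingual inflection step: choose a target form from source context."""
    inflected = []
    for c in constraints:
        number = guess_number(source_sentence, c.source_term)
        # Fall back to the bare lemma when no rule applies.
        inflected.append(TOY_RULES.get((c.target_lemma, number), c.target_lemma))
    return inflected


def translate_with_constraints(
    source_sentence: str,
    constraints: List[LemmaConstraint],
    constrained_mt: Callable[[str, List[str]], str],
) -> str:
    """Modular pipeline: inflect first, then call any constraint-aware MT system."""
    forms = inflect_constraints(source_sentence, constraints)
    return constrained_mt(source_sentence, forms)
```

Because the MT system is passed in as a plain callable, the inflection step can be swapped (rule-based or neural) without touching the translation model, which is the kind of modularity the abstract describes.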