Recognizing that even correct translations are not always semantically equivalent, we automatically detect meaning divergences in parallel sentence pairs using a deep neural model of bilingual semantic similarity that can be trained on any parallel corpus without manual annotation. We show that our semantic model detects divergences more accurately than models based on surface features derived from word alignments, and that these divergences matter for neural machine translation.