Neural machine translation (NMT) is a challenging task due to the inherent complexity and fluidity of natural languages. Nonetheless, in recent years it has achieved state-of-the-art performance on several language pairs. Although multilingual neural machine translation (MNMT) has gained considerable traction in recent years, no comprehensive survey has been done to identify which approaches work well. The goal of this paper is to investigate the realm of low-resource languages and build a neural machine translation model that achieves state-of-the-art results. The paper builds upon the mBART language model and explores strategies to augment it with NLP and deep learning techniques such as back-translation and transfer learning. This implementation unpacks the architecture of the NMT application and identifies the components that offer opportunities to adapt the application to the low-resource language problem space.