Active Learning for Neural Machine Translation

Machine translation automatically translates text between natural languages, and Neural Machine Translation (NMT) has gained attention for its context-aware analysis and fluent, accurate output. However, low-resource languages, which lack training resources such as supervised data, remain a challenge for Natural Language Processing (NLP). We combine a technique known as active learning with the NMT toolkit Joey NMT to reach sufficient accuracy and robust predictions in low-resource language translation. In active learning, a semi-supervised machine learning strategy, the training algorithm uses a chosen query technique to determine which unlabeled data would be most beneficial to label. We implemented two model-driven acquisition functions for selecting the samples to be validated. This work compares four transformer-based NMT systems for English-to-Hindi translation: a baseline model (BM), a fully trained model (FTM), an active learning least-confidence-based model (ALLCM), and an active learning margin-sampling-based model (ALMSM). System output is evaluated with the Bilingual Evaluation Understudy (BLEU) metric. The BLEU scores of the BM, FTM, ALLCM, and ALMSM systems are 16.26, 22.56, 24.54, and 24.20, respectively. These findings demonstrate that active learning techniques help the model converge early and improve the overall quality of the translation system.
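The two acquisition functions named in the abstract, least confidence and margin sampling, can be illustrated with a minimal sketch. The abstract does not give the exact scoring formulas, so everything below is an assumption: a hypothesis probability is taken as the product of its token probabilities (no length normalization), and the pool format and the names `sequence_prob`, `least_confidence_score`, `margin_score`, and `select_queries` are hypothetical and not part of Joey NMT.

```python
import math


def sequence_prob(token_probs):
    """Probability of a hypothesis as the product of its token probabilities.

    Assumption: no length normalization is applied.
    """
    return math.exp(sum(math.log(p) for p in token_probs))


def least_confidence_score(best_token_probs):
    """Higher score = the model is less confident in its best hypothesis."""
    return 1.0 - sequence_prob(best_token_probs)


def margin_score(best_token_probs, second_token_probs):
    """Gap between the two most likely hypotheses; a small gap means uncertainty."""
    return sequence_prob(best_token_probs) - sequence_prob(second_token_probs)


def select_queries(pool, k, strategy="least_confidence"):
    """Pick the k unlabeled source sentences the model is most uncertain about.

    `pool` is a hypothetical list of dicts holding the source sentence and the
    token probabilities of the top-2 hypotheses produced by the current model.
    """
    if strategy == "least_confidence":
        ranked = sorted(
            pool, key=lambda s: least_confidence_score(s["best"]), reverse=True
        )
    elif strategy == "margin":
        ranked = sorted(pool, key=lambda s: margin_score(s["best"], s["second"]))
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [s["src"] for s in ranked[:k]]


# Toy pool with made-up probabilities, for illustration only.
pool = [
    {"src": "sentence A", "best": [0.9, 0.9], "second": [0.5, 0.2]},
    {"src": "sentence B", "best": [0.6, 0.5], "second": [0.6, 0.4]},
    {"src": "sentence C", "best": [0.5, 0.4], "second": [0.2, 0.1]},
]

lc_pick = select_queries(pool, 1, "least_confidence")
margin_pick = select_queries(pool, 1, "margin")
```

On this toy pool the two strategies choose different sentences: least confidence picks the sentence whose best hypothesis has the lowest probability, while margin sampling picks the one whose top two hypotheses are closest. In an active learning loop, the selected sentences would be sent for labeling and added to the training set before the next round.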


