Machine reading comprehension (MRC) is a challenging task in natural language processing that requires computers to understand natural language texts and answer questions based on those texts. Many techniques exist for solving this problem, and word representation is a very important one, having the greatest impact on the accuracy of MRC in popular languages such as English and Chinese. However, few MRC studies have been conducted for low-resource languages such as Vietnamese. In this paper, we conduct several experiments on neural network-based models to understand the impact of word representation on Vietnamese multiple-choice machine reading comprehension. Our experiments include using the Co-match model with six different Vietnamese word embeddings and the BERT model for multiple-choice reading comprehension. On the ViMMRC corpus, the BERT model achieves an accuracy of 61.28% on the test set.