idT5: Indonesian Version of Multilingual T5 Transformer

The Indonesian language is spoken by almost 200 million people and is the 10th most spoken language in the world, yet it is under-represented in Natural Language Processing (NLP) research. A scarcity of language resources has hampered previous work on Indonesian. The Transformer is a new architecture that is rapidly becoming dominant in NLP, surpassing alternatives such as convolutional and recurrent neural networks. T5 (Text-to-Text Transfer Transformer) is a Transformer model for English that converts all text-based language problems to a text-to-text format. Its multilingual variant, mT5 (multilingual T5), has shown promising results on many NLP tasks across languages. However, the size of this multilingual model is a drawback in real production applications, which sometimes require only one language. In this study, the mT5 model was adapted to a single language, Indonesian, resulting in a smaller pre-trained T5 model specific to Indonesian. For performance comparison, we fine-tuned this model and the mT5 model on the Sentiment Analysis (SA), Question Generation (QG), and Question Answering (QA) tasks using the same mechanism and dataset. The fine-tuned model based on our model achieved 77.18% accuracy on SA, 8% higher than the mT5-based model, and obtained nearly the same score as the mT5-based model on QG and QA. The results confirm that it is possible to produce a smaller pre-trained model that maintains comparable performance while reducing the model size by up to 58%. In addition, the resulting model requires less memory, loads faster, and has faster inference times.
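The abstract does not say how the multilingual model was reduced to a single language. One common way to shrink a multilingual model, assumed here purely for illustration, is vocabulary pruning: keep only the embedding rows for tokens that occur in a monolingual corpus and remap the token ids. The sketch below uses toy data and plain Python lists; the function names (`select_vocab`, `prune_embeddings`, `size_reduction`) and all numbers are hypothetical and are not taken from the paper.

```python
from collections import Counter


def select_vocab(corpus_token_ids, special_ids, min_count=1):
    """Return the sorted old token ids to keep.

    Keeps every special token plus every id that occurs at least
    `min_count` times in the (hypothetical) monolingual corpus.
    """
    counts = Counter(tid for sent in corpus_token_ids for tid in sent)
    kept = set(special_ids)
    kept.update(tid for tid, c in counts.items() if c >= min_count)
    return sorted(kept)


def prune_embeddings(embeddings, kept_ids):
    """Slice the embedding table and build an old-id -> new-id mapping."""
    new_table = [embeddings[i] for i in kept_ids]
    id_map = {old: new for new, old in enumerate(kept_ids)}
    return new_table, id_map


def size_reduction(old_rows, new_rows, dim, other_params):
    """Fraction of parameters removed when only the embedding table shrinks."""
    old_total = old_rows * dim + other_params
    new_total = new_rows * dim + other_params
    return 1 - new_total / old_total


# Toy example: a 10-token vocabulary with 4-dimensional embeddings.
embeddings = [[float(i)] * 4 for i in range(10)]
corpus = [[3, 5, 5], [7, 3]]          # token ids from an assumed tokenizer
kept = select_vocab(corpus, special_ids=[0, 1])
table, id_map = prune_embeddings(embeddings, kept)

# Re-encode the corpus with the new, smaller id space.
remapped = [[id_map[t] for t in sent] for sent in corpus]
reduction = size_reduction(len(embeddings), len(table), dim=4, other_params=20)
```

In a sketch like this, the saving depends on how much of the total parameter count sits in the embedding table, which is why a model with a very large multilingual vocabulary can shrink substantially while its Transformer layers stay unchanged.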

