For many tasks, state-of-the-art results have been achieved with Transformer-based architectures, resulting in a paradigm shift in practice from the use of task-specific architectures to the fine-tuning of pre-trained language models. The ongoing trend consists in training models with an ever-increasing number of parameters on an ever-increasing amount of data, which requires considerable resources. This has led to an intense search for resource efficiency based on algorithmic and hardware improvements, which are evaluated only for English. It raises questions about their usability for small-scale learning problems, where only a limited amount of training data is available, especially for tasks in under-resourced languages. The lack of appropriately sized corpora hinders the application of data-driven and transfer learning-based approaches, and leads to strong cases of instability. In this paper, we survey the state of the art of efforts dedicated to the usability of Transformer-based models, and propose to evaluate these improvements on question-answering performance for French, a language with few resources. We address the instability related to data scarcity by investigating various training strategies based on data augmentation, hyperparameter optimization and cross-lingual transfer. We also introduce a new compact model for French, FrALBERT, which proves competitive in low-resource settings.
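As a minimal sketch of the evaluation setting described above, the snippet below shows extractive question answering in French with a compact pre-trained model via the HuggingFace transformers library. The checkpoint identifier "qwant/fralbert-base", the example passage, and the question are assumptions for illustration, not artifacts described in this paper; in practice the model would first be fine-tuned on a French QA corpus before the span prediction is meaningful.

```python
# A minimal sketch, assuming a FrALBERT checkpoint published on the
# HuggingFace hub under "qwant/fralbert-base" (an assumption, not a
# detail stated in this abstract). The example question/context pair
# is hypothetical.
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

model_name = "qwant/fralbert-base"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

question = "Quelle est la capitale de la France ?"
context = "Paris est la capitale de la France."

# Encode the (question, context) pair; the QA head predicts the start
# and end positions of the answer span within the context tokens.
inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

start = outputs.start_logits.argmax(-1).item()
end = outputs.end_logits.argmax(-1).item()
answer = tokenizer.decode(inputs["input_ids"][0][start : end + 1])
print(answer)
```

During fine-tuning, the gold answer's start and end token positions would be passed to the model so that the span-boundary classifiers are trained jointly, which is the standard extractive QA recipe this sketch assumes.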