In NLP, reusing pre-trained models instead of training from scratch has gained popularity; however, NLP models are mostly black boxes, very large, and often demand significant computational resources. To ease this, models trained on large corpora are made available, and developers reuse them for different problems. In contrast, for traditional DL problems, developers mostly build their models from scratch, which gives them control over the choice of algorithms, data processing, model structure, hyperparameter tuning, etc. In NLP, because pre-trained models are reused, developers have little to no control over such design decisions; instead, they apply fine-tuning or transfer learning to pre-trained models to meet their requirements. Moreover, NLP models and their corresponding datasets are significantly larger than traditional DL models and require heavy computation. These factors often lead to bugs when pre-trained models are reused. While bugs in traditional DL software have been studied intensively, the extensive reuse and black-box structure of NLP models motivate the following questions: What types of bugs occur while reusing NLP models? What are the root causes of those bugs? How do these bugs affect the system? To answer these questions, we studied the bugs reported while reusing 11 popular NLP models. We mined 9,214 issues from their GitHub repositories and identified 984 bugs, from which we created a taxonomy of bug types, root causes, and impacts. Our analysis led to several findings, including that limited access to model internals results in a lack of robustness, that missing input validation lets algorithmic and data bias propagate, and that high resource consumption causes more crashes. The bug patterns we observed should facilitate further efforts to reduce bugs in pre-trained model and code reuse.
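As a concrete illustration of the reuse workflow described above, the minimal sketch below fine-tunes a pre-trained model for a downstream classification task while keeping the reused encoder frozen, showing how the developer controls only the new task head and a few hyperparameters. It assumes the Hugging Face transformers library and PyTorch; the checkpoint name, label count, and example input are illustrative and not taken from the study.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "bert-base-uncased"  # illustrative choice of pre-trained model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# The pre-trained encoder is reused as-is: its architecture, vocabulary, and
# weights are fixed upstream, so only the new classification head and the
# fine-tuning hyperparameters remain under the developer's control.
for param in model.base_model.parameters():
    param.requires_grad = False  # freeze the reused encoder (transfer learning)

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=5e-5
)

# One illustrative fine-tuning step on a toy labeled example.
inputs = tokenizer("The model crashed on long inputs.",
                   return_tensors="pt", truncation=True, padding=True)
labels = torch.tensor([1])
outputs = model(**inputs, labels=labels)
outputs.loss.backward()
optimizer.step()
```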