Recent literature has seen growing interest in using black-box strategies, such as CheckList, for testing the behavior of NLP models. Research on white-box testing has developed a number of methods for evaluating how thoroughly the internal behavior of deep models is tested, but these are not applicable to NLP models. We propose a set of white-box testing methods that are customized for transformer-based NLP models. These include Mask Neuron Coverage (MNCOVER), which measures how thoroughly the attention layers in a model are exercised during testing. We show that MNCOVER can refine test suites generated by CheckList, substantially reducing their size, by more than 60\% on average, while retaining failing tests, thereby concentrating the fault-detection power of the test suite. Further, we show how MNCOVER can be used to guide CheckList input generation, evaluate alternative NLP testing methods, and drive data augmentation to improve accuracy.
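To make the attention-coverage idea concrete, here is a minimal sketch of a threshold-based coverage metric over transformer attention. This is a simplified illustration, not the paper's exact MNCOVER computation: the model name, the 0.1 mask threshold, and the choice of (layer, head, query-position) coverage units are all assumptions made for the example.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Hypothetical, simplified coverage sketch: treat each (layer, head,
# query-position) triple as a coverage unit, and count it as exercised
# when some attention weight at that position survives a mask threshold.
def attention_mask_coverage(texts, model_name="bert-base-uncased",
                            threshold=0.1, max_len=64):
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name, output_attentions=True)
    model.eval()

    covered = None  # boolean tensor of shape (layers, heads, max_len)
    for text in texts:
        enc = tokenizer(text, return_tensors="pt",
                        truncation=True, max_length=max_len)
        with torch.no_grad():
            atts = model(**enc).attentions  # per layer: (1, heads, seq, seq)
        # A query position is "active" in a head if any of its attention
        # weights exceeds the mask threshold for this input.
        act = torch.stack([(a > threshold).any(-1).squeeze(0) for a in atts])
        pad = torch.zeros(act.shape[0], act.shape[1], max_len, dtype=torch.bool)
        pad[:, :, :act.shape[-1]] = act
        covered = pad if covered is None else covered | pad
    return covered.float().mean().item()  # fraction of units exercised

# Usage: coverage should grow as the test suite becomes more diverse,
# which is what makes such a metric usable for suite reduction.
print(attention_mask_coverage(["The movie was great.",
                               "I did not enjoy the plot at all."]))
```

Under this kind of metric, a test that activates no previously uncovered unit adds no coverage, which is the intuition behind pruning a CheckList suite while retaining its failing tests.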