With the increasing release of powerful language models trained on large code corpora (e.g., CodeBERT was trained on 6.4 million programs), a new family of mutation testing tools has arisen that promises to generate more "natural" mutants, in the sense that the mutated code aims to follow the implicit rules and coding conventions that programmers typically follow. In this paper, we study to what extent the mutants produced by language models can semantically mimic the observable behavior of security-related vulnerabilities (a.k.a. Vulnerability-mimicking Mutants), so that designing test cases that fail on these mutants helps in tackling the mimicked vulnerabilities. Since analyzing and running mutants is computationally expensive, it is important to prioritize the mutants that are more likely to be vulnerability-mimicking before any analysis or test execution. With this in mind, we introduce VMMS, a machine learning based approach that automatically extracts features from mutants and predicts the ones that mimic vulnerabilities. We conducted our experiments on a dataset of 45 vulnerabilities and found that 16.6% of the mutants fail one or more tests that are also failed by 88.9% of the respective vulnerabilities. More precisely, 3.9% of the mutants in the entire mutant set are vulnerability-mimicking mutants, and together they mimic 55.6% of the vulnerabilities. Despite this scarcity, VMMS predicts vulnerability-mimicking mutants with 0.63 MCC, 0.80 Precision, and 0.51 Recall, demonstrating that the features of vulnerability-mimicking mutants can be learned automatically by machine learning models and used to predict such mutants statically, without the need to invest effort in manually defining these features.
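For readers less familiar with the reported metrics, the following minimal Python sketch shows how Precision, Recall, and the Matthews correlation coefficient (MCC) are computed from a binary confusion matrix, using the standard textbook formulas. The function names and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not the counts behind the results reported above, and this is not the authors' evaluation code.

```python
import math


def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are true positives."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that were predicted positive."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def mcc(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts (an assumption for illustration only):
    # 1000 mutants, of which 80 are vulnerability-mimicking.
    tp, fp, fn, tn = 40, 10, 40, 910
    print(f"Precision: {precision(tp, fp):.2f}")
    print(f"Recall:    {recall(tp, fn):.2f}")
    print(f"MCC:       {mcc(tp, fp, fn, tn):.2f}")
```

MCC takes all four cells of the confusion matrix into account, which makes it more informative than accuracy when the positive class is rare, as is the case when only a small fraction of mutants are vulnerability-mimicking.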