The performance of many Fault Localisation (FL) techniques depends directly on the quality of the test suites they use. It is therefore extremely useful to measure precisely how much diagnostic power each test case contributes when it is added to a test suite used for FL. Such a measure can help not only to prioritise and select test cases for FL, but also to augment test suites that are too weak to be used with FL techniques. We propose FDG, a new measure of Fault Diagnosability Gain for individual test cases. The design of FDG is based on our analysis of existing metrics that prioritise test cases for better FL. Unlike other metrics, FDG exploits the ongoing FL results to emphasise the parts of the program for which more information is needed. Our evaluation of FDG on Defects4J shows that it can successfully guide the augmentation of test suites for better FL. When given only a few failing test cases (2.3 test cases on average), FDG can effectively augment the given test suite by prioritising test cases generated automatically by EvoSuite: the augmentation improves the acc@1 and acc@10 of the FL results by 11.6x and 2.2x on average, while requiring only ten human judgements on the correctness of the assertions EvoSuite generates.
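The acc@1 and acc@10 figures above are top-n accuracy counts: the number of faults for which at least one faulty program element appears within the top n positions of the FL ranking. The following is a minimal sketch of that computation, not the authors' evaluation code. The function names, the input layout (one suspiciousness-score dictionary and one set of faulty elements per fault), and the pessimistic tie-breaking rule are all assumptions made for illustration; the exact tie-breaking used in the evaluation may differ.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# Hypothetical input layout: one (scores, faulty_elements) pair per fault.
FaultResult = Tuple[Dict[str, float], Set[str]]


def first_fault_rank(scores: Dict[str, float], faulty: Set[str]) -> Optional[int]:
    """Rank of the best-ranked faulty element, or None if none was scored.

    Assumption: ties are broken pessimistically, so an element shares the
    worst rank among all elements with the same score.
    """
    faulty_scores = [scores[e] for e in faulty if e in scores]
    if not faulty_scores:
        return None
    best = max(faulty_scores)
    # Every element scoring at least `best` is ranked at or above it.
    return sum(1 for s in scores.values() if s >= best)


def acc_at_n(results: Iterable[FaultResult], n: int) -> int:
    """Count the faults localised within the top n ranked elements."""
    count = 0
    for scores, faulty in results:
        rank = first_fault_rank(scores, faulty)
        if rank is not None and rank <= n:
            count += 1
    return count


# Toy data for illustration only; these are not results from the paper.
toy_results = [
    ({"a": 0.9, "b": 0.5, "c": 0.1}, {"a"}),  # faulty element ranked 1st
    ({"a": 0.9, "b": 0.5, "c": 0.1}, {"c"}),  # faulty element ranked 3rd
    ({"a": 0.7, "b": 0.7}, {"b"}),            # tie, pessimistic rank 2
]
```

Under this definition, an improvement such as "11.6x in acc@1" compares the acc@1 count obtained with the augmented test suite against the count obtained with the initial failing tests alone.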