Automated debugging techniques, such as Fault Localisation (FL) and Automated Program Repair (APR), are typically designed under the Single Fault Assumption (SFA). In practice, however, an unknown number of faults can independently cause multiple test case failures, which makes it difficult both to allocate debugging resources and to apply automated debugging techniques. Clustering algorithms have been applied to group test failures according to their root causes, but their accuracy is often limited by the distance metrics used to compare test cases. We introduce a new test distance metric based on hypergraphs and evaluate its accuracy using multi-fault benchmarks that we have built on top of Defects4J and SIR. Results show that our technique, Hybiscus, can automatically achieve perfect clustering (i.e., the number of clusters equals the ground-truth number of root causes, and all failing tests with the same root cause are grouped together) for 418 out of 605 test runs with multiple test failures. Better failure clustering also allows us to separate different root causes and apply FL techniques under SFA, saving up to 82% of the total wasted effort when compared to the state-of-the-art technique for multiple fault localisation.
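The notion of perfect clustering used above can be made concrete with a short check. The following is a minimal sketch, not part of Hybiscus: the function name `is_perfect_clustering` and the dictionary-based inputs are hypothetical, and it assumes each failing test has exactly one ground-truth root cause.

```python
def is_perfect_clustering(predicted, ground_truth):
    """Check the 'perfect clustering' criterion described in the abstract.

    predicted:    dict mapping failing test name -> predicted cluster id
    ground_truth: dict mapping failing test name -> root cause label
    (Both input shapes are assumptions made for this illustration.)
    """
    if predicted.keys() != ground_truth.keys():
        raise ValueError("both mappings must cover the same failing tests")

    # Condition 1: as many clusters as ground-truth root causes.
    if len(set(predicted.values())) != len(set(ground_truth.values())):
        return False

    # Condition 2: all tests sharing a root cause fall into one cluster.
    cluster_of_cause = {}
    for test, cause in ground_truth.items():
        cluster = predicted[test]
        if cluster_of_cause.setdefault(cause, cluster) != cluster:
            return False
    return True


if __name__ == "__main__":
    truth = {"t1": "fault_A", "t2": "fault_A", "t3": "fault_B"}
    print(is_perfect_clustering({"t1": 0, "t2": 0, "t3": 1}, truth))
    print(is_perfect_clustering({"t1": 0, "t2": 1, "t3": 1}, truth))
```

Under the single-root-cause assumption, the two conditions together imply that the predicted partition coincides with the ground-truth partition, so a test run counts as perfectly clustered only when no root cause is split across clusters and no two root causes are merged.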