Surrogate-assisted search-based testing (SA-SBT) aims to reduce the computational time for testing compute-intensive systems. Surrogates enhance testing techniques by improving test case generation, focusing the testing budget on the most critical portions of the input domain. In addition, they can serve as approximations of the system under test (SUT) to predict test results instead of executing the tests on compute-intensive SUTs. This article reflects on existing SA-SBT techniques, particularly those applied to system-level testing, which is often facilitated by simulators or complex test beds. Our objective is to synthesize the different heuristic algorithms and evaluation methods employed in existing SA-SBT techniques and to present a comprehensive view of SA-SBT solutions. In addition, by critically reviewing our previous work on SA-SBT, we aim to identify the limitations of our proposed algorithms and evaluation methods and to propose potential improvements. We present a taxonomy that categorizes and contrasts existing SA-SBT solutions and highlights key research gaps. To identify the evaluation challenges, we conduct two replication studies of our past SA-SBT solutions: one study uses an industrial advanced driver assistance system (ADAS), and the other relies on a Simulink model benchmark. We compare our results with those of the original studies and identify the difficulties in evaluating SA-SBT techniques, including the impact of different contextual factors on the generalizability of results and the validity of our evaluation metrics. Based on our taxonomy and replication studies, we propose future research directions, including reconsidering the evaluation metrics currently used for SA-SBT solutions, using surrogates for fault localization and repair in addition to testing, and creating frameworks for large-scale experiments that apply SA-SBT to multiple SUTs and simulators.