Machine learning (ML) may enable the automated generation of test oracles. We have characterized emerging research in this area through a systematic literature review examining oracle types, researcher goals, the ML techniques applied, how the generation process was assessed, and the open research challenges in the field. Based on a sample of 22 relevant studies, we observed that ML algorithms generated test verdict, metamorphic relation, and, most commonly, expected output oracles. Almost all studies employ a supervised or semi-supervised approach (including neural networks, support vector machines, adaptive boosting, and decision trees) trained on labeled system executions or code metadata. Oracles are evaluated using the mutation score, correct classifications, accuracy, and ROC. Work to date shows great promise, but there are significant open challenges regarding the requirements imposed on training data, the complexity of modeled functions, the ML algorithms employed and how they are applied, the benchmarks used by researchers, and the replicability of the studies. We hope that our findings will serve as a roadmap and inspiration for researchers in this field.