Context: Machine learning (ML) may enable effective automated test generation. Objective: We characterize emerging research, examining testing practices, researcher goals, ML techniques applied, evaluation, and challenges. Methods: We perform a systematic mapping on a sample of 124 publications. Results: ML generates input for system, GUI, unit, performance, and combinatorial testing, or improves the performance of existing generation methods. ML is also used to generate test verdict, property-based, and expected output oracles. Supervised learning, often based on neural networks, and reinforcement learning, often based on Q-learning, are common, and some publications also employ unsupervised or semi-supervised learning. (Semi-/un-)supervised approaches are evaluated using both traditional testing metrics and ML-related metrics (e.g., accuracy), while reinforcement learning is often evaluated using testing metrics tied to the reward function. Conclusion: Work to date shows great promise, but open challenges remain regarding training data, retraining, scalability, evaluation complexity, the ML algorithms employed and how they are applied, benchmarks, and replicability. Our findings can serve as a roadmap and inspiration for researchers in this field.
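To make the reinforcement learning setup concrete, the sketch below shows a minimal, hypothetical Q-learning loop in which candidate test inputs are the actions and a testing metric (here, newly covered branches) serves directly as the reward. The toy system under test and all names are illustrative assumptions, not drawn from any surveyed publication.

```python
import random
from collections import defaultdict

# Hypothetical toy system under test: states are abstract app states, actions
# are test inputs, and the reward is 1 for each newly covered branch.
class ToySUT:
    def __init__(self):
        self.state = 0
        self.covered = set()

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Deterministic toy transition; a real SUT would execute the
        # generated input and observe the resulting state.
        next_state = (self.state * 3 + action) % 8
        branch = (self.state, action)
        reward = 0 if branch in self.covered else 1  # reward = coverage gain
        self.covered.add(branch)
        self.state = next_state
        return next_state, reward

ACTIONS = [0, 1, 2]
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2

q = defaultdict(float)  # Q-table keyed by (state, action)

def choose_action(state):
    # Epsilon-greedy exploration over candidate test inputs.
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])

env = ToySUT()
for episode in range(200):
    state = env.reset()
    for _ in range(10):  # bounded-length test sequence
        action = choose_action(state)
        next_state, reward = env.step(action)
        # Standard Q-learning update; the testing metric (coverage gain)
        # is the reward signal.
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = next_state

print(f"branches covered: {len(env.covered)}")
```

This mirrors the pattern noted in the Results: when a testing metric such as coverage gain is used as the reward, the same metric naturally serves as the evaluation criterion for the reinforcement learning approach.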