Reinforcement Learning for Test Case Prioritization

Continuous Integration (CI) significantly reduces integration problems, speeds up development, and shortens release time. However, it also introduces new challenges for quality assurance activities, including regression testing, which is the focus of this work. Though various approaches to test case prioritization have been shown to be very promising in the context of regression testing, specific techniques must be designed to deal with the dynamic nature and timing constraints of CI. Recently, Reinforcement Learning (RL) has shown great potential in various challenging scenarios that require continuous adaptation, such as game playing, real-time ad bidding, and recommender systems. Inspired by this line of work and building on initial efforts to support test case prioritization with RL techniques, we perform a comprehensive investigation of RL-based test case prioritization in a CI context. To this end, treating test case prioritization as a ranking problem, we model the sequential interactions between the CI environment and a test case prioritization agent as an RL problem, using three alternative ranking models. We then rely on carefully selected and tailored state-of-the-art RL techniques to automatically and continuously learn a test case prioritization strategy that is as close as possible to the optimal one. Our extensive experimental analysis shows that the best RL solutions provide a significant accuracy improvement over previous RL-based work, with prioritization strategies that come close to optimal, thus paving the way for using RL to prioritize test cases in a CI context.
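To make the ranking formulation concrete, the following is a minimal sketch of a single CI cycle, not the implementation evaluated in this work. It assumes a pointwise ranking model, in which each test case is scored independently and the scores are sorted, and it uses a fixed heuristic in place of a learned agent, so no RL training takes place. The names `CandidateTest`, `pointwise_prioritize`, `rank_accuracy`, and `heuristic_score`, the two test features, and the squared-displacement reward are all illustrative assumptions rather than definitions taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class CandidateTest:
    """One test case in a CI cycle (hypothetical feature set)."""
    name: str
    failed_last_cycle: bool  # history feature visible to the agent
    duration: float          # execution time in seconds
    fails_now: bool          # ground truth, known only after execution


def optimal_order(tests: List[CandidateTest]) -> List[CandidateTest]:
    """Assumed optimum: failing tests first, shorter tests first within each group."""
    return sorted(tests, key=lambda t: (not t.fails_now, t.duration))


def pointwise_prioritize(
    tests: List[CandidateTest],
    score: Callable[[CandidateTest], float],
) -> List[CandidateTest]:
    """Pointwise ranking: score each test independently, then sort by descending score."""
    return sorted(tests, key=score, reverse=True)


def rank_accuracy(predicted: List[CandidateTest],
                  optimal: List[CandidateTest]) -> float:
    """Reward in [0, 1]: 1.0 for the optimal order, 0.0 for its exact reverse."""
    n = len(optimal)
    if n < 2:
        return 1.0
    optimal_pos = {t.name: i for i, t in enumerate(optimal)}
    squared = sum((i - optimal_pos[t.name]) ** 2 for i, t in enumerate(predicted))
    worst = n * (n * n - 1) / 3  # squared displacement of the reversed order
    return 1.0 - squared / worst


def heuristic_score(t: CandidateTest) -> float:
    """Stand-in for a learned policy: favour recent failures, then short tests."""
    return float(t.failed_last_cycle) - 0.01 * t.duration


# One CI cycle: the agent ranks the tests, the environment returns a reward.
cycle = [
    CandidateTest("t_login", failed_last_cycle=False, duration=2.0, fails_now=False),
    CandidateTest("t_cart", failed_last_cycle=True, duration=5.0, fails_now=True),
    CandidateTest("t_search", failed_last_cycle=False, duration=1.0, fails_now=True),
    CandidateTest("t_pay", failed_last_cycle=True, duration=3.0, fails_now=False),
]
predicted = pointwise_prioritize(cycle, heuristic_score)
reward = rank_accuracy(predicted, optimal_order(cycle))
print([t.name for t in predicted], round(reward, 3))
```

In an RL setting, `heuristic_score` would be replaced by a policy whose parameters are updated from `reward` after every cycle, which is what allows the prioritization strategy to keep adapting as the test suite and its failure patterns change.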
