An Empirical Analysis of UI-based Flaky Tests

Flaky tests have gained attention from the research community in recent years, and with good reason. These tests waste time and resources, and they reduce the reliability of the test suites and build systems they affect. However, most existing work on flaky tests focuses exclusively on traditional unit tests. That work overlooks UI tests, which have larger input spaces and more diverse running conditions than traditional unit tests. In addition, UI tests tend to be more complex and resource-heavy, making them ill-suited to detection techniques that rerun test suites multiple times. In this paper, we perform a study of flaky UI tests. We analyze 235 flaky UI test samples found in 62 projects from both web and Android environments. We identify the common underlying root causes of flakiness in UI tests, the strategies used to manifest the flaky behavior, and the fixing strategies used to remedy flaky UI tests. These findings can provide a foundation for developing detection and prevention techniques for flakiness arising in UI tests.
