Flaky tests (tests with non-deterministic outcomes) can undermine testing efficiency and software reliability, and their presence in a test suite can significantly delay software releases. Several studies have attempted to quantify the impact of test flakiness in different programming languages (e.g., Java and Python) and application domains (e.g., mobile and GUI-based). In this paper, we conduct an empirical study of the state of flaky tests in JavaScript. We investigate two aspects of flaky tests in JavaScript projects: their main causes and the common strategies for fixing them. By analysing 452 commits from large, top-scoring JavaScript projects on GitHub, we found that concurrency-related issues (e.g., async wait, race conditions or deadlocks) are the dominant cause of test flakiness. The other top causes are operating system-specific issues (e.g., features that work only on specific OSes or OS versions) and network stability (e.g., internet availability or bad socket management). In terms of how flaky tests are dealt with, the majority (>80%) are fixed to eliminate the flaky behaviour, while developers sometimes skip, quarantine or remove them.