Python Flaky测试经验研究 (An Empirical Study of Flaky Tests in Python) - 专知论文

会员服务 ·

0

Python · Java · Better · CASES · 可理解性 ·

2021 年 1 月 22 日

An Empirical Study of Flaky Tests in Python

翻译：Python Flaky测试经验研究

Martin Gruber,Stephan Lukasczyk,Florian Kroiß,Gordon Fraser

from arxiv, 11 pages, to be published in the Proceedings of the IEEE International Conference on Software Testing, Verification and Validation (ICST 2021)

Tests that cause spurious failures without any code changes, i.e., flaky tests, hamper regression testing, increase maintenance costs, may shadow real bugs, and decrease trust in tests. While the prevalence and importance of flakiness is well established, prior research focused on Java projects, thus raising the question of how the findings generalize. In order to provide a better understanding of the role of flakiness in software development beyond Java, we empirically study the prevalence, causes, and degree of flakiness within software written in Python, one of the currently most popular programming languages. For this, we sampled 22352 open source projects from the popular PyPI package index, and analyzed their 876186 test cases for flakiness. Our investigation suggests that flakiness is equally prevalent in Python as it is in Java. The reasons, however, are different: Order dependency is a much more dominant problem in Python, causing 59% of the 7571 flaky tests in our dataset. Another 28% were caused by test infrastructure problems, which represent a previously undocumented cause of flakiness. The remaining 13% can mostly be attributed to the use of network and randomness APIs by the projects, which is indicative of the type of software commonly written in Python. Our data also suggests that finding flaky tests requires more runs than are often done in the literature: A 95% confidence that a passing test case is not flaky on average would require 170 reruns.

翻译：导致虚假失败的测试,而没有任何代码变化,例如,片片测试,阻碍回归测试,增加维护成本,可能给真实的错误带来阴影,降低测试信任度。虽然不耐烦的流行程度和重要性已经确立,但先前的研究侧重于爪哇项目,从而提出了有关结果如何笼统化的问题。为了更好地理解在爪哇以外软件开发中的不耐烦作用,我们从经验上研究用目前最受欢迎的编程语言之一Python编写的软件中的流行程度、原因和不耐烦程度。为此,我们从流行的 PyPI 软件集指数中抽取了22352个开放源项目,并分析了它们中的876186个测试案例。我们的调查表明,在Python项目中,不耐烦躁性的情况同样普遍。但是,在Python软件开发过程中,对秩序的依赖性决定了59%。另外的28%是由测试基础设施问题造成的,而测试往往是无证的消燥性原因。剩下的13%的文献大多可以归结为在网络和A类随机性测试中,而常规测试要求的是,对A类常规的测试是常规的路径。

0

相关内容

Python

Python是一种面向对象的解释型计算机程序设计语言，在设计中注重代码的可读性，同时也是一种功能强大的通用型语言。

【MIT干货书】机器学习算法视角，126页pdf

【MIT干货书】机器学习算法视角，126页pdf

专知会员服务

78+阅读 · 2021年1月25日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

Python计算导论，560页pdf，Introduction to Computing Using Python

Python计算导论，560页pdf，Introduction to Computing Using Python

专知会员服务

75+阅读 · 2020年5月5日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

【硬核书】数学博弈论与应用，431页pdf，Mathematical Game Theory and Applications

【硬核书】数学博弈论与应用，431页pdf，Mathematical Game Theory and Applications

专知会员服务

170+阅读 · 2020年4月18日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

【新书】Python编程基础，669页pdf

【新书】Python编程基础，669页pdf

专知会员服务

196+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

利用动态深度学习预测金融时间序列基于Python

利用动态深度学习预测金融时间序列基于Python

量化投资与机器学习

18+阅读 · 2018年10月30日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Python机器学习教程资料/代码

Python机器学习教程资料/代码

机器学习研究会

8+阅读 · 2018年2月22日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

【学习】(Python)SVM数据分类

【学习】(Python)SVM数据分类

机器学习研究会

6+阅读 · 2017年10月15日

【推荐】用Tensorflow理解LSTM

【推荐】用Tensorflow理解LSTM

机器学习研究会

36+阅读 · 2017年9月11日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

【推荐】Python机器学习生态圈(Scikit-Learn相关项目)

【推荐】Python机器学习生态圈(Scikit-Learn相关项目)

机器学习研究会

6+阅读 · 2017年8月23日

A Methodology for Approaching the Integration of Complex Robotics Systems Illustrated through a Bi-manual Manipulation Case-Study

Arxiv

0+阅读 · 2021年3月18日

The impact of using biased performance metrics on software defect prediction research

Arxiv

0+阅读 · 2021年3月18日

On the Generalizability of Neural Program Models with respect to Semantic-Preserving Program Transformations

Arxiv

0+阅读 · 2021年3月18日

Tilted Empirical Risk Minimization

Arxiv

0+阅读 · 2021年3月17日

An Analysis of Security Vulnerabilities in Container Images for Scientific Data Analysis

Arxiv

0+阅读 · 2021年3月17日

PythonFOAM: In-situ data analyses with OpenFOAM and Python

Arxiv

0+阅读 · 2021年3月17日

On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Exploratory Study

Arxiv

0+阅读 · 2021年3月17日

Are deep learning models superior for missing data imputation in large surveys? Evidence from an empirical comparison

Arxiv

0+阅读 · 2021年3月14日

Equivalent Causal Models

Arxiv

5+阅读 · 2020年12月10日

Analyzing Uncertainty in Neural Machine Translation

Arxiv

6+阅读 · 2018年2月28日

VIP会员

文章信息

相关主题

相关VIP内容

【MIT干货书】机器学习算法视角，126页pdf

【MIT干货书】机器学习算法视角，126页pdf

专知会员服务

78+阅读 · 2021年1月25日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

Python计算导论，560页pdf，Introduction to Computing Using Python

Python计算导论，560页pdf，Introduction to Computing Using Python

专知会员服务

75+阅读 · 2020年5月5日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

【硬核书】数学博弈论与应用，431页pdf，Mathematical Game Theory and Applications

【硬核书】数学博弈论与应用，431页pdf，Mathematical Game Theory and Applications

专知会员服务

170+阅读 · 2020年4月18日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

【新书】Python编程基础，669页pdf

【新书】Python编程基础，669页pdf

专知会员服务

196+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

新质生成式AI赋能产业变革的实践与路径

用于多模态大模型的离散标记化：全面综述

Nature综述：金融网络中的物理学

【CMU博士论文】通信高效且差分隐私的优化方法

相关资讯

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

利用动态深度学习预测金融时间序列基于Python

利用动态深度学习预测金融时间序列基于Python

量化投资与机器学习

18+阅读 · 2018年10月30日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Python机器学习教程资料/代码

Python机器学习教程资料/代码

机器学习研究会

8+阅读 · 2018年2月22日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

【学习】(Python)SVM数据分类

【学习】(Python)SVM数据分类

机器学习研究会

6+阅读 · 2017年10月15日

【推荐】用Tensorflow理解LSTM

【推荐】用Tensorflow理解LSTM

机器学习研究会

36+阅读 · 2017年9月11日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

【推荐】Python机器学习生态圈(Scikit-Learn相关项目)

【推荐】Python机器学习生态圈(Scikit-Learn相关项目)

机器学习研究会

6+阅读 · 2017年8月23日

相关论文

A Methodology for Approaching the Integration of Complex Robotics Systems Illustrated through a Bi-manual Manipulation Case-Study

Arxiv

0+阅读 · 2021年3月18日

The impact of using biased performance metrics on software defect prediction research

Arxiv

0+阅读 · 2021年3月18日

On the Generalizability of Neural Program Models with respect to Semantic-Preserving Program Transformations

Arxiv

0+阅读 · 2021年3月18日

Tilted Empirical Risk Minimization

Arxiv

0+阅读 · 2021年3月17日

An Analysis of Security Vulnerabilities in Container Images for Scientific Data Analysis

Arxiv

0+阅读 · 2021年3月17日

PythonFOAM: In-situ data analyses with OpenFOAM and Python

Arxiv

0+阅读 · 2021年3月17日

On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Exploratory Study

Arxiv

0+阅读 · 2021年3月17日

Are deep learning models superior for missing data imputation in large surveys? Evidence from an empirical comparison

Arxiv

0+阅读 · 2021年3月14日

Equivalent Causal Models

Arxiv

5+阅读 · 2020年12月10日

Analyzing Uncertainty in Neural Machine Translation

Arxiv

6+阅读 · 2018年2月28日

微信扫码咨询专知VIP会员