Jupyter · Bug · Analysis · Projection · Taxonomy

October 13, 2022

Bug Analysis in Jupyter Notebook Projects: An Empirical Study

Taijara Loiola de Santana, Paulo Anselmo da Mota Silveira Neto, Eduardo Santana de Almeida, Iftekhar Ahmed

Computational notebooks, such as Jupyter, have been widely adopted by data scientists to write code for analyzing and visualizing data. Despite their growing adoption and popularity, there has been no thorough study of Jupyter development challenges from the practitioners' point of view. This paper presents a systematic study of the bugs and challenges that Jupyter practitioners face, based on a large-scale empirical investigation. We mined 14,740 commits from 105 open-source GitHub projects containing Jupyter notebook code. Next, we analyzed 30,416 Stack Overflow posts, which gave us insights into the bugs that practitioners face when developing Jupyter notebook projects. Finally, we conducted nineteen interviews with data scientists to uncover more details about Jupyter bugs and to gain insights into Jupyter developers' challenges. We propose a bug taxonomy for Jupyter projects based on our results. We also highlight bug categories, their root causes, and the challenges that Jupyter practitioners face.

Jupyter

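Jupyter Notebook is a program that opens as a web page: code can be written and run directly in the page, and the output of each code cell is displayed directly beneath it. Any documentation needed while programming can be written in the same page, which makes it easy to add notes and explanations as you go.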
