Industry-scale IR-based Bug Localization: A Perspective from Facebook


March 17, 2021


Vijayaraghavan Murali, Lee Gross, Rebecca Qian, Satish Chandra

We explore the application of Information Retrieval (IR) based bug localization methods at a large industrial setting, Facebook. Facebook's code base evolves rapidly, with thousands of code changes being committed to a monolithic repository every day. When a bug is detected, it is often time-sensitive and imperative to identify the commit causing the bug in order to either revert it or fix it. This is complicated by the fact that bugs often manifest with complex and unwieldy features, such as stack traces and other metadata. Code commits also have various features associated with them, ranging from developer comments to test results. This poses unique challenges to bug localization methods, making it a highly non-trivial operation. In this paper we lay out several practical concerns for industry-level IR-based bug localization, and propose Bug2Commit, a tool that is designed to address these concerns. We also assess the effectiveness of existing IR-based localization techniques from the software engineering community, and find that in the presence of complex queries or documents, which are common at Facebook, existing approaches do not perform as well as Bug2Commit. We evaluate Bug2Commit on three applications at Facebook: client-side crashes from the mobile app, server-side performance regressions, and mobile simulation tests for performance. We find that Bug2Commit outperforms the accuracy of existing approaches by up to 17%, leading to reduced time for triaging regressions and attributing bugs found in simulations.
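To make the general idea of IR-based bug localization concrete, the following is a minimal sketch that treats a bug report as a query and candidate commits as documents, then ranks the commits by TF-IDF cosine similarity. It is not Bug2Commit and does not reproduce any technique evaluated in the paper; it only illustrates the basic retrieval framing under simplifying assumptions. The function names (`tokenize`, `tfidf_vectors`, `rank_commits`), the commit IDs, and the sample texts are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on runs of non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9_]+", text.lower()) if t]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    # Smoothed IDF so that every term seen in the corpus gets a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_commits(bug_text, commits):
    """Rank commits (id -> text) by similarity to the bug text, best first."""
    ids = list(commits)
    vectors, idf = tfidf_vectors([commits[i] for i in ids])
    query_tf = Counter(tokenize(bug_text))
    # Terms that never occur in any commit carry no signal and are dropped.
    query = {t: c * idf[t] for t, c in query_tf.items() if t in idf}
    scored = [(cid, cosine(query, vec)) for cid, vec in zip(ids, vectors)]
    return sorted(scored, key=lambda pair: -pair[1])


if __name__ == "__main__":
    # Hypothetical commits and bug report, invented purely for illustration.
    commits = {
        "c1": "Fix null check in FeedLoader render path",
        "c2": "Update image cache eviction policy",
        "c3": "Refactor login button styles",
    }
    bug = "NullPointerException at FeedLoader.render stack trace"
    for commit_id, score in rank_commits(bug, commits):
        print(commit_id, round(score, 3))
```

A flat bag-of-words model like this merges every feature of a bug (stack trace, metadata) and of a commit (comments, test results) into a single text, which is the kind of simplification the abstract suggests breaks down on the complex queries and documents common at Facebook.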

