Leveraging Static Analysis for Bug Repair - Zhuanzhi Papers


April 21, 2023

Leveraging Static Analysis for Bug Repair


Ruba Mutasim, Gabriel Synnaeve, David Pichardie, Baptiste Rozière

From arXiv, 13 pages. DL4C 2023

We propose a method combining machine learning with a static analysis tool (i.e., Infer) to automatically repair source code. Machine learning methods perform well at producing idiomatic source code. However, their output is sometimes difficult to trust, as language models can output incorrect code with high confidence. Static analysis tools are trustworthy, but also less flexible, and they produce non-idiomatic code. In this paper, we propose to fix resource leak bugs in IR space, and to use a sequence-to-sequence model to propose a fix in source code space. We also study several decoding strategies, and use Infer to filter the output of the model. On a dataset of CodeNet submissions with potential resource leak bugs, our method is able to find a function with the same semantics that does not raise a warning with around 97% precision and 66% recall.
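The generate-then-filter idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_leak_check` is a hypothetical stand-in for a real analyzer such as Infer (it is only a naive string check), `first_clean_candidate` is an illustrative helper name, and the candidate strings are invented examples of what a sequence-to-sequence model might propose. The sketch checks only that no warning is raised; it makes no attempt to verify that a candidate preserves the semantics of the original function.

```python
from typing import Callable, Iterable, Optional


def toy_leak_check(source: str) -> bool:
    """Hypothetical stand-in for a static analyzer such as Infer.

    Returns True when the snippet appears to open a resource without
    closing it. This is a naive string check for illustration only.
    """
    opens = "new FileReader(" in source
    closes = ".close()" in source or "try (" in source
    return opens and not closes


def first_clean_candidate(
    candidates: Iterable[str],
    raises_warning: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate that raises no warning, or None."""
    for candidate in candidates:
        if not raises_warning(candidate):
            return candidate
    return None


# Invented candidates, in the order a decoder might rank them.
candidates = [
    "FileReader r = new FileReader(path); r.read();",
    "try (FileReader r = new FileReader(path)) { r.read(); }",
]

fixed = first_clean_candidate(candidates, toy_leak_check)
print(fixed)
```

Returning `None` when every candidate is rejected is what lets a filter of this kind trade recall for precision: the pipeline stays silent rather than proposing a fix the analyzer still flags.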

