SeqTrans: Automatic Vulnerability Fix via Sequence to Sequence Learning - 专知 Papers

Sequence-to-sequence learning · NMT · seq2seq · Automator · INFORMS

June 1, 2021

SeqTrans: Automatic Vulnerability Fix via Sequence to Sequence Learning

Jianlei Chi, Yu Qu, Ting Liu, Qinghua Zheng, Heng Yin

From arXiv: 20 pages, 18 figures, 5 tables

Software vulnerabilities are now reported at an unprecedented speed due to the recent development of automated vulnerability hunting tools. However, fixing vulnerabilities still depends mainly on programmers' manual efforts. Developers need to deeply understand the vulnerability and try to affect the system's functions as little as possible. In this paper, with the advancement of Neural Machine Translation (NMT) techniques, we provide a novel approach called SeqTrans to exploit historical vulnerability fixes to provide suggestions and automatically fix the source code. To capture the contextual information around the vulnerable code, we propose to leverage data flow dependencies to construct code sequences and feed them into the state-of-the-art transformer model. A fine-tuning strategy is introduced to overcome the small sample size problem. We evaluate SeqTrans on a dataset containing 1,282 commits that fix 624 vulnerabilities in 205 Java projects. Results show that the accuracy of SeqTrans outperforms the latest techniques and achieves 23.3% in statement-level fix and 25.3% in CVE-level fix. In the meantime, we look deep inside the results and observe that the NMT model performs very well on certain kinds of vulnerabilities, such as CWE-287 (Improper Authentication) and CWE-863 (Incorrect Authorization).
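
The abstract's idea of using data flow dependencies to build a code sequence can be illustrated with a small sketch. This is not the authors' implementation, which this page does not describe. It is a toy backward walk over hand-annotated statements; the `Stmt` class, the `build_sequence` function, and the Java fragment are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Stmt:
    """One source statement with hand-annotated defined/used variables."""
    text: str
    defs: set = field(default_factory=set)
    uses: set = field(default_factory=set)


def build_sequence(stmts, target):
    """Join the target statement with the earlier statements it depends on.

    Assumption: straight-line code, so the closest earlier definition of a
    variable is the one that reaches the use. Real code needs a proper
    data flow analysis that handles branches and loops.
    """
    needed = set(stmts[target].uses)
    picked = [target]
    # Walk backwards so the closest definition of each variable is found first.
    for i in range(target - 1, -1, -1):
        hit = needed & stmts[i].defs
        if hit:
            picked.append(i)
            needed -= hit
            # Follow the dependency chain through the picked statement.
            needed |= stmts[i].uses
    # Emit in source order so the sequence reads like a code fragment.
    return " ".join(stmts[i].text for i in sorted(picked))


# Hypothetical Java fragment; the defs/uses annotations are written by hand.
method = [
    Stmt('String name = req.getParameter("file");', {"name"}, {"req"}),
    Stmt("int retries = 3;", {"retries"}, set()),
    Stmt("File f = new File(baseDir, name);", {"f"}, {"baseDir", "name"}),
    Stmt("return new FileInputStream(f);", set(), {"f"}),
]

sequence = build_sequence(method, 3)
print(sequence)
```

In this sketch the unrelated `int retries = 3;` statement is left out of the sequence because nothing on the dependency chain of the target statement uses it. A sequence built this way would be the kind of input handed to a sequence-to-sequence model, under the assumptions stated above.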
