Understanding Large Language Model Based Fuzz Driver Generation - 专知论文

会员服务 ·

0

语言模型化 · 可理解性 · MoDELS · API · CASES ·

2023 年 7 月 24 日

Understanding Large Language Model Based Fuzz Driver Generation

翻译：暂无翻译

Cen Zhang,Mingqiang Bai,Yaowen Zheng,Yeting Li,Xiaofei Xie,Yuekang Li,Wei Ma,Limin Sun,Yang Liu

from arxiv, 17 pages, 14 figures

Fuzz drivers are a necessary component of API fuzzing. However, automatically generating correct and robust fuzz drivers is a difficult task. Compared to existing approaches, LLM-based (Large Language Model) generation is a promising direction due to its ability to operate with low requirements on consumer programs, leverage multiple dimensions of API usage information, and generate human-friendly output code. Nonetheless, the challenges and effectiveness of LLM-based fuzz driver generation remain unclear. To address this, we conducted a study on the effects, challenges, and techniques of LLM-based fuzz driver generation. Our study involved building a quiz with 86 fuzz driver generation questions from 30 popular C projects, constructing precise effectiveness validation criteria for each question, and developing a framework for semi-automated evaluation. We designed five query strategies, evaluated 36,506 generated fuzz drivers. Furthermore, the drivers were compared with manually written ones to obtain practical insights. Our evaluation revealed that: while the overall performance was promising (passing 91% of questions), there were still practical challenges in filtering out the ineffective fuzz drivers for large scale application; basic strategies achieved a decent correctness rate (53%), but struggled with complex API-specific usage questions. In such cases, example code snippets and iterative queries proved helpful; while LLM-generated drivers showed competent fuzzing outcomes compared to manually written ones, there was still significant room for improvement, such as incorporating semantic oracles for logical bugs detection.

翻译：暂无翻译

0

相关内容

语言模型化

语言模型化

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

13+阅读 · 2017年12月31日

DMB信号水汽探测方法若干问题研究

国家自然科学基金

3+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

Utilizing Hybrid Trajectory Prediction Models to Recognize Highly Interactive Traffic Scenarios

Arxiv

0+阅读 · 2023年9月13日

Bias Amplification Enhances Minority Group Performance

Arxiv

0+阅读 · 2023年9月13日

The Sky Above The Clouds

Arxiv

15+阅读 · 2022年5月14日

Masked Autoencoders Are Scalable Vision Learners

Arxiv

27+阅读 · 2021年11月11日

Composite Adversarial Attacks

Arxiv

12+阅读 · 2020年12月10日

VIP会员

文章信息

相关主题

语言模型化

相关VIP内容

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

大语言模型中的事件抽取：方法、模态与未来展望的全面综述

美海军作战管理系统：变革战场空间的二十年

【MIT博士论文】以语言为中心的医学影像理解

俄罗斯“沙希德”/“天竺葵”攻击无人机

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

相关论文

Utilizing Hybrid Trajectory Prediction Models to Recognize Highly Interactive Traffic Scenarios

Arxiv

0+阅读 · 2023年9月13日

Bias Amplification Enhances Minority Group Performance

Arxiv

0+阅读 · 2023年9月13日

The Sky Above The Clouds

Arxiv

15+阅读 · 2022年5月14日

Masked Autoencoders Are Scalable Vision Learners

Arxiv

27+阅读 · 2021年11月11日

Composite Adversarial Attacks

Arxiv

12+阅读 · 2020年12月10日

相关基金

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

13+阅读 · 2017年12月31日

DMB信号水汽探测方法若干问题研究

国家自然科学基金

3+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员