CodeNav: Beyond tool-use to using real-world codebases with LLM agents

June 18, 2024

Tanmay Gupta, Luca Weihs, Aniruddha Kembhavi

We present CodeNav, an LLM agent that navigates and leverages previously unseen code repositories to solve user queries. In contrast to tool-use LLM agents, which require "registration" of all relevant tools via manual descriptions within the LLM context, CodeNav automatically indexes and searches over code blocks in the target codebase, finds relevant code snippets, imports them, and uses them to iteratively generate a solution with execution feedback. To highlight the core capabilities of CodeNav, we first showcase three case studies in which we use CodeNav to solve complex user queries using three diverse codebases. Next, on three benchmarks, we quantitatively compare the effectiveness of code-use (which only has access to the target codebase) to tool-use (which has privileged access to all tool names and descriptions). Finally, we study the effect of varying kinds of tool and library descriptions on code-use performance, and investigate the advantage of the agent seeing source code as opposed to natural descriptions of code. All code will be made open source under a permissive license.
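The index, search, and execute loop described in the abstract can be illustrated with a minimal Python sketch. This is not CodeNav's implementation, which the abstract does not detail. The names `CodeBlock`, `index_source`, `search`, and `run_with_feedback`, the token-overlap ranking, and the toy `EXAMPLE_REPO_FILE` are all assumptions made for illustration.

```python
import ast
import re
from dataclasses import dataclass


@dataclass
class CodeBlock:
    name: str    # function or class name
    kind: str    # "function" or "class"
    source: str  # exact source text of the block
    doc: str     # docstring, or "" if absent


def index_source(source: str) -> list[CodeBlock]:
    """Split one Python file into its top-level function and class blocks."""
    blocks = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        elif isinstance(node, ast.ClassDef):
            kind = "class"
        else:
            continue
        blocks.append(CodeBlock(
            name=node.name,
            kind=kind,
            source=ast.get_source_segment(source, node) or "",
            doc=ast.get_docstring(node) or "",
        ))
    return blocks


def tokenize(text: str) -> set[str]:
    """Lowercase alphanumeric tokens; snake_case names split on underscores."""
    return {t for t in re.split(r"[^a-z0-9]+", text.lower()) if t}


def search(blocks: list[CodeBlock], query: str, top_k: int = 3) -> list[CodeBlock]:
    """Rank blocks by token overlap between the query and name + docstring.

    Assumption: a real system would likely use a proper search index;
    token overlap is only a stand-in to keep the sketch self-contained.
    """
    q = tokenize(query)
    scored = [(len(q & tokenize(b.name + " " + b.doc)), b) for b in blocks]
    scored = [(s, b) for s, b in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [b for _, b in scored[:top_k]]


def run_with_feedback(code: str, namespace: dict) -> str:
    """Execute code; return "ok" or the error text an agent could react to."""
    try:
        exec(code, namespace)
        return "ok"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"


# Hypothetical one-file "codebase" standing in for a real repository.
EXAMPLE_REPO_FILE = '''
def resize_image(path, width, height):
    """Resize the image at path to the given width and height."""
    return (path, width, height)

def count_words(text):
    """Count whitespace-separated words in a text string."""
    return len(text.split())
'''

blocks = index_source(EXAMPLE_REPO_FILE)
hits = search(blocks, "resize an image to a new width")

# Load the best-matching block, then run a candidate solution that uses it.
namespace: dict = {}
solution = hits[0].source + "\nresult = resize_image('cat.png', 64, 64)\n"
feedback = run_with_feedback(solution, namespace)
```

In this sketch, nothing about `resize_image` is registered ahead of time: the block is discovered by searching the indexed source, and the string returned by `run_with_feedback` plays the role of the execution feedback that an agent would use to revise its next attempt.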
