One Tool Is Enough: Reinforcement Learning for Repository-Level LLM Agents

Locating the files and functions that require modification in large open-source software (OSS) repositories is challenging because of their scale and structural complexity. Existing large language model (LLM)-based methods typically treat this as a repository-level retrieval task and rely on multiple auxiliary tools, an approach that overlooks code execution logic and complicates model control. We propose RepoNavigator, an LLM agent equipped with a single execution-aware tool: jumping to the definition of an invoked symbol. This unified design reflects the actual flow of code execution while simplifying tool manipulation. RepoNavigator is trained end-to-end via reinforcement learning (RL) directly from a pretrained model, without any closed-source distillation. Experiments demonstrate that RL-trained RepoNavigator achieves state-of-the-art performance: the 7B model outperforms 14B baselines, the 14B model surpasses 32B competitors, and the 32B model exceeds even closed-source models such as Claude-3.7. These results confirm that combining a single, structurally grounded tool with RL training provides an efficient and scalable solution for repository-level issue localization.
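The abstract does not say how the jump-to-definition tool is implemented, so the following is only a minimal sketch of the idea, assuming a Python repository held in memory as a mapping from file path to source text. The function name `jump_to_definition`, the `repo` dictionary, and the name-based lookup are all hypothetical simplifications: a real tool would resolve the invoked symbol through imports and scopes rather than match on the bare name.

```python
import ast


def jump_to_definition(symbol, sources):
    """Return (path, line, source) of the first function or class named `symbol`.

    `sources` maps a file path to Python source text; it is a hypothetical
    stand-in for a checked-out repository. Lookup is by bare name only,
    which is a simplification of true symbol resolution.
    """
    for path, text in sources.items():
        tree = ast.parse(text)
        for node in ast.walk(tree):
            is_def = isinstance(
                node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
            )
            if is_def and node.name == symbol:
                # get_source_segment returns the exact text of the definition.
                return path, node.lineno, ast.get_source_segment(text, node)
    return None


# Toy two-file repository (illustrative only).
repo = {
    "app/main.py": (
        "from app.util import parse\n"
        "\n"
        "def run(raw):\n"
        "    return parse(raw) + 1\n"
    ),
    "app/util.py": (
        "def parse(raw):\n"
        "    return int(raw)\n"
    ),
}

# An agent reading `run` sees the call to `parse` and jumps to its definition.
hit = jump_to_definition("parse", repo)
print(hit)
```

In this sketch the agent needs only one action: given a symbol it has just seen invoked, it receives the file, line, and body of the definition, and can then repeat the jump from there. That mirrors the call chain an execution would follow, which is the property the abstract attributes to the single-tool design.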
