What Makes a GitHub Issue Ready for Copilot?

AI agents help developers with a range of coding tasks, such as developing new features, fixing bugs, and reviewing code. Developers can write a GitHub issue and assign it to an AI agent like Copilot for implementation. Based on the issue and its related discussion, the AI agent devises an implementation plan and executes it. However, the performance of AI agents and LLMs depends heavily on the input they receive. For instance, a GitHub issue that is unclear or poorly scoped might not lead to a successful implementation that is eventually merged. GitHub Copilot provides a set of best-practice recommendations, but they are limited and high-level. In this paper, we build a set of 32 detailed criteria that we use to measure how suitable a GitHub issue is for AI agents. We compare GitHub issues that lead to a merged pull request with those that lead to a closed pull request. Then, we build an interpretable machine learning model to predict the likelihood of a GitHub issue resulting in a merged pull request. We observe that merged pull requests tend to originate from issues that are shorter and well scoped, that give clear guidance and hints about the artifacts relevant to the issue, and that include guidance on how to perform the implementation. Issues with external references, including configuration, context setup, dependencies, or external APIs, are associated with lower merge rates. The model is intended to help users identify how to improve a GitHub issue so that the pull request Copilot produces is more likely to be merged. It has a median AUC of 72%. Our results shed light on quality metrics relevant for writing GitHub issues and motivate future studies to further investigate the writing of GitHub issues as a first-class software engineering activity in the era of AI teammates.
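To make the two quantities in the abstract concrete, the following is a minimal sketch of how a per-criterion merge rate and an AUC could be computed. Everything in it is an assumption for illustration: the three criterion names, the toy records, and the hand-picked weights in `toy_score` are hypothetical and are not the paper's 32 criteria, data, or model.

```python
# Hypothetical toy records: binary issue criteria plus the PR outcome.
# Criterion names are illustrative only, not taken from the paper.
ISSUES = [
    {"well_scoped": 1, "points_to_files": 1, "external_refs": 0, "merged": 1},
    {"well_scoped": 1, "points_to_files": 0, "external_refs": 0, "merged": 1},
    {"well_scoped": 0, "points_to_files": 0, "external_refs": 1, "merged": 0},
    {"well_scoped": 1, "points_to_files": 1, "external_refs": 1, "merged": 0},
    {"well_scoped": 0, "points_to_files": 1, "external_refs": 0, "merged": 1},
    {"well_scoped": 0, "points_to_files": 0, "external_refs": 0, "merged": 0},
]


def merge_rate(issues, criterion, value):
    """Share of merged PRs among issues where issue[criterion] == value."""
    group = [i for i in issues if i[criterion] == value]
    if not group:
        return float("nan")
    return sum(i["merged"] for i in group) / len(group)


def auc(scores, labels):
    """AUC as the probability that a randomly chosen merged issue scores
    higher than a randomly chosen closed one (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def toy_score(issue):
    """Hand-picked additive score (an assumption, not a fitted model):
    scoping and file hints add a point, external references subtract one."""
    return issue["well_scoped"] + issue["points_to_files"] - issue["external_refs"]


scores = [toy_score(i) for i in ISSUES]
labels = [i["merged"] for i in ISSUES]

print("merge rate with external refs:   ", merge_rate(ISSUES, "external_refs", 1))
print("merge rate without external refs:", merge_rate(ISSUES, "external_refs", 0))
print("AUC of toy score:", round(auc(scores, labels), 3))
```

An additive score like this is interpretable in the sense that each criterion's contribution can be read off directly. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of merged from closed issues.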
