As machine learning tools progress, the inevitable question arises: how can machine learning help us write better code? With significant progress in natural language processing driven by models like GPT-3 and BERT, the application of natural language processing techniques to code is beginning to be explored. Most research has focused on automatic program repair (APR), and while the results on synthetic or highly filtered datasets are promising, such models are hard to apply in real-world scenarios because of inadequate bug localization. We propose BigIssue: a benchmark for realistic bug localization. The goal of the benchmark is two-fold. We provide (1) a general benchmark with a diversity of real and synthetic Java bugs and (2) a motivation to improve the bug localization capabilities of models through attention to the full repository context. With the introduction of BigIssue, we hope to advance the state of the art in bug localization, in turn improving APR performance and increasing its applicability to the modern development cycle.