When a software bug is reported, developers engage in a discussion to collaboratively resolve it. While the solution is likely formulated within the discussion, it is often buried in a large amount of text, making it difficult to comprehend, which delays its implementation. To expedite bug resolution, we propose generating a concise natural language description of the solution by synthesizing relevant content within the discussion, which encompasses both natural language and source code. Furthermore, to support generating an informative description during an ongoing discussion, we propose a secondary task of determining when sufficient context about the solution emerges in real-time. We construct a dataset for these tasks with a novel technique for obtaining noisy supervision from repository changes linked to bug reports. We establish baselines for generating solution descriptions, and develop a classifier which makes a prediction following each new utterance on whether or not the necessary context for performing generation is available. Through automated and human evaluation, we find these tasks to form an ideal testbed for complex reasoning in long, bimodal dialogue context.