When developers investigate a new bug report, they search for similar, previously fixed bug reports and the discussion threads attached to them. These threads convey important information about the behavior of the bug, including relevant bug-fixing comments. They often grow very long because of the severity of the reported bug, which adds another layer of complexity, especially when relevant bug-fixing comments are intermingled with seemingly unrelated ones. Manually detecting the relevant comments across multiple cross-cutting discussion threads becomes a daunting task when dealing with a high volume of bug reports. To automate this process, we first detect and extract comments according to three criteria: relevance to the query, use of positive language, and semantic relevance. We then merge these comments into a summary for easy understanding. Specifically, we combine Sentiment Analysis and the TextRank model with the baseline Vector Space Model (VSM). Preliminary findings indicate that bug-fixing comments tend to be positive and that they are semantically related to comments in other cross-cutting discussion threads. The results also indicate that our combined approach improves overall ranking performance over the baseline VSM.
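As an illustration only, the combination of VSM, sentiment, and TextRank scores described above could be sketched as follows. The abstract does not give the scoring formula, so everything concrete here is an assumption: the toy word lexicon (`POSITIVE`, `NEGATIVE`), the weighted-sum combination, the default `weights`, and all function names are hypothetical and are not the authors' implementation.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicon; a real system would use a proper sentiment analyzer.
POSITIVE = {"fixed", "works", "resolved", "thanks", "good", "solved"}
NEGATIVE = {"broken", "fails", "crash", "bad", "wrong"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def sentiment(tokens):
    """Share of positive words among sentiment-bearing words; 0.5 if none."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return 0.5 if pos + neg == 0 else pos / (pos + neg)


def textrank(vectors, damping=0.85, iterations=50):
    """PageRank-style scores over a graph weighted by comment similarity."""
    n = len(vectors)
    sim = [[cosine(vectors[i], vectors[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    out = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(sim[j][i] / out[j] * scores[j]
                            for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores


def rank_comments(query, comments, weights=(0.5, 0.2, 0.3)):
    """Return comment indices, best first, by an assumed weighted sum of scores."""
    if not comments:
        return []
    docs = [tokenize(c) for c in comments]
    vectors = tfidf_vectors(docs + [tokenize(query)])
    query_vec, comment_vecs = vectors[-1], vectors[:-1]
    vsm = [cosine(query_vec, v) for v in comment_vecs]   # query relevance
    senti = [sentiment(d) for d in docs]                 # positive language
    tr = textrank(comment_vecs)                          # semantic centrality
    tr_max = max(tr) or 1.0
    w_vsm, w_senti, w_tr = weights
    combined = [w_vsm * v + w_senti * s + w_tr * (t / tr_max)
                for v, s, t in zip(vsm, senti, tr)]
    return sorted(range(len(comments)), key=lambda i: combined[i], reverse=True)
```

Calling `rank_comments(query, comments)` with the text of a new bug report as `query` and the comments gathered from similar reports as `comments` returns comment indices ordered from most to least promising; the top entries could then be concatenated into a summary. A weighted sum is used here only because it is the simplest way to merge three scores on comparable scales.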