Automatic crash reporting systems have become a de facto standard in software development. These systems monitor target software and, when a crash occurs, send its details to a backend application. The reports are later aggregated and used in the development process to 1) determine whether a crash is a new or an existing issue, 2) assign bugs to the appropriate developers, and 3) gain a general overview of the application's bug landscape. The efficiency of report aggregation and of the operations that follow depends heavily on the quality of the report similarity metric. A distinctive feature of this kind of report, however, is that no textual input from the user (i.e., a bug description) is available: the report contains only stack trace information. In this paper, we present S3M ("extreme"), the first approach to computing stack trace similarity based on deep learning. It is based on a siamese architecture that uses a biLSTM encoder and a fully-connected classifier to compute similarity. Our experiments demonstrate the superiority of our approach over the state of the art on both open-source data and a private JetBrains dataset. Additionally, we examine the impact of stack trace trimming on the quality of the results.
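To make the architecture described above concrete, the following is a minimal pure-Python sketch of a siamese model with a shared biLSTM encoder and a fully-connected scoring layer. It is an illustration under stated assumptions, not the S3M implementation: the class names, the layer sizes, the use of the element-wise absolute difference as the classifier input, and the single-layer classifier are all hypothetical choices, and the weights are random rather than trained, so the scores carry no meaning until the model is fitted to labeled pairs of stack traces.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng):
    # Small random weights; a real model would learn these from data.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LSTMCell:
    """A plain LSTM cell with one weight matrix per gate, acting on [x; h]."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # Gate order: input, forget, cell candidate, output.
        self.weights = [rand_matrix(hidden_size, input_size + hidden_size, rng)
                        for _ in range(4)]
        self.biases = [[0.0] * hidden_size for _ in range(4)]

    def step(self, x, h, c):
        xh = x + h  # list concatenation of input and previous hidden state
        pre = [[a + b for a, b in zip(matvec(w, xh), bias)]
               for w, bias in zip(self.weights, self.biases)]
        i = [sigmoid(v) for v in pre[0]]
        f = [sigmoid(v) for v in pre[1]]
        g = [math.tanh(v) for v in pre[2]]
        o = [sigmoid(v) for v in pre[3]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

    def run(self, inputs):
        # Returns the final hidden state after consuming the whole sequence.
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in inputs:
            h, c = self.step(x, h, c)
        return h


class SiameseSimilarity:
    """Hypothetical sketch: shared biLSTM encoder plus a linear scoring layer."""

    def __init__(self, vocab, embed_size=8, hidden_size=8, seed=0):
        rng = random.Random(seed)
        self.index = {frame: i for i, frame in enumerate(vocab)}
        # The extra last row is the embedding for frames outside the vocabulary.
        self.embeddings = rand_matrix(len(vocab) + 1, embed_size, rng)
        self.forward_cell = LSTMCell(embed_size, hidden_size, rng)
        self.backward_cell = LSTMCell(embed_size, hidden_size, rng)
        self.fc_weights = rand_matrix(1, 2 * hidden_size, rng)[0]
        self.fc_bias = 0.0

    def encode(self, trace):
        # Both traces of a pair go through this same encoder (the siamese part).
        unknown = len(self.index)
        vectors = [self.embeddings[self.index.get(frame, unknown)] for frame in trace]
        # Concatenate the final states of the forward and backward passes.
        return self.forward_cell.run(vectors) + self.backward_cell.run(vectors[::-1])

    def similarity(self, trace_a, trace_b):
        va, vb = self.encode(trace_a), self.encode(trace_b)
        # Assumption: the classifier sees the element-wise absolute difference.
        features = [abs(a - b) for a, b in zip(va, vb)]
        logit = sum(w * x for w, x in zip(self.fc_weights, features)) + self.fc_bias
        return sigmoid(logit)


# Hypothetical frames; a stack trace is modeled as a list of frame names.
vocab = ["app.Main.run", "app.Parser.parse", "app.Lexer.next", "java.lang.Thread.run"]
model = SiameseSimilarity(vocab)
trace_1 = ["app.Lexer.next", "app.Parser.parse", "app.Main.run"]
trace_2 = ["app.Parser.parse", "app.Main.run", "java.lang.Thread.run"]
score = model.similarity(trace_1, trace_2)
```

Because both traces pass through the same encoder and the pair features are symmetric, the score does not depend on the order of the two traces. This is a convenient property for a similarity metric used in report aggregation, though whether the original model enforces it is not stated in the abstract.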