It Takes Two to Tango: Combining Visual and Textual Information for Detecting Duplicate Video-Based Bug Reports

When a bug manifests in a user-facing application, it is likely to be exposed through the graphical user interface (GUI). Given the importance of visual information to the process of identifying and understanding such bugs, users are increasingly making use of screenshots and screen-recordings as a means to report issues to developers. However, when such information is reported en masse, such as during crowd-sourced testing, managing these artifacts can be a time-consuming process. As the reporting of screen-recordings in particular becomes more popular, developers are likely to face challenges related to manually identifying videos that depict duplicate bugs. Due to their graphical nature, screen-recordings present challenges for automated analysis that preclude the use of current duplicate bug report detection techniques. To overcome these challenges and aid developers in this task, this paper presents Tango, a duplicate detection technique that operates purely on video-based bug reports by leveraging both visual and textual information. Tango combines tailored computer vision techniques, optical character recognition, and text retrieval. We evaluated multiple configurations of Tango in a comprehensive empirical evaluation on 4,860 duplicate detection tasks that involved a total of 180 screen-recordings from six Android apps. Additionally, we conducted a user study investigating the effort required for developers to manually detect duplicate video-based bug reports and compared this to the effort required to use Tango. The results reveal that Tango's optimal configuration is highly effective at detecting duplicate video-based bug reports, accurately ranking target duplicate videos in the top-2 returned results in 83% of the tasks. Additionally, our user study shows that, on average, Tango can reduce developer effort by over 60%, illustrating its practicality.
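The abstract describes ranking candidate videos by combining visual and textual information, then checking whether a true duplicate lands in the top-2 results. The following is a minimal sketch of that general idea, not Tango's actual implementation: the linear weighting, the function names (`combine_scores`, `rank_candidates`, `hit_at_k`), and the similarity values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def combine_scores(visual: Dict[str, float],
                   textual: Dict[str, float],
                   weight: float = 0.5) -> Dict[str, float]:
    """Mix two per-candidate similarity scores linearly.

    The linear mix and the `weight` parameter are assumptions made for
    this sketch; the abstract does not say how the two signals are combined.
    """
    return {vid: weight * visual[vid] + (1.0 - weight) * textual[vid]
            for vid in visual}


def rank_candidates(scores: Dict[str, float]) -> List[str]:
    """Return candidate video ids ordered from most to least similar."""
    return sorted(scores, key=lambda vid: scores[vid], reverse=True)


def hit_at_k(ranking: List[str], duplicates: Set[str], k: int) -> bool:
    """True if any known duplicate appears among the first k ranked results."""
    return any(vid in duplicates for vid in ranking[:k])


# Made-up similarity scores between one query video and three candidates.
visual_sim = {"v1": 0.9, "v2": 0.4, "v3": 0.7}
textual_sim = {"v1": 0.2, "v2": 0.8, "v3": 0.6}

combined = combine_scores(visual_sim, textual_sim, weight=0.5)
ranking = rank_candidates(combined)
found_in_top_2 = hit_at_k(ranking, {"v2"}, k=2)
```

Averaging `hit_at_k` over all duplicate detection tasks would give a top-k success rate of the kind reported in the abstract (top-2 in 83% of tasks).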
