Screen recordings of mobile applications are easy to obtain and capture a wealth of information pertinent to software developers (e.g., bugs or feature requests), making them a popular mechanism for crowdsourced app feedback. Thus, these videos are becoming a common artifact that developers must manage. In light of unique mobile development constraints, including swift release cycles and rapidly evolving platforms, automated techniques for analyzing all types of rich software artifacts benefit mobile developers. Unfortunately, because of their graphical nature, screen recordings are considerably harder to analyze automatically than other (textual) artifacts. To address these challenges, this paper introduces V2S+, an automated approach for translating video recordings of Android app usage into replayable scenarios. V2S+ is based primarily on computer vision techniques: it adapts recent solutions for object detection and image classification to detect and classify user gestures captured in a video, and converts these into a replayable test scenario. Given that V2S+ takes a computer vision-based approach, it is applicable to both hybrid and native Android applications. We performed an extensive evaluation of V2S+ involving 243 videos depicting 4,028 GUI-based actions collected from users exercising features and reproducing bugs from a collection of over 90 popular native and hybrid Android apps. Our results illustrate that V2S+ can accurately replay scenarios from screen recordings, and is capable of reproducing $\approx$ 90.2% of sequential actions recorded in native application scenarios on physical devices, and $\approx$ 83% of sequential actions recorded in hybrid application scenarios on emulators, both with low overhead. A case study with three industrial partners illustrates the potential usefulness of V2S+ from the viewpoint of developers.
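To make the final step of the pipeline summarized above concrete (turning per-frame gesture detections into a replayable scenario), the following is a minimal illustrative sketch and not the V2S+ implementation. It assumes a detector has already produced one touch location per video frame; the `Detection` type, the helper names, and the thresholds (`max_gap`, `swipe_px`, `long_tap_ms`) are hypothetical. Replay is expressed with the standard `adb shell input tap` and `adb shell input swipe` commands, which may differ from the replay mechanism the approach actually uses.

```python
from dataclasses import dataclass
from math import hypot
from typing import List


@dataclass
class Detection:
    """Hypothetical detector output: a touch location in one video frame."""
    frame: int
    x: int
    y: int


def group_detections(dets: List[Detection], max_gap: int = 1) -> List[List[Detection]]:
    """Group detections from consecutive frames into candidate actions."""
    groups: List[List[Detection]] = []
    for d in sorted(dets, key=lambda d: d.frame):
        if groups and d.frame - groups[-1][-1].frame <= max_gap:
            groups[-1].append(d)   # same action continues
        else:
            groups.append([d])     # a new action starts
    return groups


def to_adb_command(group: List[Detection], fps: float = 30.0,
                   swipe_px: float = 40.0, long_tap_ms: int = 500) -> str:
    """Classify one group with simple (assumed) heuristics and emit an adb command."""
    first, last = group[0], group[-1]
    duration_ms = int(round((last.frame - first.frame + 1) * 1000 / fps))
    distance = hypot(last.x - first.x, last.y - first.y)
    if distance >= swipe_px:
        return (f"adb shell input swipe {first.x} {first.y} "
                f"{last.x} {last.y} {duration_ms}")
    if duration_ms >= long_tap_ms:
        # A long press can be emulated as a swipe that does not move.
        return (f"adb shell input swipe {first.x} {first.y} "
                f"{first.x} {first.y} {duration_ms}")
    return f"adb shell input tap {first.x} {first.y}"


def video_to_scenario(dets: List[Detection], fps: float = 30.0) -> List[str]:
    """Translate all detections in a recording into an ordered command list."""
    return [to_adb_command(g, fps=fps) for g in group_detections(dets)]


if __name__ == "__main__":
    sample = [
        Detection(0, 100, 200), Detection(1, 100, 200), Detection(2, 100, 200),
        Detection(10, 100, 800), Detection(11, 100, 550), Detection(12, 100, 300),
    ]
    for command in video_to_scenario(sample):
        print(command)
```

In this sketch, three static frames become a tap and three frames with a large displacement become a swipe. A real system would also need to handle detector noise, multi-touch gestures, and the timing between actions, none of which is modeled here.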