Automated GUI testing is widely used to help ensure the quality of mobile apps. However, many GUIs require appropriate text inputs before they proceed to the next page, which remains a prominent obstacle to test coverage. Given the diversity and semantic requirements of valid inputs (e.g., flight departure, movie name), automating text input generation is challenging. Inspired by the outstanding progress of pre-trained Large Language Models (LLMs) in text generation, we propose QTypist, an LLM-based approach that intelligently generates semantic input text according to the GUI context. To boost the performance of the LLM in the mobile testing scenario, we develop a prompt-based data construction and tuning method that automatically extracts prompts and answers for model tuning. We evaluate QTypist on 106 apps from Google Play, and the results show that its passing rate is 87%, which is 93% higher than the best baseline. We also integrate QTypist with automated GUI testing tools; it covers 42% more app activities and 52% more pages, and subsequently helps reveal 122% more bugs than the raw tool.
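To make the idea of turning GUI context into an LLM prompt concrete, the following is a minimal sketch and not the actual QTypist implementation. The data classes, field names (`hint`, `label`, `resource_id`), the example app name, and the fill-in-the-blank template wording are all assumptions chosen for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InputWidget:
    """Hypothetical view of one text-input widget taken from a GUI hierarchy."""
    hint: str = ""          # placeholder text shown inside the field
    label: str = ""         # nearby label text, if any
    resource_id: str = ""   # developer-assigned identifier


@dataclass
class PageContext:
    """Hypothetical summary of the page that contains the input widgets."""
    app_name: str
    activity_name: str
    widgets: List[InputWidget] = field(default_factory=list)


def describe_widget(widget: InputWidget) -> str:
    """Pick the most informative text available for a widget."""
    for candidate in (widget.label, widget.hint, widget.resource_id):
        if candidate.strip():
            # Identifiers such as "return_date" read better with spaces.
            return candidate.strip().replace("_", " ")
    return "text"


def build_prompt(page: PageContext) -> str:
    """Render a fill-in-the-blank prompt; the template wording is an assumption."""
    lines = [f"This is the {page.activity_name} page of the app {page.app_name}."]
    for widget in page.widgets:
        lines.append(f"The {describe_widget(widget)} input is ____.")
    return " ".join(lines)


# Illustrative page with two input fields (all values are made up).
page = PageContext(
    app_name="FlightFinder",
    activity_name="search",
    widgets=[InputWidget(hint="Departure city"), InputWidget(resource_id="return_date")],
)
prompt = build_prompt(page)
print(prompt)
```

A prompt of this form would then be passed to a language model, whose completion for each blank serves as the candidate input text. Pairing such prompts with known-valid inputs is one plausible way to assemble tuning data, though the abstract does not describe the extraction procedure in detail.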