How Students Use Generative AI for Software Testing: An Observational Study

The integration of generative AI tools like ChatGPT into software engineering workflows opens up new opportunities to boost productivity in tasks such as unit test engineering. However, these AI-assisted workflows can also significantly alter the developer's role, raising concerns about control, output quality, and learning, particularly for novice developers. This study investigates how novice software developers with foundational knowledge in software testing interact with generative AI when engineering unit tests. Our goal is to examine the strategies they use, how heavily they rely on generative AI, and the benefits and challenges they perceive when using generative AI-assisted approaches for test engineering. We conducted an observational study involving 12 undergraduate students who worked with generative AI on unit testing tasks. We identified four interaction strategies, defined by whether the test idea and the test implementation each originated from generative AI or from the participant. Additionally, we distinguished prompting styles that focused on either one-shot or iterative test generation, which often aligned with the broader interaction strategy. Students reported benefits including time savings, reduced cognitive load, and support for test ideation, but also noted drawbacks such as diminished trust, concerns about test quality, and a lack of ownership. While strategy and prompting style influenced workflow dynamics, they did not significantly affect test effectiveness or test code quality, as measured by mutation score and test smells, respectively.

