Open Source Software projects add labels to open issues to help contributors choose tasks. However, manually labeling issues is time-consuming and error-prone, and current automatic labeling approaches are mostly limited to classifying issues as bug or non-bug. In this paper, we investigate the feasibility and relevance of labeling issues with the domain of the APIs required to complete the tasks. We leverage the issues' descriptions and the project history to build prediction models, which achieved precision of up to 82% and recall of up to 97.8%. We also ran a user study (n=74) to assess the relevance of these labels to potential contributors. The results show that participants found the labels useful in choosing tasks and selected the API-domain labels more often than the existing architecture-based labels. Our results can inspire the creation of tools that automatically label issues, helping developers find tasks that better match their skills.