To evaluate software maintenance techniques and tools in controlled experiments with human participants, researchers currently use projects and tasks selected on an ad-hoc basis. This can unrealistically favor their tool and makes comparing results across studies difficult. We suggest the gradual creation of a benchmark repository of projects, tasks, and metadata relevant to human-based studies. In this paper, we discuss the requirements and challenges of such a repository, along with the steps that could lead to its construction.