Organisations use issue tracking systems (ITSs) to store their project documentation in unit-like pieces called "issues". This style of documentation encourages evolutionary refinement, as each issue can be independently improved, commented on, linked to other issues, and progressed through the organisational workflow. Commonly studied ITSs include GitHub, GitLab, and Bugzilla. While these issue trackers have been the subject of much research, Jira, which holds a wealth of information and offers additional benefits, has yet to receive such attention. Unfortunately, no public dataset of Jira repositories exists, likely due to the difficulty of finding and accessing these repositories. With this dataset paper, we release a dataset of 16 public Jiras with 1822 projects, spanning 2.7 million issues with a combined total of 32 million changes, 8 million comments, and 1 million issue links. We believe this Jira dataset will lead to many fruitful research projects investigating issue evolution, issue linking, cross-project analysis, and cross-tool analysis with the existing well-studied ITSs listed above.