Organisations use issue tracking systems (ITSs) to track and document their projects' work in units called issues. This style of documentation encourages evolutionary refinement, as each issue can be independently improved, commented on, linked to other issues, and progressed through the organisational workflow. The ITSs studied most often so far include GitHub, GitLab, and Bugzilla, while Jira, one of the most popular ITSs in practice and one that offers a wealth of additional information, has yet to receive such attention. Unfortunately, diverse public Jira datasets are rare, likely because these repositories are difficult to find and access. With this paper, we release a dataset of 16 public Jira repositories containing 1822 projects, spanning 2.7 million issues with a combined total of 32 million changes, 9 million comments, and 1 million issue links. We believe this Jira dataset will enable many fruitful research projects investigating issue evolution, issue linking, and cross-project analysis, as well as cross-tool analysis when combined with existing well-studied ITS datasets.