GitHub has become the central online platform for much of open source, hosting most open source code repositories. With this popularity, the public digital traces of GitHub are now a valuable means to study teamwork and collaboration. In many ways, however, GitHub is a convenience sample, and may not be representative of open source development off the platform. Here we develop a novel, extensive sample of public open source project repositories outside of centralized platforms. We characterize these projects along a number of dimensions and compare them to a time-matched sample of corresponding GitHub projects. Compared with those GitHub projects, the projects in our sample tend to have more collaborators, are maintained for longer periods, and tend to be more focused on academic and scientific problems.