How do OSS projects change in number and size? A large-scale analysis to test a model of project growth

Established Open Source Software (OSS) projects can grow in size if new developers join, but also the number of OSS projects can grow if developers choose to found new projects. We discuss to what extent an established model for firm growth can be applied to the dynamics of OSS projects. Our analysis is based on a large-scale data set from SourceForge (SF) consisting of monthly data for 10 years, for up to 360'000 OSS projects and up to 340'000 developers. Over this time period, we find an exponential growth both in the number of projects and developers, with a remarkable increase of single-developer projects after 2009. We analyze the monthly entry and exit rates for both projects and developers, the growth rate of established projects and the monthly project size distribution. To derive a prediction for the latter, we use modeling assumptions of how newly entering developers choose to either found a new project or to join existing ones. Our model applies only to collaborative projects that are deemed to grow in size by attracting new developers. We verify, by a thorough statistical analysis, that the Yule-Simon distribution is a valid candidate for the size distribution of collaborative projects except for certain time periods where the modeling assumptions no longer hold. We detect and empirically test the reason for this limitation, i.e., the fact that an increasing number of established developers found additional new projects after 2009.
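The abstract describes entering developers as either founding a new project or joining an existing one, and names the Yule-Simon distribution as the candidate for project sizes. The sketch below illustrates that idea under stated assumptions; it is not the authors' implementation. It assumes the classic Simon process, in which a newcomer founds a project with a fixed probability `p_new` and otherwise joins a project with probability proportional to its current size. The function names, `p_new`, and the relation `rho = 1 / (1 - p_new)` come from that textbook model, not from the abstract.

```python
import random
from math import exp, lgamma


def simulate_simon(n_developers, p_new, seed=0):
    """Simulate a Simon-type growth process (assumed model, see lead-in).

    Each entering developer founds a new project with probability p_new,
    otherwise joins an existing project chosen with probability
    proportional to its current size. Returns the list of project sizes.
    """
    rng = random.Random(seed)
    sizes = [1]        # the first developer founds the first project
    memberships = [0]  # one entry per developer: index of their project
    for _ in range(n_developers - 1):
        if rng.random() < p_new:
            sizes.append(1)
            memberships.append(len(sizes) - 1)
        else:
            # Picking a random developer and joining their project is
            # equivalent to picking a project proportionally to its size.
            project = rng.choice(memberships)
            sizes[project] += 1
            memberships.append(project)
    return sizes


def yule_simon_pmf(k, rho):
    """Yule-Simon probability mass: rho * B(k, rho + 1) for k = 1, 2, ..."""
    return rho * exp(lgamma(k) + lgamma(rho + 1) - lgamma(k + rho + 1))


if __name__ == "__main__":
    p_new = 0.3  # hypothetical founding probability
    sizes = simulate_simon(20000, p_new)
    rho = 1.0 / (1.0 - p_new)
    for k in (1, 2, 3):
        empirical = sum(1 for s in sizes if s == k) / len(sizes)
        print(k, round(empirical, 3), round(yule_simon_pmf(k, rho), 3))
```

Comparing the empirical fractions with `yule_simon_pmf` gives a rough visual check only. The abstract also notes that the modeling assumptions stop holding once established developers found additional projects, a behavior this sketch does not include.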

