To explore the prevalence of abrupt changes (changepoints) in open source project activity, we assembled a dataset of 8,919 projects from World of Code. Projects were selected based on age, number of commits, and number of authors. Using the nonparametric PELT algorithm, we identified changepoints in each project's activity time series and found that more than 90% of projects had between one and six changepoints. Increases and decreases in project activity occurred with roughly equal frequency. While most changes were relatively small, on the order of a few authors or a few dozen commits per month, the distributions had long tails of much larger changes in project activity. In future work, we plan to focus on these larger changes to search for common open source lifecycle patterns as well as common responses to external events.
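To make the changepoint step concrete, the following is a minimal sketch of PELT (Pruned Exact Linear Time) on a monthly activity series. It is not the study's implementation: for brevity it swaps the nonparametric cost for a simple squared-error (mean-shift) cost, and the function names, the synthetic commit counts, and the penalty value are all assumptions chosen for illustration.

```python
from itertools import accumulate


def pelt_mean_shift(series, penalty):
    """Return changepoint indices (first index of each new segment).

    Sketch of PELT with a squared-error cost. This cost is an assumption
    made for illustration; the nonparametric variant uses a cost based on
    the empirical distribution instead.
    """
    n = len(series)
    # Prefix sums let any segment cost be evaluated in O(1).
    s1 = [0.0] + list(accumulate(series))
    s2 = [0.0] + list(accumulate(x * x for x in series))

    def cost(a, b):
        # Sum of squared deviations from the mean over series[a:b].
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    best = [0.0] * (n + 1)   # best[t]: optimal penalized cost of series[:t]
    best[0] = -penalty       # so the first segment carries no net penalty
    last = [0] * (n + 1)     # last[t]: start of the final segment in series[:t]
    candidates = [0]

    for t in range(1, n + 1):
        scored = [(best[s] + cost(s, t) + penalty, s) for s in candidates]
        best[t], last[t] = min(scored)
        # Pruning step: drop any start s that can no longer be optimal.
        candidates = [s for value, s in scored if value - penalty <= best[t]]
        candidates.append(t)

    # Walk back through the stored segment starts to recover changepoints.
    changepoints = []
    t = n
    while last[t] > 0:
        changepoints.append(last[t])
        t = last[t]
    return sorted(changepoints)


def mean_changes(series, changepoints):
    """Signed change in segment mean across each changepoint."""
    bounds = [0] + list(changepoints) + [len(series)]
    means = [sum(series[a:b]) / (b - a) for a, b in zip(bounds, bounds[1:])]
    return [after - before for before, after in zip(means, means[1:])]


# Synthetic monthly commit counts (hypothetical data, not from the study).
monthly_commits = [10] * 12 + [40] * 12 + [15] * 12
cps = pelt_mean_shift(monthly_commits, penalty=200.0)
changes = mean_changes(monthly_commits, cps)
print(cps)      # [12, 24]
print(changes)  # [30.0, -25.0]
```

In this toy series the sketch reports one increase (+30 commits per month) and one decrease (-25 commits per month). The sign and size of each mean change is one possible way to label a changepoint as an increase or a decrease; how many changepoints are found depends on the chosen penalty.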