Context: GitHub hosts an impressive number of high-quality open source software (OSS) projects. However, selecting "the right tool for the job" is challenging, because precise information about these high-quality projects is not available. Objective: In this paper, we propose a data-driven approach to measure the level of maintenance activity of GitHub projects. Our goal is to alert users to the risks of using unmaintained projects and possibly to motivate other developers to take over the maintenance of such projects. Method: We train machine learning models to define a metric that expresses the level of maintenance activity of GitHub projects. Next, we analyze the historical evolution of 2,927 active projects over a period of one year. Results: Of the 2,927 active projects, 16% became unmaintained within one year. We also found that Objective-C projects tend to have lower maintenance activity than projects implemented in other languages. Finally, software tools, such as compilers and editors, have the highest maintenance activity over time. Conclusions: A metric for the level of maintenance activity of GitHub projects can help developers select open source projects.