The widespread adoption of Free/Libre and Open Source Software (FLOSS) means that the ongoing maintenance of many widely used software components relies on the collaborative effort of volunteers who set their own priorities and choose their own tasks. We argue that this has created a new form of risk that we call 'underproduction', which occurs when the supply of software engineering labor falls out of alignment with the demand of the people who rely on the software produced. We present a conceptual framework for identifying relative underproduction in software, as well as a statistical method for applying our framework to a comprehensive dataset from the Debian GNU/Linux distribution that includes 21,902 source packages and the full history of 461,656 bugs. We draw on this application to present two experiments: (1) a demonstration of how our technique can be used to identify at-risk software packages in a large FLOSS repository and (2) a validation of these results using an alternate indicator of package risk. Our analysis both demonstrates the utility of our approach and reveals widespread underproduction in a range of widely installed software components in Debian.