The NPM package repository contains over two million packages and serves tens of billions of downloads per week. Nearly every JavaScript application uses the NPM package manager to install packages from the NPM repository. NPM relies on a "semantic versioning" ("semver") scheme to maintain a healthy ecosystem, in which bug fixes are delivered reliably and quickly to downstream packages, while breaking changes require manual intervention by downstream package maintainers. To understand how developers use semver, we build a dataset containing every version of every package on NPM and analyze the flow of updates throughout the ecosystem. We build a time-travelling dependency resolver for NPM, which allows us to determine precisely which version of each dependency would have been resolved at different times. We segment our analysis so that security-relevant updates (those that introduce or patch vulnerabilities) can be compared directly with the rest of the ecosystem. We find that when developers use semver correctly, critical updates such as security patches flow quite rapidly to downstream packages in the majority of cases (90.09%). This does not always occur, however, because developers use both semver version constraints and semver version number increments imperfectly. Our findings have implications for developers and researchers alike. We make our infrastructure and dataset publicly available under an open source license.
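To make the idea of time-travelling resolution concrete, the following is a minimal sketch in Python and not the resolver built in this work. It assumes a hypothetical publication history (`HISTORY`), handles only plain `MAJOR.MINOR.PATCH` versions, and supports only exact, `~`, and `^` constraints; the function names `parse`, `satisfies`, and `resolve_at` are illustrative.

```python
from datetime import datetime

# Hypothetical publication history for one package: version -> publish time.
HISTORY = {
    "1.2.0": datetime(2020, 1, 1),
    "1.2.1": datetime(2020, 3, 1),   # imagine this is a security patch
    "1.3.0": datetime(2020, 6, 1),
    "2.0.0": datetime(2020, 9, 1),
}

def parse(version):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints (no prerelease tags)."""
    return tuple(int(part) for part in version.split("."))

def satisfies(version, constraint):
    """Check a version against a simplified constraint: exact, ~x.y.z, or ^x.y.z."""
    v = parse(version)
    if constraint.startswith("^"):
        base = parse(constraint[1:])
        # Caret: the left-most non-zero component must stay fixed.
        if base[0] > 0:
            upper = (base[0] + 1, 0, 0)
        elif base[1] > 0:
            upper = (0, base[1] + 1, 0)
        else:
            upper = (0, 0, base[2] + 1)
        return base <= v < upper
    if constraint.startswith("~"):
        base = parse(constraint[1:])
        # Tilde: only patch-level changes are allowed.
        upper = (base[0], base[1] + 1, 0)
        return base <= v < upper
    return v == parse(constraint)

def resolve_at(history, constraint, when):
    """Return the highest satisfying version published at or before `when`."""
    candidates = [
        version
        for version, published in history.items()
        if published <= when and satisfies(version, constraint)
    ]
    return max(candidates, key=parse) if candidates else None

if __name__ == "__main__":
    july = datetime(2020, 7, 1)
    for constraint in ("^1.2.0", "~1.2.0", "1.2.0"):
        print(constraint, "->", resolve_at(HISTORY, constraint, july))
```

Under these assumptions, a downstream package resolving on 1 July 2020 would get `1.3.0` with `^1.2.0` and `1.2.1` with `~1.2.0`, but would stay on `1.2.0` with an exact pin, which illustrates how a constraint choice can block a patch from flowing downstream.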