Code churn and code velocity describe the evolution of a code base. Current research quantifies and studies code churn and velocity at a high level of abstraction, often at the overall project level or even at the level of an entire company. We argue that such an approach ignores noticeable differences among the subsystems of large projects. We conduct an exploratory study of four BSD family operating systems: DragonFlyBSD, FreeBSD, NetBSD, and OpenBSD. We mine 797,879 commits to characterize code churn in terms of the annual growth rate, commit types, change type ratio, and size taxonomy of commits for different subsystems (kernel, non-kernel, and mixed). We also investigate differences among code review periods, i.e., time-to-first-response, time-to-accept, and time-to-merge, as indicators of code velocity. Our study provides empirical evidence that evolutionary code characteristics quantified at a global system scope fail to account for significant differences that exist at the subsystem level. We find that, while the code base growth rate and the distribution of commit types (neutral, additive, and subtractive) are similar across BSD subsystems, (a) most commits contain kernel or non-kernel code exclusively, (b) kernel commits are larger than non-kernel commits, and (c) code reviews for kernel code take longer than those for non-kernel code.