In software development teams, developer turnover is among the primary reasons for project failure, as it creates a large knowledge gap and puts strain on newcomers. Unfortunately, no established methods exist to measure how knowledge is distributed within development teams. Knowing how this knowledge evolves and which key developers own it helps managers reduce the risks caused by turnover. To this end, this paper introduces a novel, realistic representation of domain knowledge distribution: the ConceptRealm. To construct the ConceptRealm, we employ a latent Dirichlet allocation model to represent textual features obtained from 300k issues and 1.3M comments from 518 open-source projects. We analyze whether newly emerged issues and developers share similar concepts, and how well developers' concepts align with those of the team over time. We also investigate the impact of departing members on the frequency of concepts. Finally, we evaluate the soundness of our approach on closed-source software, which allows the results to be validated from a practical standpoint. We find that the ConceptRealm can represent the high-level domain knowledge within a team and can be used to predict the alignment of developers with issues. We also observe that projects exhibit many keepers independent of project maturity, and that keepers who leave abruptly harm the team's concept familiarity.
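To make the notion of developer-team alignment concrete, the following is a minimal sketch, not the paper's implementation. It assumes each developer is already described by a vector of concept (topic) weights, for example from a latent Dirichlet allocation model fitted on issue and comment text. The function names, the use of a simple mean for the team vector, and cosine similarity as the alignment score are all illustrative assumptions; the abstract does not specify the measure used.

```python
from math import sqrt


def normalize(weights):
    """Scale non-negative concept weights so they sum to 1."""
    total = sum(weights)
    if total == 0:
        raise ValueError("concept weights must not all be zero")
    return [w / total for w in weights]


def team_concepts(member_vectors):
    """Average the members' concept vectors (assumed team representation)."""
    count = len(member_vectors)
    dims = len(member_vectors[0])
    return [sum(vec[i] for vec in member_vectors) / count for i in range(dims)]


def cosine_alignment(a, b):
    """Cosine similarity between two concept vectors (assumed alignment score)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical concept weights for three developers over four concepts.
developers = {
    "dev_a": normalize([8, 1, 1, 0]),
    "dev_b": normalize([2, 6, 1, 1]),
    "dev_c": normalize([1, 1, 4, 4]),
}

team = team_concepts(list(developers.values()))
alignment = {name: cosine_alignment(vec, team) for name, vec in developers.items()}
```

Under these assumptions, recomputing `team` without one developer's vector and comparing the result with the original gives a rough view of how much a departure shifts the team's concept distribution.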