Context: The pull-based development model is widely used in open source and leads the trends in distributed software development. One aspect that has attracted significant attention is the pull request decision, with studies identifying the factors that explain it. Objective: This study builds on a decade of research on the pull request decision in order to explain it. We empirically investigate how factors influence the pull request decision and which scenarios change the influence of those factors. Method: We identify factors influencing the pull request decision on GitHub through a systematic literature review and infer them by mining archival data. We collect a total of 3,347,937 pull requests with 95 features from 11,230 diverse projects on GitHub. Using this data, we explore how the factors relate to each other and build mixed-effect logistic regression models to empirically explain the pull request decision. Results: Our study shows that a small number of factors explain the pull request decision, and the most important factor is whether the integrator is the same as or different from the submitter. We also note that some factors are important only in special cases; for example, the percentage of failed builds is important for the pull request decision when continuous integration is used.