Pull-based development has enabled numerous volunteers to contribute to open-source projects with fewer barriers. Nevertheless, a considerable number of pull requests (PRs) with valid contributions are abandoned by their contributors, wasting the effort and time invested by both contributors and maintainers. To gain a more comprehensive understanding of the underlying dynamics of contributor-abandoned PRs, we conduct a mixed-methods study using both quantitative and qualitative methods. We curate a dataset of 265,325 PRs, including 4,450 abandoned ones, from ten popular and mature GitHub projects and measure 16 features characterizing PRs, contributors, review processes, and projects. Using statistical and machine learning techniques, we observe that complex PRs, novice contributors, and lengthy reviews have a higher probability of abandonment, and that the rate of PR abandonment fluctuates alongside the projects' maturity or workload. To identify why contributors abandon their PRs, we also manually examine a random sample of 354 abandoned PRs. We find that the most frequent abandonment reasons are related to the obstacles faced by contributors, followed by the hurdles imposed by maintainers during the review process. Finally, we survey the top core maintainers of the study projects to understand their perspectives on dealing with PR abandonment and on our findings.