As one of the best-known programming Q&A websites, Stack Overflow (SO) serves tens of thousands of developers every day. Previous work has shown that many developers reuse code snippets from SO when they find an answer that functionally matches a programming problem they encounter during development. To study how programmers reuse code from SO during project development, we conduct a comprehensive empirical study. First, to capture programmers' development activities, we collect 342,148 modified code snippets from commits in 793 open-source Java projects; this modified code reflects the programming problems encountered during development. We also collect the code snippets from 1,355,617 posts on SO. We then employ CCFinder to detect code clones between the modified code from commits and the code from SO, and further analyze code reuse when programmers solve programming problems during development. We compute the code reuse ratio of the modified code snippets in the commits of each project in different years. The results show that the average code reuse ratio is 6.32% and the maximum is 8.38%. The code reuse ratio in project commits has increased year by year, and the proportion of code reuse in newly established projects is higher than in old projects. We also find that some projects reuse code snippets from many years ago. Additionally, experienced developers seem more likely to reuse knowledge from SO. Moreover, the code reuse ratio in bug-related commits (6.67%) is slightly higher than that in non-bug-related commits (6.59%). Finally, the code reuse ratio in Java class files that have undergone multiple modifications (14.44%) is more than double the overall code reuse ratio (6.32%).
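To make the per-project, per-year code reuse ratio concrete, the following is a minimal sketch of one plausible way to compute it. It assumes the ratio is the fraction of modified snippets that a clone detector flagged as matching SO code; the abstract does not state the exact definition, and the names `ModifiedSnippet`, `is_clone_of_so`, and `reuse_ratios` are hypothetical, not taken from the study.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ModifiedSnippet:
    """One modified code snippet taken from a commit (hypothetical record)."""
    project: str
    year: int
    # Assumption: True if a clone detector matched this snippet to SO code.
    is_clone_of_so: bool


def reuse_ratios(snippets):
    """Return {(project, year): reused / total} over the given snippets."""
    totals = defaultdict(int)
    reused = defaultdict(int)
    for s in snippets:
        key = (s.project, s.year)
        totals[key] += 1
        if s.is_clone_of_so:
            reused[key] += 1
    # Every key in totals has at least one snippet, so no division by zero.
    return {key: reused[key] / totals[key] for key in totals}
```

Under this assumed definition, averaging the returned values across projects and years would give a figure comparable in kind to the reported average, although the study may weight or aggregate differently.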