Code smells indicate software design problems that harm software quality. Data-intensive systems that frequently access databases often suffer from SQL code smells in addition to traditional code smells. Traditional code smells have been studied extensively, and interest in SQL code smells has grown recently. In this paper, we conduct an empirical study of the prevalence and evolution of SQL code smells in open-source, data-intensive systems. We collected 150 projects and examined both traditional and SQL code smells in them. Our investigation delivers several important findings. First, SQL code smells are indeed prevalent in data-intensive software systems. Second, SQL code smells co-occur only weakly with traditional code smells. Third, SQL code smells are more weakly associated with bugs than traditional code smells are. Fourth, compared to traditional code smells, SQL code smells are more likely to be introduced at the beginning of a project's lifetime and to remain in the code without being fixed. Overall, our results show that SQL code smells are prevalent and persistent in the studied data-intensive software systems. Developers should be aware of these smells and consider detecting and refactoring SQL code smells and traditional code smells separately, using dedicated tools.