Software systems are designed according to guidelines and constraints defined by business rules. Some of these constraints define the allowable or required values for data handled by the systems. These data constraints usually originate from the problem domain (e.g., regulations), and developers must write code that enforces them. Understanding how data constraints are implemented is essential for testing, debugging, and software change. Unfortunately, there are no widely accepted guidelines or best practices on how to implement data constraints. This paper presents an empirical study that investigates how data constraints are implemented in Java. We study the implementation of 187 data constraints extracted from the documentation of eight real-world Java software systems. First, we perform a qualitative analysis of the textual descriptions of the data constraints and identify four data constraint types. Second, we manually identify the implementations of these data constraints and find that they can be grouped into 30 implementation patterns. The analysis of these implementation patterns indicates that developers prefer a handful of patterns when implementing data constraints, and that deviations from these patterns are associated with unusual implementation decisions or code smells. Third, we develop a tool-assisted protocol that allows us to identify 256 additional trace links for the data constraints implemented using the 13 most common patterns. We find that almost half of these data constraints have multiple enforcing statements, which are code clones of different types.
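To make the notion of a data constraint with multiple enforcing statements concrete, the following is a minimal illustrative sketch. It is not taken from the studied systems: the constraint ("a port number must be between 1 and 65535"), the class `PortConstraintExample`, and its method names are all hypothetical, chosen only to show how one documented constraint might end up enforced by two near-duplicate statements.

```java
// Hypothetical example: one documented data constraint
// ("a port number must be between 1 and 65535") enforced in two places.
public class PortConstraintExample {

    // Assumed bounds for the illustrative constraint.
    public static final int MIN_PORT = 1;
    public static final int MAX_PORT = 65535;

    // First enforcing statement: a boolean range check.
    public static boolean isValidPort(int port) {
        return port >= MIN_PORT && port <= MAX_PORT;
    }

    // Second enforcing statement: the same range check, negated and
    // written inline, so it is a near-clone of the check above.
    public static int parsePort(String text) {
        int port = Integer.parseInt(text.trim());
        if (port < MIN_PORT || port > MAX_PORT) {
            throw new IllegalArgumentException(
                "port must be between " + MIN_PORT + " and " + MAX_PORT);
        }
        return port;
    }

    public static void main(String[] args) {
        System.out.println(isValidPort(8080));   // within the range
        System.out.println(isValidPort(0));      // below the range
        System.out.println(parsePort(" 443 "));  // parsed and accepted
    }
}
```

In a sketch like this, tracing the constraint to code means linking it to both the `return` expression in `isValidPort` and the `if` condition in `parsePort`. A change to the documented range would have to be applied consistently to both.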