Enhancing the modular structure of existing systems has attracted substantial research interest, focusing on two main methods: (1) software modularization and (2) identifying design issues (e.g., smells) as refactoring opportunities. However, re-modularization solutions often require extensive modifications to the original modules, and the design issues identified are generally too coarse to guide refactoring strategies. Combining the above two methods, this paper introduces a novel concept, PairSmell, which exploits modularization to pinpoint design issues necessitating refactoring. We concentrate on a granular but fundamental aspect of modularity principles -- modular relation (MR), i.e., whether a pair of entities are separated or collocated. The main assumption is that, if the actual MR of a pair violates its `apt MR', i.e., an MR agreed on by multiple modularization tools (as raters), it can be deemed likely a flawed architectural decision that necessitates further examination. To quantify and evaluate PairSmell, we conduct an empirical study on 20 C/C++ and Java projects, using 4 established modularization tools to identify two forms of PairSmell: inapt separated pairs InSep and inapt collocated pairs InCol. Our study on 260,003 instances reveals that their architectural impacts are substantial: (1) on average, 14.60% and 20.44% of software entities are involved in InSep and InCol MRs respectively; (2) InSep pairs are associated with 190% more co-changes than properly separated pairs, while InCol pairs are associated with 35% fewer co-changes than properly collocated pairs, both indicating a successful identification of modular structures detrimental to software quality; and (3) both forms of PairSmell persist across software evolution.
翻译:暂无翻译