Software project management makes extensive use of predictive modeling to estimate product size, defect proneness and development effort. Although uncertainty is acknowledged in these tasks, fuzzy inference systems, which are designed to cope well with uncertainty, have received only limited attention in the software engineering domain. In this study we empirically investigate the impact of two choices on the predictive accuracy of generated fuzzy inference systems when applied to a software engineering data set: the sampling of observations for training and testing, and the size of the rule set generated using fuzzy c-means clustering. Across ten samples, we found no consistent pattern of predictive performance for a given rule set size. We did find, however, that a rule set compiled from multiple samples generally resulted in more accurate predictions than rule sets derived from a single sample. More generally, the results provide further evidence of the sensitivity of empirical analysis outcomes to specific model-building decisions.
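For readers unfamiliar with the clustering step named above, the following is a minimal sketch of fuzzy c-means in Python with NumPy. It is not the authors' implementation: the function name `fuzzy_c_means`, the fuzzifier `m = 2.0`, the stopping tolerance and the random initialisation are all assumptions chosen for illustration. It also assumes, as is common when building a fuzzy inference system from clusters, that each cluster center would seed one rule, so `n_clusters` stands in for the rule set size.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, max_iter=200, tol=1e-6, seed=0):
    """Illustrative fuzzy c-means; returns (centers, memberships).

    X          : array of shape (n_samples, n_features)
    n_clusters : number of clusters (assumed here to equal the rule set size)
    m          : fuzzifier, must be > 1 (2.0 is an assumed default)
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]

    # Random initial membership matrix; each row is normalised to sum to 1.
    U = rng.random((n_samples, n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(max_iter):
        # Cluster centers are membership-weighted means of the observations.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]

        # Euclidean distance from every observation to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero

        # Standard membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return centers, U
```

Under these assumptions, calling `fuzzy_c_means(X_train, n_clusters=k)` on different training samples and for different values of `k` would yield the candidate cluster centers from which rule sets of varying size could be derived, which is the kind of variation the study examines.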