The Pawlak rough set and the neighborhood rough set are the two most common rough set models. The Pawlak rough set represents knowledge with equivalence classes but cannot process continuous data; the neighborhood rough set can process continuous data but loses the ability to represent knowledge with equivalence classes. To address this, this paper presents a granular-ball rough set based on granular-ball computing. The granular-ball rough set can represent both the Pawlak rough set and the neighborhood rough set, thereby unifying the two. As a result, it can both handle continuous data and use equivalence classes for knowledge representation. In addition, we propose an implementation algorithm for the granular-ball rough set. Experimental results on benchmark datasets demonstrate that, owing to the robustness and adaptability of granular-ball computing, the learning accuracy of the granular-ball rough set is greatly improved compared with the Pawlak rough set and the traditional neighborhood rough set. The granular-ball rough set also outperforms nine popular or state-of-the-art feature selection methods.
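To make the contrast between the two classical models concrete, the following is a minimal sketch of their lower approximations. It is not the granular-ball algorithm proposed in the paper; the function names `pawlak_lower` and `neighborhood_lower`, the Euclidean distance, and the `radius` parameter are illustrative assumptions.

```python
from collections import defaultdict
import math


def pawlak_lower(samples, labels, target):
    """Lower approximation of the `target` class under an equivalence relation.

    Assumption: two samples are equivalent when all attribute values are
    identical, which only makes sense for discrete data.
    """
    classes = defaultdict(list)
    for i, x in enumerate(samples):
        classes[tuple(x)].append(i)  # identical values -> same equivalence class
    lower = set()
    for members in classes.values():
        # Keep an equivalence class only if every member has the target label.
        if all(labels[i] == target for i in members):
            lower.update(members)
    return lower


def neighborhood_lower(samples, labels, target, radius):
    """Lower approximation of the `target` class under a neighborhood relation.

    Assumption: the neighborhood of a sample is every sample within
    Euclidean distance `radius`, so continuous attributes are allowed.
    """
    lower = set()
    for i, x in enumerate(samples):
        neighbors = [j for j, y in enumerate(samples) if math.dist(x, y) <= radius]
        # Keep a sample only if its whole neighborhood has the target label.
        if all(labels[j] == target for j in neighbors):
            lower.add(i)
    return lower


# Hypothetical toy data, used only for illustration.
discrete = [(0, 1), (0, 1), (1, 0), (1, 0)]
continuous = [(0.0,), (0.1,), (0.5,), (0.55,)]
labels = ["a", "a", "a", "b"]

print(pawlak_lower(discrete, labels, "a"))
print(neighborhood_lower(continuous, labels, "a", radius=0.2))
```

In the discrete case, samples are grouped into reusable equivalence classes; in the continuous case, each sample has its own overlapping neighborhood, which is the loss of equivalence-class structure that the abstract refers to.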