We present a new approach to classification that combines data and knowledge. Data mining is first used to derive association rules, possibly with negations, from the data. These rules are then leveraged to increase the predictive performance of the tree-based models (decision trees and random forests) used for the classification task. They are also used to improve the corresponding explanation task, by generating abductive explanations that are more general than those derivable without taking such rules into account. Experiments show that, for both tree-based models considered, the approach can offer benefits in terms of predictive performance and explanation size.
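To make the claim about more general abductive explanations concrete, here is a minimal sketch on a toy problem. It is not the authors' algorithm. The three Boolean features, the hand-written tree, and the rule "a implies b" are all hypothetical, and the search is a brute-force enumeration that only works for a handful of features. An abductive explanation is taken here to be a smallest set of feature values of the instance that forces the prediction on every completion. When a rule is supplied, only completions that satisfy the rule are considered.

```python
from itertools import combinations, product

# Hypothetical Boolean features for this toy example.
FEATURES = ("a", "b", "c")


def tree(x):
    # Hypothetical toy decision tree: predicts 1 iff a and b are both true.
    if x["a"]:
        return 1 if x["b"] else 0
    return 0


def rule_a_implies_b(x):
    # Hypothetical association rule, assumed to have been mined from data.
    return (not x["a"]) or x["b"]


def is_sufficient(subset, instance, rules):
    # True if every rule-consistent assignment that agrees with the
    # instance on `subset` receives the same prediction as the instance.
    target = tree(instance)
    for values in product([False, True], repeat=len(FEATURES)):
        candidate = dict(zip(FEATURES, values))
        if any(candidate[f] != instance[f] for f in subset):
            continue  # disagrees with the instance on a fixed feature
        if not all(rule(candidate) for rule in rules):
            continue  # excluded by the background rules
        if tree(candidate) != target:
            return False
    return True


def smallest_explanation(instance, rules=()):
    # Try subsets by increasing size, so the first hit has minimum size
    # (and is therefore also subset-minimal).
    for size in range(len(FEATURES) + 1):
        for subset in combinations(FEATURES, size):
            if is_sufficient(subset, instance, rules):
                return set(subset)


instance = {"a": True, "b": True, "c": False}
without_rules = smallest_explanation(instance)
with_rules = smallest_explanation(instance, rules=[rule_a_implies_b])
print(without_rules, with_rules)
```

Without the rule, both `a` and `b` are needed to guarantee the prediction. With the rule, fixing `a` alone suffices, because every rule-consistent instance with `a` true also has `b` true. The explanation is shorter and covers more instances, which is the sense in which it is more general.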