FOLD-R is an automated inductive learning algorithm for learning default rules with exceptions from mixed (numerical and categorical) data. It generates an explainable answer set programming (ASP) rule set for classification tasks. We present an improved algorithm, called FOLD-R++, that significantly increases the efficiency and scalability of FOLD-R. FOLD-R++ achieves this without compromising or losing information in the input training data during the encoding or feature selection phase. Its performance is competitive with the widely used XGBoost algorithm, but unlike XGBoost, FOLD-R++ produces an explainable model. Next, we create a powerful toolset by combining FOLD-R++ with s(CASP), a goal-directed ASP execution engine, to make predictions on new data samples using the answer set program generated by FOLD-R++. The s(CASP) system also produces a justification for each prediction. Experiments presented in this paper show that FOLD-R++ significantly improves on the original design and that the s(CASP) system makes predictions efficiently.
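To make "default rules with exceptions" concrete, the following is a minimal sketch of how one such rule might classify mixed data. It is not output from FOLD-R++ or s(CASP): the predicate names (`approve`, `ab1`), the features (`employed`, `age`), and the threshold are all hypothetical and chosen only for illustration. The ASP-style rule each function mirrors is given in its comment.

```python
# Hypothetical illustration of a default rule with one exception, evaluated on
# mixed (numerical and categorical) records. Feature names and the threshold
# are invented for this sketch and are not taken from the paper.

def ab1(x):
    # Exception (abnormality) rule: ab1(X) :- age(X, A), A > 60.
    return x["age"] > 60

def approve(x):
    # Default rule: approve(X) :- employed(X, yes), not ab1(X).
    # The default applies unless the exception holds.
    return x["employed"] == "yes" and not ab1(x)

samples = [
    {"employed": "yes", "age": 35},  # default applies
    {"employed": "yes", "age": 70},  # blocked by the exception
    {"employed": "no", "age": 35},   # default body not satisfied
]
predictions = [approve(s) for s in samples]
print(predictions)
```

The rule reads directly as an explanation: a sample is classified positive because the default condition holds and no exception applies, which is the kind of justification a goal-directed engine can report alongside a prediction.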