Rule sets are highly interpretable logical models in which the decision predicates are expressed in disjunctive normal form (DNF, OR-of-ANDs); equivalently, the overall model is an unordered collection of if-then decision rules. In this paper, we consider a submodular-optimization-based approach for learning rule sets. The learning problem is framed as a subset selection task in which a subset of all possible rules must be selected to form an accurate and interpretable rule set. We employ an objective function that exhibits submodularity and is thus amenable to submodular optimization techniques. To overcome the difficulty arising from the exponential-sized ground set of rules, the subproblem of searching for a rule is cast as another subset selection task that asks for a subset of features. We show that the induced objective function for the subproblem can be written as a difference of two submodular (DS) functions, which makes it approximately solvable by DS optimization algorithms. Overall, the proposed approach is simple, scalable, and likely to benefit from further research on submodular optimization. Experiments on real datasets demonstrate the effectiveness of our method.
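To make the subset selection framing concrete, the following is a minimal sketch of greedy selection over a small, explicitly enumerated pool of candidate rules. It is not the paper's objective or algorithm: the objective here (positives covered, minus a per-rule false-positive penalty, minus a per-rule cost) is an illustrative stand-in chosen because a coverage term minus a modular penalty is submodular. The names `objective`, `greedy_rule_set`, `fp_weight`, `rule_cost`, and the toy data are all assumptions. The approach described above avoids enumerating rules by searching for each rule through a DS subproblem over features, which this sketch does not attempt.

```python
from typing import Dict, List, Set


def objective(selected: List[str], covers: Dict[str, Set[int]],
              positives: Set[int], fp_weight: float, rule_cost: float) -> float:
    """Illustrative objective (an assumption, not the paper's):
    coverage of positives minus modular penalties, hence submodular."""
    covered: Set[int] = set()
    for rule in selected:
        covered |= covers[rule]
    true_pos = len(covered & positives)
    # Summed per rule, so the penalty is modular in the selected set.
    false_pos = sum(len(covers[rule] - positives) for rule in selected)
    return true_pos - fp_weight * false_pos - rule_cost * len(selected)


def greedy_rule_set(covers: Dict[str, Set[int]], positives: Set[int],
                    max_rules: int, fp_weight: float = 1.0,
                    rule_cost: float = 0.5) -> List[str]:
    """Repeatedly add the rule with the largest positive marginal gain."""
    selected: List[str] = []
    current = objective(selected, covers, positives, fp_weight, rule_cost)
    while len(selected) < max_rules:
        best_rule, best_gain = None, 0.0
        for rule in covers:
            if rule in selected:
                continue
            value = objective(selected + [rule], covers, positives,
                              fp_weight, rule_cost)
            gain = value - current
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None:  # no remaining rule improves the objective
            break
        selected.append(best_rule)
        current += best_gain
    return selected


# Toy data (hypothetical): each rule maps to the example indices it covers.
COVERS = {
    "a": {0, 1},
    "b": {1, 2, 4},
    "c": {2, 3},
    "d": {0, 1, 2, 3, 4, 5},
}
POSITIVES = {0, 1, 2, 3}

chosen = greedy_rule_set(COVERS, POSITIVES, max_rules=3)
print(chosen)
```

On this toy data the greedy loop stops once no remaining rule has a positive marginal gain, so the returned rule set can be smaller than `max_rules`.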