Multi-label classification is becoming increasingly ubiquitous, but little attention has been paid to its interpretability. In this paper, we develop a multi-label classifier that can be represented as a concise set of simple "if-then" rules, and thus offers better interpretability than black-box models. Notably, our method finds a small set of relevant patterns that lead to accurate multi-label classification, whereas existing rule-based classifiers are myopic and wasteful in their search for rules, requiring a large number of rules to achieve high accuracy. In particular, we formulate the problem of choosing multi-label rules so as to maximize a target function that accounts not only for discrimination ability with respect to labels, but also for diversity. Accounting for diversity helps to avoid redundancy and thus to control the number of rules in the solution set. To tackle this maximization problem, we propose a 2-approximation algorithm, which relies on a novel technique for sampling high-quality rules. In addition to our theoretical analysis, we provide a thorough experimental evaluation, which indicates that our approach offers a trade-off between predictive performance and interpretability that is unmatched in previous work.
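To make the idea of trading discrimination ability against diversity concrete, the following is a minimal illustrative sketch and not the algorithm proposed in the paper. It assumes a hypothetical `Rule` record with a precomputed `quality` score and a set of `covered` example indices, measures diversity as the Jaccard distance between coverage sets, and selects rules greedily with a hypothetical weight `lam`. The abstract specifies neither the objective nor the sampling technique, so every name and modelling choice here is an assumption, and no approximation guarantee is claimed for this sketch.

```python
from typing import FrozenSet, List, NamedTuple


class Rule(NamedTuple):
    """Hypothetical rule record; the fields are assumptions for illustration."""
    name: str
    covered: FrozenSet[int]  # indices of training examples the rule matches
    quality: float           # assumed precomputed discrimination score


def jaccard_distance(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Return 1 - |a & b| / |a | b|, or 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def greedy_select(candidates: List[Rule], k: int, lam: float = 1.0) -> List[Rule]:
    """Greedily pick up to k rules, scoring each candidate by its quality
    plus lam times its summed Jaccard distance to the rules already chosen."""
    selected: List[Rule] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def gain(rule: Rule) -> float:
            diversity = sum(
                jaccard_distance(rule.covered, s.covered) for s in selected
            )
            return rule.quality + lam * diversity

        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: r2 is nearly as good as r1 but covers exactly the same examples.
rules = [
    Rule("r1", frozenset({0, 1, 2}), 0.90),
    Rule("r2", frozenset({0, 1, 2}), 0.85),
    Rule("r3", frozenset({3, 4}), 0.50),
]
with_diversity = [r.name for r in greedy_select(rules, k=2, lam=1.0)]
quality_only = [r.name for r in greedy_select(rules, k=2, lam=0.0)]
```

In this toy example, the diversity term makes the selection skip the redundant rule `r2` in favour of `r3`, which covers different examples; with `lam=0.0` the selection falls back to ranking by quality alone.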