Machine-learned models must often abide by certain requirements (e.g., fairness or legal constraints). This has spurred interest in developing approaches that can provably verify whether a model satisfies certain properties. This paper introduces a generic algorithm called Veritas that enables tackling multiple different verification tasks for tree ensemble models such as random forests (RFs) and gradient boosting decision trees (GBDTs). This generality contrasts with previous work, which has focused exclusively on either adversarial example generation or robustness checking. Veritas formulates the verification task as a generic optimization problem and introduces a novel search space representation. Veritas offers two key advantages. First, it provides anytime lower and upper bounds when the optimization problem cannot be solved exactly. In contrast, many existing methods focus on exact solutions and are thus limited by the verification problem being NP-complete. Second, Veritas produces full (bounded suboptimal) solutions that can be used to generate concrete examples. We experimentally show that Veritas outperforms the previous state of the art by (a) generating exact solutions more frequently, (b) producing tighter bounds when (a) is not possible, and (c) offering orders-of-magnitude speedups. Consequently, Veritas enables tackling more and larger real-world verification scenarios.
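To make the "verification as optimization" framing concrete, the sketch below poses a toy robustness query: does the output of a small additive tree ensemble stay below some value for every input in an L-infinity box around a point x? This is only an illustrative, hypothetical example; the tiny hand-written trees and the brute-force grid "solver" are stand-ins, not the Veritas algorithm, which instead searches over combinations of leaves using its dedicated search-space representation and reports anytime lower and upper bounds.

```python
# Illustrative sketch (NOT the Veritas algorithm): shows the kind of
# optimization problem referred to in the abstract. A robustness query asks
# whether the ensemble output can exceed a threshold anywhere in an
# L-infinity box around x; we answer it here by brute-force maximization.
from itertools import product

# A tiny hand-written "ensemble": internal nodes are
# (feature, threshold, left, right) tuples; leaves are plain floats.
# Leaf values are summed, as in a GBDT.
TREES = [
    (0, 0.5, -1.0, (1, 2.0, 0.3, 1.2)),   # split on x0 < 0.5, then x1 < 2.0
    (1, 1.0, (0, 0.2, -0.4, 0.6), 0.9),   # split on x1 < 1.0, then x0 < 0.2
]

def eval_tree(node, x):
    """Evaluate one tree on input x (a list of feature values)."""
    while not isinstance(node, float):
        feat, thresh, left, right = node
        node = left if x[feat] < thresh else right
    return node

def ensemble_output(x):
    """Additive ensemble output: sum of the reached leaf values."""
    return sum(eval_tree(tree, x) for tree in TREES)

def max_output_in_box(center, eps, steps=25):
    """Hypothetical brute-force 'solver': grid-search the box
    [center - eps, center + eps] for the maximal ensemble output.
    Veritas instead searches the discrete space of leaf combinations,
    yielding exact answers or anytime lower/upper bounds."""
    grids = [[c - eps + 2 * eps * i / (steps - 1) for i in range(steps)]
             for c in center]
    return max(ensemble_output(list(point)) for point in product(*grids))

if __name__ == "__main__":
    x = [0.4, 1.5]
    print("output at x:      ", ensemble_output(x))
    print("max output in box:", max_output_in_box(x, eps=0.3))
```

A maximizing input found this way plays the role of a concrete example (e.g., an adversarial example); the abstract's point is that Veritas recovers such bounded-suboptimal witnesses, together with bounds on how far they can be from optimal, without exhaustive enumeration.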