A hybrid model involves the cooperation of an interpretable model and a complex black box. At inference, any input of the hybrid model is assigned to either its interpretable or complex component based on a gating mechanism. The advantages of such models over classical ones are two-fold: 1) they grant users precise control over the level of transparency of the system, and 2) they can potentially perform better than a standalone black box, since redirecting some of the inputs to an interpretable model implicitly acts as regularization. Still, despite their high potential, hybrid models remain under-studied in the interpretability/explainability literature. In this paper, we address this gap by presenting a thorough investigation of such models from three perspectives: Theory, Taxonomy, and Methods. First, we explore the theory behind the generalization of hybrid models from the Probably-Approximately-Correct (PAC) perspective. A consequence of our PAC guarantee is the existence of a sweet spot for the optimal transparency of the system: when this sweet spot is attained, a hybrid model can potentially perform better than a standalone black box. Second, we provide a general taxonomy of the different ways of training hybrid models: the Post-Black-Box and Pre-Black-Box paradigms, which differ in the order in which the interpretable and complex components are trained. We show where the state-of-the-art hybrid models Hybrid-Rule-Set and Companion-Rule-List fall within this taxonomy. Third, we implement the two paradigms in a single method, HybridCORELS, which extends the CORELS algorithm to hybrid modeling. By leveraging CORELS, HybridCORELS provides a certificate of optimality of its interpretable component and precise control over transparency. Finally, we show empirically that HybridCORELS is competitive with existing hybrid models and performs just as well as a standalone black box (or even better) while being partly transparent.
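To make the gating idea concrete, the following minimal sketch (not the paper's implementation; the rules, features, and black box are purely illustrative) shows how a hybrid model routes each input either to a short interpretable rule list or to a black box at inference time, and how the resulting transparency can be measured as the fraction of inputs handled by the interpretable component.

```python
def rule_list_predict(x):
    """Toy interpretable component: a short rule list (illustrative rules only).
    Returns (prediction, covered); covered=False means no rule fires."""
    if x["age"] < 25 and x["priors"] == 0:
        return 0, True
    if x["priors"] > 3:
        return 1, True
    return None, False  # abstain: defer to the black box

def black_box_predict(x):
    """Stand-in for any complex model (e.g., a boosted ensemble)."""
    score = 0.8 * (x["priors"] > 1) + 0.2 * (x["age"] < 30)
    return int(score >= 0.5)

def hybrid_predict(x):
    """Gating mechanism: use the interpretable prediction when a rule covers x,
    otherwise fall back to the black box."""
    y, covered = rule_list_predict(x)
    return (y, "interpretable") if covered else (black_box_predict(x), "black-box")

# Transparency = fraction of inputs handled by the interpretable component.
data = [{"age": 22, "priors": 0}, {"age": 40, "priors": 5}, {"age": 30, "priors": 2}]
preds = [hybrid_predict(x) for x in data]
transparency = sum(src == "interpretable" for _, src in preds) / len(data)
print(preds, f"transparency={transparency:.2f}")
```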