Model Learning with Personalized Interpretability Estimation (ML-PIE)

High-stakes applications require AI-generated models to be interpretable. Current algorithms for synthesizing potentially interpretable models rely on objectives or regularization terms that represent interpretability only coarsely (e.g., model size) and are not designed for a specific user. Yet interpretability is intrinsically subjective. In this paper, we propose an approach that tailors models to the user by letting the user steer the model synthesis process according to their preferences. We use a bi-objective evolutionary algorithm to synthesize models with trade-offs between accuracy and a user-specific notion of interpretability. The latter is estimated by a neural network that is trained concurrently with the evolution on feedback from the user, which is collected using uncertainty-based active learning. To maximize usability, the user is only asked to say, given two models at a time, which one is less complex. In experiments on two real-world datasets involving 61 participants, we find that our approach is capable of learning estimations of interpretability that can be very different for different users. Moreover, users tend to prefer models found using the proposed approach over models found using non-personalized interpretability indices.
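The feedback loop described above can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: a linear scorer stands in for the neural network, each candidate model is reduced to two made-up features (size and operator count), the pairwise answers are fitted with a Bradley-Terry style logistic loss, and the user is simulated by a hypothetical cost function. Uncertainty-based selection is approximated by asking about the pair whose predicted answer is closest to 50/50, and `pareto_front` shows how the accuracy/interpretability trade-off set could be extracted once both objectives are available.

```python
import math
import random


def score(weights, features):
    """Linear stand-in for the interpretability estimator (assumption: the paper uses a neural network)."""
    return sum(w * x for w, x in zip(weights, features))


def prob_first_simpler(weights, a, b):
    """Bradley-Terry style probability that model `a` is judged less complex than model `b`."""
    return 1.0 / (1.0 + math.exp(-(score(weights, a) - score(weights, b))))


def most_uncertain_pair(weights, candidates, asked):
    """Pick the not-yet-asked pair whose predicted answer is closest to 0.5."""
    pairs = [(i, j) for i in range(len(candidates))
             for j in range(i + 1, len(candidates)) if (i, j) not in asked]
    return min(pairs, key=lambda p: abs(
        prob_first_simpler(weights, candidates[p[0]], candidates[p[1]]) - 0.5))


def update(weights, a, b, a_is_simpler, lr=0.1):
    """One gradient step on the logistic loss for a single pairwise answer."""
    error = (1.0 if a_is_simpler else 0.0) - prob_first_simpler(weights, a, b)
    return [w + lr * error * (xa - xb) for w, xa, xb in zip(weights, a, b)]


def simulated_user(a, b):
    """Hypothetical user who finds operator count three times as costly as size."""
    return a[0] + 3.0 * a[1] < b[0] + 3.0 * b[1]


def pareto_front(points):
    """Keep (error, complexity) points that no other point dominates; both are minimized."""
    def dominates(p, q):
        return p[0] <= q[0] and p[1] <= q[1] and p != q
    return [q for q in points if not any(dominates(p, q) for p in points)]


random.seed(0)
# Made-up candidate models described by (size, operator count), both scaled to [0, 1].
candidates = [(random.random(), random.random()) for _ in range(30)]
weights = [0.0, 0.0]
asked = set()
for _ in range(200):
    i, j = most_uncertain_pair(weights, candidates, asked)
    asked.add((i, j))
    answer = simulated_user(candidates[i], candidates[j])
    weights = update(weights, candidates[i], candidates[j], answer)
```

In this sketch a higher score means "judged simpler", so the learned weights should come out negative along the direction the simulated user penalizes. In a full system the estimator's output would serve as the second objective next to prediction error, and the queries would be drawn from the evolving population rather than from a fixed pool.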
