Both uncertainty estimation and interpretability are important factors for trustworthy machine learning systems. However, there is little work at the intersection of these two areas. We address this gap by proposing a novel method for interpreting uncertainty estimates from differentiable probabilistic models, like Bayesian Neural Networks (BNNs). Our method, Counterfactual Latent Uncertainty Explanations (CLUE), indicates how to change an input, while keeping it on the data manifold, such that a BNN becomes more confident about the input's prediction. We validate CLUE through 1) a novel framework for evaluating counterfactual explanations of uncertainty, 2) a series of ablation experiments, and 3) a user study. Our experiments show that CLUE outperforms baselines and enables practitioners to better understand which input patterns are responsible for predictive uncertainty.
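As a rough illustration of the idea described above, a CLUE-style counterfactual can be framed as a search over the latent space of an auxiliary generative model: decoding keeps candidate inputs on the data manifold while the BNN's predictive uncertainty is driven down. The symbols below (decoder $g$, predictive entropy $\mathcal{H}$, distance $d$, trade-off weight $\lambda$) are illustrative assumptions rather than the paper's exact notation:

\[
z_{\text{CLUE}} = \arg\min_{z}\; \mathcal{H}\big(y \mid g(z)\big) + \lambda\, d\big(g(z), x_0\big),
\qquad
x_{\text{CLUE}} = g(z_{\text{CLUE}}),
\]

where $x_0$ is the original, uncertain input. The difference between $x_{\text{CLUE}}$ and $x_0$ then indicates which input features are responsible for the BNN's uncertainty.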