Whilst an abundance of techniques have recently been proposed to generate counterfactual explanations for the predictions of opaque black-box systems, markedly less attention has been paid to exploring the uncertainty of these generated explanations. This becomes a critical issue in high-stakes scenarios, where uncertain and misleading explanations could have dire consequences (e.g., in medical diagnosis and treatment planning). Moreover, it is often difficult to determine whether the generated explanations are well grounded in the training data and sensitive to distributional shifts. This paper proposes several practical solutions to these problems by establishing novel connections with other research in explainability (e.g., trust scores) and uncertainty estimation (e.g., Monte Carlo Dropout). Two experiments demonstrate the utility of our proposed solutions.
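To make the Monte Carlo Dropout connection concrete, the following is a minimal sketch of how stochastic forward passes could score the uncertainty of a generated counterfactual. The `MLP` architecture, the random stand-in counterfactual, and the helper `mc_dropout_uncertainty` are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Toy classifier with a dropout layer that stays active at test time."""

    def __init__(self, in_dim=4, hidden=32, n_classes=2, p_drop=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Dropout(p_drop),  # kept stochastic during MC sampling
            nn.Linear(hidden, n_classes),
        )

    def forward(self, x):
        return self.net(x)

def mc_dropout_uncertainty(model, x, n_samples=100):
    """Run n_samples stochastic forward passes with dropout enabled;
    return the mean softmax probabilities and their standard deviation."""
    model.train()  # enables dropout; a fuller setup would freeze any
                   # BatchNorm layers separately
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
        )
    return probs.mean(dim=0), probs.std(dim=0)

if __name__ == "__main__":
    model = MLP()
    counterfactual = torch.randn(1, 4)  # stand-in for a generated counterfactual
    mean_p, std_p = mc_dropout_uncertainty(model, counterfactual)
    # A high standard deviation on the predicted class would flag the
    # counterfactual explanation as uncertain.
    print("mean probabilities:", mean_p)
    print("per-class uncertainty:", std_p)
```

Under this reading, a counterfactual whose predicted class probability varies widely across dropout samples is one the model is unsure about, and so is a candidate for the kind of misleading explanation the abstract warns against.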