Predictions made by deep learning models are vulnerable to data perturbations, adversarial attacks, and out-of-distribution inputs. To build a trusted AI system, it is therefore critical to quantify prediction uncertainties accurately. While current efforts focus on improving the accuracy and efficiency of uncertainty quantification, there is also a need to identify the sources of uncertainty and take action to mitigate their effects on predictions. We therefore propose to develop explainable and actionable Bayesian deep learning methods that not only perform accurate uncertainty quantification but also explain the uncertainties, identify their sources, and suggest strategies to mitigate their impact. Specifically, we introduce a gradient-based uncertainty attribution method that identifies the most problematic regions of the input contributing to the prediction uncertainty. Compared to existing methods, the proposed UA-Backprop offers competitive accuracy, relaxed assumptions, and high efficiency. Moreover, we propose an uncertainty mitigation strategy that uses the attribution results as attention to further improve model performance. Both qualitative and quantitative evaluations demonstrate the effectiveness of the proposed methods.
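To make the idea of gradient-based uncertainty attribution concrete, the sketch below shows one minimal way to backpropagate an uncertainty measure to the input. It is an illustrative stand-in, not the paper's UA-Backprop algorithm: it estimates predictive entropy with MC dropout and uses the absolute input gradient of that entropy as a per-pixel attribution map. The function name `uncertainty_attribution` and the choice of MC dropout and plain input gradients are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def uncertainty_attribution(model, x, n_samples=20):
    """Attribute predictive uncertainty to input pixels (illustrative sketch).

    Estimates predictive entropy with MC dropout, then backpropagates the
    entropy to the input; the gradient magnitude marks the regions that
    contribute most to the uncertainty. Not the paper's UA-Backprop rule.
    """
    model.train()                       # keep dropout active for MC sampling
    x = x.clone().requires_grad_(True)  # track gradients w.r.t. the input

    # Monte Carlo estimate of the predictive distribution p(y | x).
    probs = torch.stack(
        [F.softmax(model(x), dim=-1) for _ in range(n_samples)]
    )
    mean_probs = probs.mean(dim=0)

    # Predictive entropy H[p(y | x)] as the uncertainty measure.
    entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)

    # Gradient of the uncertainty w.r.t. the input; its absolute value
    # serves as the per-pixel attribution map.
    grad, = torch.autograd.grad(entropy.sum(), x)
    return grad.abs()
```

A map produced this way (e.g. `amap = uncertainty_attribution(net, images)` for a dropout-equipped classifier `net`) could then serve as the attention signal in the mitigation strategy described above, down-weighting the high-uncertainty regions during refinement.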