Contemporary predictive models are hard to interpret, as their deep networks exploit numerous complex relations between input elements. This work proposes a theoretical framework for model interpretability that measures the contribution of relevant features to the functional entropy of the network with respect to the input. We rely on the log-Sobolev inequality, which bounds the functional entropy by the functional Fisher information with respect to the covariance of the data. This provides a principled way to quantify how much information a subset of features contributes to the decision function. Through extensive experiments, we show that our method surpasses existing sampling-based interpretability methods on various data signals, including images, text, and audio.
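As background for the bound invoked above, the following is one standard statement of the quantities involved; the notation here is our own illustration and may differ from the paper's. For a non-negative function $f$ and a Gaussian measure $\mu$ with covariance $\Sigma$, the functional entropy and the covariance-weighted functional Fisher information are

$$\mathrm{ent}_\mu(f) = \int f \log f \, d\mu - \Big(\int f \, d\mu\Big) \log\Big(\int f \, d\mu\Big), \qquad I_\mu(f) = \int \frac{\langle \Sigma \nabla f, \nabla f \rangle}{f} \, d\mu,$$

and the log-Sobolev inequality for Gaussian measures gives $\mathrm{ent}_\mu(f) \le \tfrac{1}{2} I_\mu(f)$. Restricting the gradient in $I_\mu(f)$ to a subset of input coordinates then yields an information contribution for that subset of features.

To make the estimation concrete, below is a minimal Monte-Carlo sketch of such a restricted Fisher-information score, not the authors' released implementation. It assumes `model` returns a positive scalar (e.g. the softmax probability of the predicted class) and approximates the data measure by an isotropic Gaussian $\mathcal{N}(x, \sigma^2 I)$ centred at the input; the names `fisher_information_attribution`, `mask`, `sigma`, and `n_samples` are hypothetical.

```python
# A minimal Monte-Carlo sketch (not the authors' released code) of a
# functional Fisher information score restricted to a feature subset S.
import torch

def fisher_information_attribution(model, x, mask, sigma=0.1, n_samples=64):
    """Estimate \\int <Sigma grad f, grad f> / f dmu over the features
    selected by `mask` (1 = feature in the subset S).

    model     : callable mapping a (1, ...) tensor to a positive scalar
    x         : input tensor
    mask      : 0/1 tensor with the same shape as x selecting the subset
    sigma     : std of the Gaussian measure mu centred at x (assumption)
    n_samples : number of Monte-Carlo samples drawn from mu
    """
    total = 0.0
    for _ in range(n_samples):
        # Sample z ~ N(x, sigma^2 I) and track gradients w.r.t. z.
        z = (x + sigma * torch.randn_like(x)).requires_grad_(True)
        f = model(z.unsqueeze(0)).squeeze()      # positive scalar f(z)
        (grad,) = torch.autograd.grad(f, z)      # nabla f(z)
        # Restrict the quadratic form to coordinates in S; with the
        # isotropic covariance Sigma = sigma^2 I it is a masked norm.
        total += (sigma ** 2) * ((grad * mask) ** 2).sum() / f.detach()
    return total.item() / n_samples
```

Called as, e.g., `fisher_information_attribution(model, image, patch_mask)`, the score ranks feature subsets by how much of the functional Fisher information, and hence, via the bound, of the functional entropy, they account for.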