We present CartoonX (Cartoon Explanation), a novel model-agnostic explanation method tailored towards image classifiers and based on the rate-distortion explanation (RDE) framework. Natural images are roughly piece-wise smooth signals -- also called cartoon images -- and tend to be sparse in the wavelet domain. CartoonX is the first explanation method to exploit this by requiring its explanations to be sparse in the wavelet domain, thus extracting the \emph{relevant piece-wise smooth} part of an image instead of relevant pixel-sparse regions. We demonstrate experimentally that CartoonX is not only highly interpretable due to its piece-wise smooth nature but also particularly apt at explaining misclassifications.
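The claim that piece-wise smooth signals tend to be sparse in the wavelet domain can be illustrated with a small numerical sketch. The example below is not the CartoonX method itself. It assumes a hypothetical 1-D "cartoon" signal (smooth pieces separated by two jumps) and a hand-written orthonormal Haar transform, a wavelet choice made here purely for simplicity. It compares how much of the signal energy is captured by the `k` largest coefficients in the wavelet domain versus the `k` largest samples in the pixel domain.

```python
import numpy as np


def haar_dwt(signal):
    """Full orthonormal Haar decomposition of a signal whose length is a power of two."""
    approx = np.asarray(signal, dtype=float)
    details = []
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        # Detail coefficients record local differences at the current scale.
        details.append((even - odd) / np.sqrt(2.0))
        # Approximation coefficients record local averages, passed to the next scale.
        approx = (even + odd) / np.sqrt(2.0)
    # Coarsest approximation first, then details from coarse to fine.
    return np.concatenate([approx] + details[::-1])


def energy_fraction_in_top_k(coeffs, k):
    """Fraction of total energy held by the k largest-magnitude entries."""
    energy = np.sort(np.asarray(coeffs, dtype=float) ** 2)[::-1]
    return energy[:k].sum() / energy.sum()


# Hypothetical 1-D cartoon signal: three smooth (linear) pieces with two jumps.
t = np.linspace(0.0, 1.0, 256, endpoint=False)
signal = np.where(t < 0.3, 1.0 + 0.5 * t,
                  np.where(t < 0.7, 3.0 - t, 0.5 + 0.2 * t))

coeffs = haar_dwt(signal)
k = 16  # illustrative budget of retained coefficients (an assumption)

wavelet_share = energy_fraction_in_top_k(coeffs, k)
pixel_share = energy_fraction_in_top_k(signal, k)

print(f"energy in top {k} wavelet coefficients: {wavelet_share:.4f}")
print(f"energy in top {k} pixel samples:        {pixel_share:.4f}")
```

Because the transform is orthonormal, the total energy is the same in both domains, so the two fractions are directly comparable. For a signal of this kind, a handful of wavelet coefficients should account for most of the energy, whereas the same number of individual samples accounts for only a small share, which is the intuition behind asking for explanations that are sparse in the wavelet domain rather than in pixel space.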