We investigate the feasibility of using mixtures of interpretable experts (MoIE) to build interpretable image classifiers on MNIST10. MoIE uses a black-box router to assign each input to one of many inherently interpretable experts, thereby providing insight into why a particular classification decision was made. We find that a naively trained MoIE learns to 'cheat': the black-box router solves the classification problem by itself, with each expert simply learning a constant function for one particular class. We propose to solve this problem by introducing interpretable routers and training the black-box router to match the interpretable router's decisions. In addition, we propose a novel implicit parameterization scheme that allows us to build mixtures of arbitrary numbers of experts, letting us study how classification performance and local and global interpretability vary as the number of experts increases. Our new model, dubbed Implicit Mixture of Interpretable Experts (IMoIE), can match state-of-the-art classification accuracy on MNIST10 while providing local interpretability, and can provide global interpretability albeit at the cost of reduced classification accuracy.
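The core MoIE idea described above can be illustrated with a minimal NumPy sketch, assuming linear classifiers as the inherently interpretable experts and a linear map standing in for the black-box router; all names and shapes here are illustrative, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, n_classes, d = 4, 10, 784  # hypothetical sizes (MNIST-like input)

# Black-box router: a random linear map stands in for a learned MLP.
W_router = rng.normal(size=(d, n_experts))

# Interpretable experts: plain linear classifiers whose weight vectors
# can be inspected directly to explain a decision.
W_experts = rng.normal(size=(n_experts, d, n_classes))

def moie_forward(x):
    """Route x to one expert; return (chosen expert, that expert's class logits)."""
    k = int(np.argmax(x @ W_router))   # hard routing decision
    logits = x @ W_experts[k]          # only the selected expert fires
    return k, logits

x = rng.normal(size=d)
k, logits = moie_forward(x)
```

Because only one interpretable expert produces the prediction, inspecting `W_experts[k]` explains the decision locally; the 'cheating' failure mode the abstract describes occurs when the router's choice of `k` alone encodes the class label.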