An effective way to obtain different perspectives on a topic is to hold a debate, in which participants argue for and against it. Here, we propose a novel debate framework for understanding and explaining a continuous image classifier's reasoning behind a particular prediction, modeling that reasoning as a multiplayer sequential zero-sum debate game. The contrastive nature of our framework encourages players to put forward diverse arguments during the debates, picking up the reasoning trails missed by their opponents and highlighting any uncertainties in the classifier. Specifically, in our proposed setup, players propose arguments, drawn from the classifier's discretized latent knowledge, to support or oppose the classifier's decision. The resulting Visual Debates collect supporting and opposing features from the discretized latent space of the classifier, serving as explanations for the classifier's internal reasoning towards its predictions. We demonstrate and evaluate (a practical realization of) our Visual Debates on the geometric SHAPE and MNIST datasets and on the high-resolution animal faces (AFHQ) dataset, using standard evaluation metrics for explanations (i.e., faithfulness and completeness) and novel, bespoke metrics for visual debates as explanations (consensus and split ratio).