We propose a novel training methodology -- Concept Group Learning (CGL) -- that encourages the training of interpretable CNN filters by partitioning the filters in each layer into concept groups, each of which is trained to learn a single visual concept. We achieve this through a novel regularization strategy that forces filters in the same group to be active in similar image regions within a given layer. We additionally use a regularizer to encourage a sparse weighting of the concept groups in each layer, so that a few concept groups can carry greater importance than others. We quantitatively evaluate CGL's model interpretability using standard interpretability evaluation techniques and find that our method increases interpretability scores in most cases. Qualitatively, we compare the image regions that are most active under filters learned with CGL versus filters learned without it, and find that CGL activation regions concentrate more strongly around semantically relevant features.
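The two penalties described above can be sketched in code. The following is a minimal NumPy illustration, not the authors' implementation: the function name, the use of within-group variance of normalized activation maps as the "similar image regions" term, and the L1 term on group weights as the sparsity term are all assumptions for illustration.

```python
import numpy as np

def cgl_regularizer(activations, group_ids, group_weights,
                    lam_sim=1.0, lam_sparse=0.1):
    """Hypothetical sketch of a CGL-style penalty for one layer.

    activations:   (F, H, W) per-filter activation maps for one image
    group_ids:     (F,) concept-group index assigned to each filter
    group_weights: (G,) learned importance weight of each concept group
    """
    # Normalize each filter's map to a spatial distribution so the
    # comparison is about *where* a filter fires, not how strongly.
    maps = activations.reshape(activations.shape[0], -1)
    maps = maps / (maps.sum(axis=1, keepdims=True) + 1e-8)

    # Similarity term: filters in the same concept group should be
    # active in similar image regions, so penalize within-group
    # deviation from the group's mean activation map.
    sim_penalty = 0.0
    for g in np.unique(group_ids):
        member_maps = maps[group_ids == g]
        mean_map = member_maps.mean(axis=0)
        sim_penalty += ((member_maps - mean_map) ** 2).sum()

    # Sparsity term: an L1 penalty on the group weights encourages a
    # few concept groups to dominate in each layer.
    sparse_penalty = np.abs(group_weights).sum()

    return lam_sim * sim_penalty + lam_sparse * sparse_penalty
```

In a training loop, this penalty would simply be added to the task loss; when all filters in a group produce identical spatial maps, the similarity term vanishes and only the sparsity term remains.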