How to aggregate information from multiple instances is a key question in multiple instance learning. Prior neural models implement variants of the well-known encoder-decoder strategy, in which all input features are encoded into a single, high-dimensional embedding that is then decoded to generate an output. In this work, inspired by Choquet capacities, we propose Capacity networks. Unlike encoder-decoders, Capacity networks generate multiple interpretable intermediate results, which can be aggregated in a semantically meaningful space to obtain the final output. Our experiments show that implementing this simple inductive bias leads to improvements over different encoder-decoder architectures across a wide range of tasks. Moreover, the interpretable intermediate results make Capacity networks interpretable by design, enabling a semantically meaningful inspection, evaluation, and regularization of the network internals.
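The architectural contrast described above can be illustrated with a minimal sketch. This is not the paper's actual architecture (the real model uses learned neural components and Choquet-capacity-inspired aggregation); it is a toy multiple-instance regression example, with all function names illustrative, showing where the two strategies diverge: the encoder-decoder pools instance embeddings into one opaque vector before decoding, while the capacity-network style produces a per-instance prediction in output space first and aggregates those interpretable intermediates directly.

```python
# Toy sketch (illustrative only, NOT the paper's architecture): contrast
# the encoder-decoder strategy with capacity-network-style aggregation
# on a bag of scalar instances.

def encode(instance):
    # Toy per-instance "encoder": a fixed linear map to a 2-d embedding.
    return [0.5 * instance, -0.3 * instance]

def encoder_decoder(bag):
    # Encoder-decoder: pool all instance embeddings into ONE embedding,
    # then decode it to the final output. The pooled vector itself has
    # no direct semantic meaning and is hard to inspect.
    pooled = [sum(component) / len(bag)
              for component in zip(*(encode(x) for x in bag))]
    return 2.0 * pooled[0] + 1.0 * pooled[1]  # toy "decoder"

def capacity_style(bag):
    # Capacity-network style: compute an interpretable intermediate
    # result PER instance (here: a per-instance prediction already in
    # output space), then aggregate in that semantically meaningful
    # space. Each intermediate can be inspected, evaluated, or
    # regularized on its own.
    intermediates = [2.0 * (0.5 * x) + 1.0 * (-0.3 * x) for x in bag]
    return sum(intermediates) / len(intermediates)

bag = [1.0, 2.0, 3.0]
print(encoder_decoder(bag))  # single opaque pathway
print(capacity_style(bag))   # same toy output, but inspectable internals
```

Because every map in this toy example is linear, both strategies coincide numerically; the point of the sketch is the structural difference, namely that `capacity_style` exposes per-instance intermediates that the pooled embedding in `encoder_decoder` hides.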