Multi-exit architectures consist of a backbone and branch classifiers that offer shortened inference pathways to reduce the run-time of deep neural networks. In this paper, we analyze branching patterns that differ in how they allocate computational complexity to the branch classifiers. Constant-complexity branching keeps all branches the same, while complexity-increasing and complexity-decreasing branching place more complex branches later or earlier in the backbone, respectively. Through extensive experiments on multiple backbones and datasets, we find that complexity-decreasing branches are more effective than constant-complexity or complexity-increasing branches, achieving the best accuracy-cost trade-off. We investigate the cause by using knowledge consistency to probe the effect of adding branches onto a backbone. Our findings show that complexity-decreasing branching disrupts the feature abstraction hierarchy of the backbone the least, which explains the relative effectiveness of the branching patterns.
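As a loose illustration, the three branching patterns can be viewed as rules for splitting a fixed branch-compute budget across exits along the backbone. The helper below, including its name and its linear weighting scheme, is a hypothetical sketch of that idea, not the paper's exact allocation:

```python
def branch_complexities(total_budget, num_branches, pattern):
    """Split a compute budget across branch classifiers.

    Hypothetical illustration of the three branching patterns; the
    linear weighting is an assumption, not the authors' design.
    """
    if pattern == "constant":
        # every branch gets the same complexity
        weights = [1.0] * num_branches
    elif pattern == "increasing":
        # branches later in the backbone are more complex
        weights = [i + 1.0 for i in range(num_branches)]
    elif pattern == "decreasing":
        # branches earlier in the backbone are more complex
        # (the pattern the paper finds most effective)
        weights = [float(num_branches - i) for i in range(num_branches)]
    else:
        raise ValueError(f"unknown pattern: {pattern}")
    total = sum(weights)
    return [total_budget * w / total for w in weights]
```

For example, with a budget of 100 units over 4 exits, the decreasing pattern assigns `[40, 30, 20, 10]`, concentrating capacity at the early exits that the paper finds least disruptive to the backbone's feature hierarchy.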