In this paper, we propose a Collaboration of Experts (CoE) framework to pool together the expertise of multiple networks toward a common aim. Each expert is an individual network with expertise on a unique portion of the dataset, which enhances the collective capacity. Given a sample, an expert is selected by the delegator, which simultaneously outputs a rough prediction to support early termination. To fulfill this framework, we propose three modules that impel each model to play its role: the weight generation module (WGM), the label generation module (LGM), and the variance calculation module (VCM). Our method achieves state-of-the-art performance on ImageNet: 80.7% top-1 accuracy with 194M FLOPs. Combined with the PWLU activation function and CondConv, CoE further achieves 80.0% accuracy with only 100M FLOPs, for the first time. More importantly, our method is hardware-friendly and achieves a 3-6x speedup compared with some existing conditional computation approaches.
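To make the inference flow concrete, the following is a minimal sketch, assuming PyTorch, of how a delegator might route a sample: it emits a rough prediction for early termination and a score over experts, and only the selected expert runs when the delegator is not confident enough. All module names, layer sizes, and the confidence threshold are illustrative assumptions and not the authors' implementation; the WGM, LGM, and VCM training modules are not shown.

```python
import torch
import torch.nn as nn


class CoESketch(nn.Module):
    """Hypothetical CoE-style inference: delegator + early exit + one expert."""

    def __init__(self, num_classes: int = 1000, num_experts: int = 4):
        super().__init__()
        # Lightweight delegator backbone (placeholder for a small CNN).
        self.delegator = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.rough_head = nn.Linear(32, num_classes)   # rough prediction for early exit
        self.expert_head = nn.Linear(32, num_experts)  # which expert to delegate to
        # Each expert is an independent network specialized on part of the data.
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(64, num_classes),
            )
            for _ in range(num_experts)
        ])

    @torch.no_grad()
    def forward(self, x: torch.Tensor, conf_threshold: float = 0.7) -> torch.Tensor:
        # Assumes a single-sample batch for clarity of the routing logic.
        feat = self.delegator(x)
        rough_logits = self.rough_head(feat)
        confidence = rough_logits.softmax(dim=-1).max().item()
        if confidence >= conf_threshold:
            return rough_logits              # early termination: no expert is evaluated
        expert_idx = self.expert_head(feat).argmax(dim=-1).item()
        return self.experts[expert_idx](x)   # only the selected expert runs


if __name__ == "__main__":
    model = CoESketch().eval()
    logits = model(torch.randn(1, 3, 224, 224))
    print(logits.shape)  # torch.Size([1, 1000])
```

Because at most one expert is evaluated per sample and the routing is a single argmax rather than a dynamic kernel computation, this style of conditional execution maps well onto standard hardware, which is consistent with the reported 3-6x speedup over other conditional computation approaches.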