In this work, we present BasisNet, which combines recent advancements in efficient neural network architectures, conditional computation, and early termination in a simple new form. Our approach incorporates a lightweight model to preview the input and generate input-dependent combination coefficients, which later control the synthesis of a more accurate specialist model that makes the final prediction. This two-stage model synthesis strategy can be applied to any network architecture, and both stages are trained jointly. We also show that proper training recipes are critical for improving the generalizability of such high-capacity neural networks. On the ImageNet classification benchmark, our BasisNet with MobileNets as the backbone demonstrates a clear advantage in the accuracy-efficiency trade-off over several strong baselines. Specifically, BasisNet-MobileNetV3 obtains 80.3% top-1 accuracy with only 290M Multiply-Add operations, halving the computational cost of the previous state of the art without sacrificing accuracy. With early termination, the average cost can be further reduced to 198M MAdds while maintaining 80.0% accuracy on ImageNet.
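To make the two-stage synthesis concrete, the sketch below shows one plausible way to implement an input-dependent combination of basis kernels in PyTorch. It is a minimal illustration under stated assumptions, not the paper's implementation: the class names (`BasisConv2d`, `CoeffPredictor`), the number of bases, and the convex (softmax) combination are assumptions for this example, and a real BasisNet would apply such synthesis throughout a MobileNet-style specialist backbone rather than to a single layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BasisConv2d(nn.Module):
    """Convolution whose kernel is synthesized per example as a weighted
    sum of shared basis kernels, mixed by input-dependent coefficients.
    (Hypothetical sketch; not the paper's actual module.)"""

    def __init__(self, in_ch, out_ch, k=3, num_bases=8):
        super().__init__()
        self.in_ch, self.k, self.pad = in_ch, k, k // 2
        # num_bases basis kernels, shared across all inputs.
        self.bases = nn.Parameter(
            torch.randn(num_bases, out_ch, in_ch, k, k) * 0.01)

    def forward(self, x, coeffs):
        # coeffs: (batch, num_bases) combination coefficients from stage 1.
        batch = x.size(0)
        # Synthesize one specialist kernel per example: (B, out, in, k, k).
        kernels = torch.einsum('bn,noihw->boihw', coeffs, self.bases)
        # Grouped-conv trick applies a different kernel to each example.
        out = F.conv2d(
            x.reshape(1, batch * self.in_ch, x.size(2), x.size(3)),
            kernels.reshape(-1, self.in_ch, self.k, self.k),
            padding=self.pad, groups=batch)
        return out.reshape(batch, -1, out.size(2), out.size(3))


class CoeffPredictor(nn.Module):
    """Lightweight stage-1 model that previews the input and emits
    combination coefficients (assumed architecture for illustration)."""

    def __init__(self, in_ch=3, num_bases=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, num_bases))

    def forward(self, x):
        # Softmax keeps the synthesized kernel a convex combination of bases.
        return torch.softmax(self.net(x), dim=-1)


# Usage: stage 1 previews the input, stage 2 runs the synthesized kernel.
x = torch.randn(4, 3, 32, 32)
coeffs = CoeffPredictor()(x)            # (4, 8) input-dependent coefficients
y = BasisConv2d(3, 32)(x, coeffs)       # (4, 32, 32, 32) specialist output
```

Because both modules are ordinary differentiable layers, the coefficient predictor and the basis kernels can be trained jointly end to end, matching the abstract's description of joint training of the two stages.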