Recently, dynamic inference has emerged as a promising way to reduce the computational cost of deep convolutional neural networks (CNNs). In contrast to static methods (e.g., weight pruning), dynamic inference adapts the inference process to each input sample, which can considerably reduce the computational cost on "easy" samples while maintaining overall model performance. In this paper, we introduce a general framework, S2DNAS, which transforms various static CNN models to support dynamic inference via neural architecture search. To this end, given a CNN model, we first generate a CNN architecture space in which each architecture is a multi-stage CNN derived from the given model using a set of predefined transformations. We then propose a reinforcement-learning-based approach to automatically search for the optimal CNN architecture in the generated space. Finally, with the searched multi-stage network, we perform dynamic inference by adaptively choosing a stage to evaluate for each sample. Unlike previous works, which introduce irregular computations or complex controllers at inference time or redesign a CNN model from scratch, our method generalizes to most popular CNN architectures, and the searched dynamic network can be directly deployed with existing deep learning frameworks on various hardware devices.
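To make the last step concrete, the following is a minimal sketch of dynamic inference over a multi-stage classifier. It is not the S2DNAS implementation: the abstract does not say how a stage is chosen, so the confidence-threshold exit rule, the `dynamic_inference` function, the `threshold` value, and the two toy stages are all assumptions made for illustration. The stages here are also independent callables, whereas the stages of a real multi-stage CNN would typically share computation.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: a stage maps an input sample to a list of class logits.
Stage = Callable[[Sequence[float]], List[float]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_inference(
    stages: Sequence[Stage],
    sample: Sequence[float],
    threshold: float = 0.9,  # assumed exit rule, not taken from the paper
) -> Tuple[int, int]:
    """Return (predicted_class, index_of_last_stage_evaluated).

    Stages are evaluated in order; inference stops at the first stage whose
    top class probability reaches `threshold`, or at the final stage.
    """
    if not stages:
        raise ValueError("stages must be non-empty")
    last = len(stages) - 1
    for i, stage in enumerate(stages):
        probs = softmax(stage(sample))
        confidence = max(probs)
        if confidence >= threshold or i == last:
            return probs.index(confidence), i
    raise AssertionError("unreachable: the final stage always returns")


# Toy two-class stages standing in for the stages of a searched network.
def cheap_stage(x: Sequence[float]) -> List[float]:
    # Looks only at the first feature.
    return [x[0], -x[0]]


def full_stage(x: Sequence[float]) -> List[float]:
    # Uses both features and produces sharper logits.
    score = 3.0 * (x[0] + x[1])
    return [score, -score]


toy_stages = [cheap_stage, full_stage]
easy_result = dynamic_inference(toy_stages, [4.0, 0.0])
hard_result = dynamic_inference(toy_stages, [0.2, -1.0])
```

In this toy setup, the "easy" sample exits at the first stage because that stage is already confident, while the "hard" sample falls through to the final stage. The computational saving comes from skipping the later, more expensive stages on easy samples.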