Dynamic networks have been extensively explored because they can considerably improve a model's representation power at acceptable computational cost. The common practice when implementing dynamic networks is to convert given static layers into fully dynamic ones, in which all parameters are dynamic and vary with the input. Recent studies empirically show a trend in which more dynamic layers contribute to ever-increasing performance. However, such a fully dynamic setting 1) may cause redundant parameters and high deployment costs, limiting the applicability of dynamic networks to a broader range of tasks and models, and, more importantly, 2) contradicts a previous discovery about the human brain: \textit{when the human brain processes an attention-demanding task, only a subset of neurons in the task-specific areas are activated by the input, while the remaining neurons stay in a baseline state.} Critically, there has been no effort to understand and resolve this contradictory finding, leaving the fundamental question -- should the computational parameters be fully dynamic or not? -- unanswered. The main contributions of our work are challenging this basic assumption in dynamic networks, and proposing and validating the \textsc{cherry hypothesis} -- \textit{a fully dynamic network contains a subset of dynamic parameters such that, when the other dynamic parameters are transformed into static ones, the network can maintain or even exceed the performance of the original network.} Technically, we propose a brain-inspired partially dynamic network, namely PAD-Net, which transforms the redundant dynamic parameters into static ones. We further design Iterative Mode Partition to partition the parameters into a dynamic and a static subnet, which alleviates the redundancy of traditional fully dynamic networks. Our hypothesis and method are comprehensively supported by large-scale experiments with typical advanced dynamic methods.
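To make the partially dynamic idea concrete, the following is a minimal sketch (not the paper's implementation) of a layer whose weights are split by a binary mode mask: masked entries receive an input-conditioned offset, while the rest stay static. The `dynamic_delta` hypernetwork stand-in, the 30% dynamic ratio, and all names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 8, 4

# Static base weights and a binary mode mask (1 = dynamic entry, 0 = static entry).
W_static = rng.standard_normal((d_out, d_in))
mask = (rng.random((d_out, d_in)) < 0.3).astype(float)  # keep roughly 30% dynamic

def dynamic_delta(x):
    """Hypothetical input-conditioned weight offset (stand-in for a learned
    hypernetwork or attention-based generator)."""
    scale = np.tanh(x.mean())  # toy gating signal derived from the input
    return scale * np.ones((d_out, d_in))

def pad_layer(x):
    # Only the masked entries vary with the input; unmasked entries remain static.
    W = W_static + mask * dynamic_delta(x)
    return W @ x

x = rng.standard_normal(d_in)
y = pad_layer(x)
print(y.shape)  # (4,)
```

In this sketch the mask is fixed at random; in an actual partition procedure it would instead be chosen iteratively, e.g. by repeatedly converting the least input-sensitive dynamic entries to static ones.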