Many empirical studies have demonstrated the performance benefits of conditional computation in neural networks, including reduced inference time and power consumption. We study the fundamental limits of neural conditional computation from the perspective of memorization capacity. For Rectified Linear Unit (ReLU) networks without conditional computation, it is known that memorizing a collection of $n$ input-output relationships can be accomplished via a neural network with $O(\sqrt{n})$ neurons. Calculating the output of such a network requires $O(\sqrt{n})$ elementary arithmetic operations (additions, multiplications, and comparisons) per input. Using a conditional ReLU network, we show that the same task can be accomplished using only $O(\log n)$ operations per input, an almost exponential improvement over networks without conditional computation. We also show that the $\Theta(\log n)$ rate is the best possible. Our achievability result relies on a general methodology for synthesizing a conditional network from an unconditional one in a computationally efficient manner, bridging the gap between unconditional and conditional architectures.
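To make the operation-count contrast concrete, the following minimal Python sketch (an illustration of the general idea, not the paper's construction) memorizes $n$ scalar input-output pairs and answers each query with $O(\log n)$ comparisons by routing the input down a balanced binary search; this mirrors the conditional-computation principle of evaluating only one computation path per input rather than all $O(\sqrt{n})$ neurons of a dense ReLU memorizer. The class name `ConditionalMemorizer` and the use of `bisect` are assumptions made for this toy example.

```python
import bisect

# Toy sketch: memorize n scalar input-output pairs and answer each query
# with O(log n) comparisons by following a single root-to-leaf search path,
# mimicking conditional computation (only one path is evaluated per input).
class ConditionalMemorizer:
    def __init__(self, pairs):
        # pairs: list of (x, y) with distinct scalar inputs x.
        pairs = sorted(pairs)                 # one-time preprocessing
        self.xs = [x for x, _ in pairs]
        self.ys = [y for _, y in pairs]

    def __call__(self, x):
        # O(log n) comparisons: binary search selects the stored output.
        i = bisect.bisect_left(self.xs, x)
        if i < len(self.xs) and self.xs[i] == x:
            return self.ys[i]
        raise KeyError("input not among the memorized points")

# Usage: memorize n = 8 pairs; each query costs about log2(8) = 3 comparisons.
net = ConditionalMemorizer([(float(k), float(k % 3)) for k in range(8)])
assert net(5.0) == 2.0
```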