Soft Conditional Computation

Conditional computation aims to increase the size and accuracy of a network, at a small increase in inference cost. Previous hard-routing models explicitly route the input to a subset of experts. We propose soft conditional computation, which, in contrast, utilizes all experts while still permitting efficient inference through parameter routing. Concretely, for a given convolutional layer, we wish to compute a linear combination of $n$ experts $\alpha_1 \cdot (W_1 * x) + \ldots + \alpha_n \cdot (W_n * x)$, where $\alpha_1, \ldots, \alpha_n$ are functions of the input learned through gradient descent. A straightforward evaluation requires $n$ convolutions. We propose an equivalent form of the above computation, $(\alpha_1 W_1 + \ldots + \alpha_n W_n) * x$, which requires only a single convolution. We demonstrate the efficacy of our method, named CondConv, by scaling up the MobileNetV1, MobileNetV2, and ResNet-50 model architectures to achieve higher accuracy while retaining efficient inference. On the ImageNet classification dataset, CondConv improves the top-1 validation accuracy of the MobileNetV1(0.5x) model from 63.8% to 71.6% while only increasing inference cost by 27%. On COCO object detection, CondConv improves the minival mAP of a MobileNetV1(1.0x) SSD model from 20.3 to 22.4 with just a 4% increase in inference cost.
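
The equivalence between the two forms follows from the linearity of convolution in its kernel. The sketch below checks it numerically on a toy 1D case using only the Python standard library. The helper names (`conv1d`, `routing_weights`, `mixture_of_outputs`, `mixture_of_kernels`) are hypothetical. The sigmoid-of-mean routing function is an assumption made for illustration, since the abstract states only that the $\alpha_i$ are learned functions of the input. This is not the authors' implementation.

```python
import math
import random


def conv1d(kernel, x):
    """Valid-mode 1D cross-correlation (the operation deep-learning layers call convolution)."""
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k)) for i in range(len(x) - k + 1)]


def routing_weights(x, routing_params):
    """Hypothetical routing: sigmoid of the input mean scaled by a per-expert scalar.

    Illustrative assumption only; the text does not specify how alpha is computed.
    """
    mean = sum(x) / len(x)
    return [1.0 / (1.0 + math.exp(-r * mean)) for r in routing_params]


def mixture_of_outputs(kernels, alphas, x):
    """Straightforward form: n convolutions, then a weighted sum of their outputs."""
    outputs = [conv1d(w, x) for w in kernels]
    return [sum(a * out[i] for a, out in zip(alphas, outputs)) for i in range(len(outputs[0]))]


def mixture_of_kernels(kernels, alphas, x):
    """Equivalent form: combine the kernels first, then run a single convolution."""
    combined = [sum(a * w[j] for a, w in zip(alphas, kernels)) for j in range(len(kernels[0]))]
    return conv1d(combined, x)


# Toy data: 4 experts, kernel size 3, input length 10 (arbitrary values).
rng = random.Random(0)
n_experts, kernel_size, length = 4, 3, 10
kernels = [[rng.uniform(-1, 1) for _ in range(kernel_size)] for _ in range(n_experts)]
routing_params = [rng.uniform(-1, 1) for _ in range(n_experts)]
x = [rng.uniform(-1, 1) for _ in range(length)]

alphas = routing_weights(x, routing_params)
slow = mixture_of_outputs(kernels, alphas, x)   # n convolutions
fast = mixture_of_kernels(kernels, alphas, x)   # one convolution
max_diff = max(abs(s - f) for s, f in zip(slow, fast))
print(max_diff < 1e-9)  # the two forms agree up to floating-point rounding
```

In this sketch, combining the kernels costs work proportional to the number of experts times the kernel size, independent of the input length, whereas each extra convolution costs work proportional to the input length. That is why the single-convolution form stays cheap as experts are added.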
