Mixed-precision networks allow a variable bit-width quantization for every layer in the network. A major limitation of existing work is that the bit-width of each layer must be fixed at training time, which leaves little flexibility if the characteristics of the device on which the network is deployed change at runtime. In this work, we propose Bit-Mixer, the first method to train a meta-quantized network in which, at test time, any layer can change its bit-width without affecting the overall network's ability to perform highly accurate inference. To this end, we make two key contributions: (a) Transitional Batch-Norms, and (b) a 3-stage optimization process that is shown to be capable of training such a network. We show that our method yields mixed-precision networks that exhibit the flexibility desirable for on-device deployment without compromising accuracy. Code will be made available.
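To make the central idea concrete, the following is a minimal, hypothetical sketch (not the paper's implementation) of a layer whose weight bit-width can be switched at test time. It keeps one set of normalization statistics per bit-width, loosely mirroring the intuition behind Transitional Batch-Norms that activation statistics shift with precision; all names (`SwitchableQuantLayer`, `quantize`, `set_bits`) are illustrative assumptions.

```python
import numpy as np

def quantize(x, bits):
    """Uniform symmetric quantization to the given bit-width (illustrative)."""
    levels = 2 ** (bits - 1) - 1  # e.g. bits=2 -> grid {-1, 0, 1} * scale
    max_abs = np.max(np.abs(x))
    scale = max_abs / levels if max_abs > 0 else 1.0
    return np.round(x / scale) * scale

class SwitchableQuantLayer:
    """A linear layer whose weight bit-width can change at test time.

    Hypothetical sketch: one set of normalization statistics is kept per
    bit-width, since activation statistics depend on the precision used.
    """
    def __init__(self, in_dim, out_dim, bit_widths=(2, 3, 4)):
        rng = np.random.default_rng(0)
        self.W = rng.standard_normal((in_dim, out_dim)) * 0.1
        # Per-bit-width running statistics (here left at their init values).
        self.stats = {b: {"mean": np.zeros(out_dim), "var": np.ones(out_dim)}
                      for b in bit_widths}
        self.bits = max(bit_widths)  # active bit-width

    def set_bits(self, bits):
        # Switching precision requires no retraining; it just selects
        # the quantization grid and the matching statistics.
        assert bits in self.stats, "bit-width must be one of the trained widths"
        self.bits = bits

    def forward(self, x):
        Wq = quantize(self.W, self.bits)          # quantize at active width
        y = x @ Wq
        s = self.stats[self.bits]                 # precision-specific stats
        return (y - s["mean"]) / np.sqrt(s["var"] + 1e-5)

# Usage: change precision per layer at "deployment time".
layer = SwitchableQuantLayer(8, 4)
x = np.ones((2, 8))
layer.set_bits(2)
out_low = layer.forward(x)
layer.set_bits(4)
out_high = layer.forward(x)
```

In a full network, each layer could be switched independently, which is the deployment flexibility the abstract describes; training such a network so that every bit-width combination stays accurate is what the proposed 3-stage optimization addresses.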