We propose a new algorithm for training deep neural networks (DNNs) with binary weights. In particular, we first cast the problem of training binary neural networks (BiNNs) as a bilevel optimization instance and subsequently construct flexible relaxations of this bilevel program. The resulting training method shares its algorithmic simplicity with several existing approaches to train BiNNs, in particular with the straight-through gradient estimator successfully employed in BinaryConnect and subsequent methods. In fact, our proposed method can be interpreted as an adaptive variant of the original straight-through estimator that conditionally (but not always) acts like a linear mapping in the backward pass of error propagation. Experimental results demonstrate that our new algorithm offers favorable performance compared to existing approaches.
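For concreteness, the straight-through estimator referenced above can be sketched as follows. This is a minimal PyTorch illustration of the classic BinaryConnect rule (sign binarization in the forward pass, clipped identity in the backward pass); it is not the adaptive variant proposed here, whose backward mapping is instead derived from the bilevel relaxation and applied only conditionally.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through backward pass,
    as employed in BinaryConnect (illustrative sketch only)."""

    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        # Forward pass: quantize latent real-valued weights to {-1, +1}.
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_output):
        (w,) = ctx.saved_tensors
        # Backward pass: act like a linear (identity) mapping where
        # |w| <= 1, and pass zero gradient elsewhere (clipped identity).
        return grad_output * (w.abs() <= 1).to(grad_output.dtype)

# Usage: w_binary = BinarizeSTE.apply(w_latent)
```

The proposed method keeps this algorithmic structure but replaces the fixed clipped-identity backward rule with an adaptive one, so the estimator behaves like a linear mapping only when the relaxation dictates it.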