Binary Neural Networks (BNNs) rely on a real-valued auxiliary variable W to aid binary training. However, pioneering binary works only use W to accumulate gradient updates during backward propagation, which cannot fully exploit its power and may hinder novel advances in BNNs. In this work, we explore the role of W in training beyond acting as a latent variable. Notably, we propose to add W to the computation graph, making it act as a real-valued feature extractor that aids binary training. We investigate several ways of utilizing the real-valued weights and propose a specialized supervision scheme. Visualization experiments qualitatively verify that our approach makes it easier to distinguish between different categories. Quantitative experiments show that our approach outperforms current state-of-the-art methods, further closing the performance gap between floating-point networks and BNNs. Evaluations on ImageNet with ResNet-18 (63.4% Top-1 accuracy) and ResNet-34 (67.0% Top-1 accuracy) set a new state of the art.
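To make the core idea concrete, below is a minimal PyTorch-style sketch of a binarized convolution that keeps its latent weight W in the computation graph. The forward pass binarizes W with a straight-through estimator (a standard BNN technique), while a second path reuses the same W as a real-valued feature extractor. The class name `BinaryConv2d`, the `real_forward` method, and the MSE alignment term standing in for the paper's specialized supervision are all illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryConv2d(nn.Module):
    """Convolution holding a real-valued latent weight W.

    The binary path uses sign(W) via a straight-through estimator;
    the real path (hypothetical, mirroring the abstract's idea) reuses
    the same W as a real-valued feature extractor.
    """

    def __init__(self, in_ch, out_ch, k=3, padding=1):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.01)
        self.padding = padding

    def forward(self, x):
        w = self.weight
        # Straight-through estimator: forward sees sign(w),
        # backward treats binarization as identity.
        w_bin = w + (torch.sign(w) - w).detach()
        return F.conv2d(x, w_bin, padding=self.padding)

    def real_forward(self, x):
        # Auxiliary path: the latent W itself extracts real-valued features,
        # so W participates in the computation graph beyond gradient storage.
        return F.conv2d(x, self.weight, padding=self.padding)

# Usage sketch: an MSE term aligning binary and real-valued features is
# a placeholder for the paper's specialized supervision.
layer = BinaryConv2d(3, 16)
x = torch.randn(2, 3, 32, 32)
f_bin, f_real = layer(x), layer.real_forward(x)
aux_loss = F.mse_loss(f_bin, f_real.detach())
```

The design choice illustrated here is that W now receives gradients from two sources: the binary task loss (through the straight-through estimator) and the auxiliary real-valued path, rather than serving only as a passive accumulator of updates.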