Binary neural networks leverage the $\mathrm{Sign}$ function to binarize real values, but its non-differentiability inevitably introduces large gradient errors during backpropagation. Although many hand-designed soft functions have been proposed to approximate its gradient, their mechanisms remain unclear, and large performance gaps persist between binary models and their full-precision counterparts. To address this, we propose to cast network binarization as a binary classification problem and use a multi-layer perceptron (MLP) as the classifier. The MLP-based classifier can, in theory, fit any continuous function and is learned adaptively to binarize networks and backpropagate gradients without any hand-crafted soft function. From this perspective, we further show experimentally that even a simple linear function can outperform previous complex soft functions. Extensive experiments demonstrate that the proposed method yields surprisingly strong performance on both image classification and human pose estimation tasks. Specifically, it achieves 65.7% top-1 accuracy with ResNet-34 on the ImageNet dataset, an absolute improvement of 2.8%. When evaluated on the challenging Microsoft COCO keypoint dataset, the proposed method enables binary networks to achieve an mAP of 60.6 for the first time, on par with some full-precision methods.
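The sketch below is a minimal PyTorch illustration (not the authors' released code) of the idea stated above: the forward pass binarizes activations with $\mathrm{Sign}$, while the backward pass routes the incoming gradient through a small learnable MLP instead of a fixed hand-designed soft function. The names `MLPBinarizeFn` and `MLPBinarize`, the per-element gradient-scaling interpretation, and the 2-layer MLP size are illustrative assumptions; how the MLP itself is supervised is omitted and depends on the training scheme described in the paper.

```python
# Minimal sketch of an MLP-driven binarization layer (assumed interface, not the paper's code).
import torch
import torch.nn as nn


class MLPBinarizeFn(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, mlp):
        # Forward: hard binarization to {-1, +1}.
        ctx.save_for_backward(x)
        ctx.mlp = mlp
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Backward: an MLP maps each pre-binarization value to a local gradient
        # scale, replacing fixed approximations such as the clipped identity (STE).
        scale = ctx.mlp(x.reshape(-1, 1)).reshape_as(x)
        return grad_output * scale, None


class MLPBinarize(nn.Module):
    def __init__(self, hidden: int = 16):
        super().__init__()
        # Small per-element MLP acting as the learned gradient estimator (size is an assumption).
        self.mlp = nn.Sequential(
            nn.Linear(1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return MLPBinarizeFn.apply(x, self.mlp)


if __name__ == "__main__":
    layer = MLPBinarize()
    x = torch.randn(4, 8, requires_grad=True)
    y = layer(x)          # values in {-1, +1}
    y.sum().backward()    # gradient of x shaped by the learned MLP
    print(y.shape, x.grad.shape)
```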