Binary Neural Networks (BNNs) show promising progress in reducing computational and memory costs, but suffer from substantial accuracy degradation compared to their real-valued counterparts on large-scale datasets such as ImageNet. Previous work has mainly focused on reducing the quantization error of weights and activations, for which a series of approximation methods and sophisticated training tricks have been proposed. In this work, we make several observations that challenge conventional wisdom. We revisit commonly used techniques, such as scaling factors and custom gradients, and show that these methods are not crucial for training well-performing BNNs. Instead, we suggest several design principles for BNNs based on the insights gained and demonstrate that highly accurate BNNs can be trained from scratch with a simple training strategy. We propose a new BNN architecture, BinaryDenseNet, which significantly surpasses all existing 1-bit CNNs on ImageNet without tricks. In our experiments, BinaryDenseNet achieves relative improvements of 18.6% and 7.6% in top-1 accuracy on ImageNet over the well-known XNOR-Network and the current state-of-the-art Bi-Real Net, respectively.
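For context on the two techniques the abstract revisits, below is a minimal PyTorch-style sketch of how BNNs commonly binarize weights: a sign() quantizer whose gradient is replaced by a clipped straight-through estimator (a "custom gradient"), combined with an XNOR-Net-style per-filter scaling factor. This is an illustrative sketch, not code from the paper; the names `BinarizeSTE` and `binarize_weights_with_scale` are hypothetical, and a 4D convolutional weight layout is assumed.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """sign() binarization with a straight-through estimator (STE):
    the forward pass outputs sign(x); the backward pass lets the
    gradient through unchanged where |x| <= 1 and zeroes it elsewhere."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)  # note: sign(0) = 0 in this simple sketch

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # clipped identity: gradient flows only for inputs inside [-1, 1]
        return grad_output * (x.abs() <= 1).to(grad_output.dtype)

def binarize_weights_with_scale(w):
    """XNOR-Net-style weight binarization: sign(w) scaled by the
    per-output-channel mean of |w| (the 'scaling factor' the abstract
    argues is not crucial). Assumes w has shape (out_ch, in_ch, kH, kW)."""
    alpha = w.abs().mean(dim=(1, 2, 3), keepdim=True)  # one scale per filter
    return BinarizeSTE.apply(w) * alpha

# Example: binarize a conv weight tensor and backpropagate through it.
w = torch.randn(64, 32, 3, 3, requires_grad=True)
wb = binarize_weights_with_scale(w)
wb.sum().backward()  # gradients reach w via the straight-through estimator
```

Dropping the `alpha` scaling (and using the plain clipped STE) recovers the simpler training setup the paper advocates.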