Binary neural networks (BNNs) quantize the original full-precision weights and activations to 1 bit with the sign function. Since the gradient of the conventional sign function is almost zero everywhere and thus cannot be used for back-propagation, several attempts have been made to alleviate the optimization difficulty by using approximate gradients. However, these approximations corrupt the main direction of the factual gradient. To this end, we propose to estimate the gradient of the sign function in the Fourier frequency domain using a combination of sine functions for training BNNs, namely frequency domain approximation (FDA). The proposed approach does not affect the low-frequency information of the original sign function, which occupies most of the overall energy, while high-frequency coefficients are ignored to avoid the huge computational overhead. In addition, we embed a noise adaptation module into the training phase to compensate for the approximation error. Experiments on several benchmark datasets and neural architectures illustrate that the binary network learned with our method achieves state-of-the-art accuracy. Code will be available at \textit{https://gitee.com/mindspore/models/tree/master/research/cv/FDA-BNN}.
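To make the core idea concrete, the following is a minimal NumPy sketch (not the authors' implementation) of approximating the sign function by a truncated Fourier series of the square wave on $[-\pi, \pi]$: $\mathrm{sign}(x) \approx \frac{4}{\pi}\sum_{k=0}^{N-1} \frac{\sin((2k+1)x)}{2k+1}$. The function names and the cutoff `n_terms` are illustrative; the analytic derivative of the truncated series (a sum of cosines) serves as a surrogate gradient in the backward pass, while dropping the high-frequency terms keeps the cost low.

```python
import numpy as np

def sign_fourier(x, n_terms=5):
    """Truncated Fourier-series approximation of sign(x) on [-pi, pi].

    Keeps only the first `n_terms` low-frequency sine components,
    which carry most of the square wave's energy.
    """
    s = np.zeros_like(x, dtype=float)
    for k in range(n_terms):
        n = 2 * k + 1
        s += np.sin(n * x) / n
    return 4.0 / np.pi * s

def sign_fourier_grad(x, n_terms=5):
    """Analytic derivative of the truncated series (a sum of cosines).

    Unlike the true derivative of sign (zero almost everywhere), this
    surrogate gradient is non-zero and usable for back-propagation.
    """
    g = np.zeros_like(x, dtype=float)
    for k in range(n_terms):
        n = 2 * k + 1
        g += np.cos(n * x)
    return 4.0 / np.pi * g
```

In a BNN training loop, the forward pass would still binarize with the exact sign function, while the backward pass would substitute `sign_fourier_grad` for the (degenerate) true gradient.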