The ever-growing computational demands of increasingly complex machine learning models frequently necessitate the use of powerful cloud-based infrastructure for their training. Binary neural networks are known to be promising candidates for on-device inference due to their extreme compute and memory savings over higher-precision alternatives. In this paper, we demonstrate that they are also strongly robust to gradient quantization, thereby making the training of modern models on the edge a practical reality. We introduce a low-cost binary neural network training strategy exhibiting sizable memory footprint reductions and energy savings versus Courbariaux & Bengio's standard approach. Against the latter, we see coincident memory requirement and energy consumption drops of 2--6$\times$, while reaching similar test accuracy in comparable time, across a range of small-scale models trained to classify popular datasets. We also showcase ImageNet training of ResNetE-18, achieving a 3.12$\times$ memory reduction over the aforementioned standard. Such savings will allow unnecessary cloud offloading to be avoided, reducing latency, increasing energy efficiency and safeguarding privacy.