Training deep neural networks (DNNs) requires intensive computation and data-storage resources, so DNNs cannot be deployed efficiently on mobile phones and embedded devices, which severely limits their industrial applicability. To address this issue, we propose a novel encoding scheme that uses {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks, which can be implemented efficiently with bitwise operations (i.e., xnor and bitcount) to achieve model compression, computational acceleration, and resource saving. With our method, users can select an arbitrary encoding precision according to their requirements and hardware resources. The proposed mechanism is well suited to FPGAs and ASICs in terms of data storage and computation, offering a feasible approach for smart chips. We validate the effectiveness of our method on large-scale image classification (e.g., ImageNet), object detection, and semantic segmentation tasks. In particular, our method with low-bit encoding still achieves almost the same performance as its high-bit counterparts.
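As a minimal sketch (not the paper's implementation), the arithmetic behind the xnor/bitcount trick can be illustrated as follows: when a {-1, +1} vector is packed into machine words (+1 as bit 1, -1 as bit 0), the dot product of two such vectors reduces to one xnor and one population count. The names `pack` and `binary_dot` below are illustrative, not the paper's API.

```python
def pack(values):
    """Pack a list of {-1, +1} values into an integer bit string
    (+1 -> bit 1, -1 -> bit 0). Illustrative helper, not the paper's API."""
    word = 0
    for i, v in enumerate(values):
        if v == +1:
            word |= 1 << i
    return word

def binary_dot(a_word, b_word, n):
    """Dot product of two packed {-1, +1} vectors of length n.
    Matching positions (both +1 or both -1) contribute +1, mismatches -1,
    so dot = 2 * popcount(xnor(a, b)) - n."""
    mask = (1 << n) - 1          # keep only the n valid bits
    xnor = ~(a_word ^ b_word) & mask
    return 2 * bin(xnor).count("1") - n

# Check against the direct {-1, +1} dot product.
a = [+1, -1, +1, +1, -1]
b = [+1, +1, -1, +1, -1]
print(binary_dot(pack(a), pack(b), len(a)))   # -> 1
print(sum(x * y for x, y in zip(a, b)))       # -> 1
```

Each binary branch of the decomposed QNN can reuse this primitive, which is why the scheme maps naturally onto FPGA/ASIC bitwise units.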