Quantization approximates a deep network model that uses floating-point numbers with one that uses low-bit-width numbers, in order to accelerate inference and reduce computation. Zero-shot quantization quantizes a model without access to the original data; it can be accomplished by synthesizing data that fits the real data distribution. However, zero-shot quantization achieves inferior performance compared to post-training quantization with real data. We find this is because: 1) an ordinary generator struggles to produce highly diverse synthetic data, since it lacks the long-range information needed to allocate attention to global features; and 2) the synthetic images aim only to match the statistics of the real data, which leads to weak intra-class heterogeneity and limited feature richness. To overcome these problems, we propose a novel deep network quantizer, dubbed Long-Range Zero-Shot Generative Deep Network Quantization (LRQ). Technically, we propose a long-range generator that learns long-range information instead of only simple local features: long-range attention built on large-kernel convolution is incorporated into the generator so that the synthetic data contain more global features. In addition, we present an Adversarial Margin Add (AMA) module that forces intra-class angular enlargement between each feature vector and its class center. Because AMA increases the convergence difficulty of the loss function, which runs counter to the original training objective, it forms an adversarial process. Furthermore, to transfer knowledge from the full-precision network, we also employ decoupled knowledge distillation. Extensive experiments demonstrate that LRQ outperforms other competitors.
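The abstract attributes the generator's improved diversity to long-range attention built on large-kernel convolution. As a rough intuition only (the paper's actual module is not specified here), a tiny pure-Python 1-D convolution shows how a larger kernel lets each output position aggregate information from more distant input positions; `conv1d` is a hypothetical helper, not the paper's implementation.

```python
def conv1d(signal, kernel):
    # 'valid' 1-D cross-correlation, written out explicitly for clarity
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

# Receptive-field intuition behind large-kernel "long-range" attention:
# with a size-3 kernel each output sees only 3 neighboring inputs (local
# features); with a size-9 kernel each output mixes 9 inputs, capturing
# longer-range, more global structure in one layer.
small = conv1d(list(range(16)), [1 / 3] * 3)  # each output sees 3 inputs
large = conv1d(list(range(16)), [1 / 9] * 9)  # each output sees 9 inputs
```

Stacking small kernels can also grow the receptive field with depth; the large-kernel route reaches long range in fewer layers, which is the trade-off the generator design leans on.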
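The AMA module is described only as enlarging the intra-class angle between a feature vector and its class center. A minimal sketch of that idea, assuming an additive angular margin in the style of angular-margin losses (the exact formulation, names `cos_sim` and `ama_logit`, and the margin value are illustrative assumptions, not the paper's definition):

```python
import math

def cos_sim(u, v):
    # cosine similarity between a feature vector and a class center
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def ama_logit(feature, center, margin=0.2):
    # Enlarge the angle theta between the feature and its class center
    # by an additive margin before it enters the classification loss:
    # the penalized logit cos(theta + m) is harder to push toward 1,
    # which works against the original objective (the adversarial part).
    theta = math.acos(max(-1.0, min(1.0, cos_sim(feature, center))))
    return math.cos(theta + margin)
```

Since cos(theta + m) <= cos(theta) for angles in [0, pi - m], minimizing the classification loss through this logit forces features to spread out within each class, which matches the stated goal of stronger intra-class heterogeneity.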