Adaptive Data-Free Quantization

Data-free quantization (DFQ) recovers the performance of a quantized network (Q) without the original data by generating fake samples with a generator (G) that learns from the full-precision network (P). This generation process, however, is totally independent of Q: it overlooks the adaptability of the knowledge carried by the generated samples, i.e., whether that knowledge is informative to the learning process of Q, which results in an overflow of the generalization error. This raises several critical questions. How can the adaptability of a sample to Q be measured under varied bit-width scenarios? Is the largest adaptability the best? How can samples be generated with adaptive adaptability to improve Q's generalization? To answer these questions, we propose an Adaptive Data-Free Quantization (AdaDFQ) method, which revisits DFQ from a zero-sum game perspective on the sample adaptability between two players: a generator and a quantized network. Following this viewpoint, we further define disagreement and agreement samples to form two boundaries, and optimize the margin between them to adaptively regulate the adaptability of the generated samples to Q, so as to address the overfitting and underfitting issues. AdaDFQ reveals that: 1) the largest adaptability is NOT the best for generating samples that benefit Q's generalization; 2) the knowledge of a generated sample should not only be informative to Q, but also be related to the category and distribution information of the training data for P. Theoretical and empirical analyses validate the advantages of AdaDFQ over state-of-the-art methods. Our code is available at https://github.com/hfutqian/AdaDFQ.
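
To make the idea of keeping adaptability between two boundaries more concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not give the formulas, so the sketch assumes that adaptability can be approximated by the probability that P and Q disagree on a sample, and that the two boundaries act as a two-sided hinge penalty. The names `disagreement_probability`, `margin_penalty`, `lower`, and `upper` are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def disagreement_probability(p_probs, q_probs):
    """Assumed adaptability score: chance that P and Q pick different
    classes if each samples a label independently from its own output."""
    agree = sum(p * q for p, q in zip(p_probs, q_probs))
    return 1.0 - agree


def margin_penalty(adaptability, lower, upper):
    """Hypothetical two-sided hinge: zero inside [lower, upper],
    growing linearly once the score leaves that band."""
    return max(lower - adaptability, 0.0) + max(adaptability - upper, 0.0)


if __name__ == "__main__":
    # Made-up logits of P and Q for one generated sample.
    p_probs = softmax([2.0, 0.5, -1.0])
    q_probs = softmax([0.3, 1.8, -0.5])
    score = disagreement_probability(p_probs, q_probs)
    # The band limits are illustrative values, not taken from the paper.
    print(score, margin_penalty(score, lower=0.2, upper=0.8))
```

Under these assumptions, a generator penalized this way would be pushed away from samples on which P and Q fully agree (uninformative to Q) and also away from samples on which they fully disagree (too hard for Q), which mirrors the claim that the largest adaptability is not the best.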
