Vision Transformers (ViTs) have achieved state-of-the-art performance on various computer vision applications. However, these models have considerable storage and computational overheads, making their deployment and efficient inference on edge devices challenging. Quantization is a promising approach to reducing model complexity, and the dyadic arithmetic pipeline can allow the quantized models to perform efficient integer-only inference. Unfortunately, dyadic arithmetic is based on the homogeneity condition in convolutional neural networks, which is not applicable to the non-linear components in ViTs, making integer-only inference of ViTs an open issue. In this paper, we propose I-ViT, an integer-only quantization scheme for ViTs, to enable ViTs to perform the entire computational graph of inference with integer arithmetic and bit-shifting, and without any floating-point arithmetic. In I-ViT, linear operations (e.g., MatMul and Dense) follow the integer-only pipeline with dyadic arithmetic, and non-linear operations (e.g., Softmax, GELU, and LayerNorm) are approximated by the proposed light-weight integer-only arithmetic methods. More specifically, I-ViT applies the proposed Shiftmax and ShiftGELU, which are designed to use integer bit-shifting to approximate the corresponding floating-point operations. We evaluate I-ViT on various benchmark models and the results show that integer-only INT8 quantization achieves comparable (or even slightly higher) accuracy to the full-precision (FP) baseline. Furthermore, we utilize TVM for practical hardware deployment on the GPU's integer arithmetic units, achieving 3.72$\sim$4.11$\times$ inference speedup compared to the FP model.
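To make the dyadic arithmetic pipeline mentioned above concrete, the sketch below shows the standard requantization trick on which it rests: a floating-point rescaling factor is approximated by a dyadic number $b/2^c$, so an INT32 accumulator can be rescaled with one integer multiply and one bit-shift, with no floating-point operations at inference time. This is a generic illustration of dyadic requantization, not the paper's exact implementation; the function names and the 15-bit shift are illustrative choices.

```python
import numpy as np

def dyadic_approx(multiplier: float, shift_bits: int = 15):
    """Approximate a positive float multiplier as b / 2**c (a dyadic number).

    b is an integer, c = shift_bits; the product b/2**c is the closest
    representable value to `multiplier` at this precision.
    """
    b = round(multiplier * (1 << shift_bits))
    return b, shift_bits

def requantize(acc: np.ndarray, multiplier: float) -> np.ndarray:
    """Rescale an INT32 accumulator using only integer multiply and bit-shift.

    Equivalent (up to rounding) to round(acc * multiplier), but expressed
    entirely in integer arithmetic, as required for integer-only inference.
    """
    b, c = dyadic_approx(multiplier)
    # Add half of the divisor before shifting to get round-to-nearest
    # instead of truncation.
    acc64 = acc.astype(np.int64)
    return ((acc64 * b) + (1 << (c - 1))) >> c
```

Because the non-linear operations (Softmax, GELU, LayerNorm) lack the homogeneity property this trick relies on, I-ViT replaces them with shift-based integer approximations (Shiftmax, ShiftGELU) rather than dyadic rescaling.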