Joint Token Pruning and Squeezing Towards More Aggressive Compression of Vision Transformers

Although vision transformers (ViTs) have recently shown promising results in various computer vision tasks, their high computational cost limits their practical application. Previous approaches that prune redundant tokens have demonstrated a good trade-off between performance and computational cost. Nevertheless, errors made by the pruning strategy can lead to significant information loss. Our quantitative experiments reveal that pruned tokens have a noticeable impact on performance. To address this issue, we propose a novel joint Token Pruning & Squeezing module (TPS) for compressing vision transformers with higher efficiency. First, TPS applies pruning to split the tokens into a reserved subset and a pruned subset. Second, TPS squeezes the information of the pruned tokens into a subset of the reserved tokens via unidirectional nearest-neighbor matching and similarity-based fusing steps. Our approach outperforms state-of-the-art methods under all token pruning intensities. In particular, when the computational budgets of DeiT-tiny and DeiT-small are shrunk to 35%, it improves accuracy by 1%-6% over the baselines on ImageNet classification. The proposed method can raise the throughput of DeiT-small above that of DeiT-tiny while its accuracy still surpasses DeiT-tiny by 4.78%. Experiments on various transformers demonstrate the effectiveness of our method, and analysis experiments show that it is more robust to errors in the token pruning policy. Code is available at https://github.com/megvii-research/TPS-CVPR2023.
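
The two steps described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `tps_sketch`, the `keep_ratio` parameter, the externally supplied importance `scores`, the use of cosine similarity for matching, and the fusing weights (`exp(similarity)` for a pruned token, `exp(1)` for the reserved token itself) are all choices made here for illustration. The official repository linked above is the reference for the actual method.

```python
import numpy as np


def tps_sketch(tokens, scores, keep_ratio=0.5):
    """Illustrative prune-then-squeeze step (hypothetical helper, not the official API).

    tokens: (N, D) token features; scores: (N,) importance scores, assumed given.
    Returns the fused reserved tokens and the indices of the reserved tokens.
    """
    tokens = np.asarray(tokens, dtype=float)
    scores = np.asarray(scores, dtype=float)
    n = tokens.shape[0]
    n_keep = max(1, int(round(n * keep_ratio)))

    # Step 1 (pruning): split tokens into reserved and pruned subsets by score.
    order = np.argsort(-scores)
    keep_idx = np.sort(order[:n_keep])
    prune_idx = np.sort(order[n_keep:])
    reserved, pruned = tokens[keep_idx], tokens[prune_idx]
    if pruned.shape[0] == 0:
        return reserved.copy(), keep_idx

    # Step 2a (unidirectional matching): every pruned token picks its most
    # similar reserved token; reserved tokens do not pick anything.
    def unit(x):
        return x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)

    sim = unit(pruned) @ unit(reserved).T            # (N_pruned, N_reserved)
    host = sim.argmax(axis=1)                        # host index per pruned token
    host_sim = sim[np.arange(host.shape[0]), host]   # similarity to that host

    # Step 2b (similarity-based fusing): weighted average of each reserved
    # token and the pruned tokens matched to it. The weights are an assumption.
    self_weight = np.exp(1.0)
    num = self_weight * reserved
    den = np.full(n_keep, self_weight)
    w = np.exp(host_sim)
    np.add.at(num, host, w[:, None] * pruned)        # accumulate, duplicates allowed
    np.add.at(den, host, w)
    return num / den[:, None], keep_idx


if __name__ == "__main__":
    x = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]])
    s = np.array([0.9, 0.8, 0.2, 0.1])
    fused, kept = tps_sketch(x, s, keep_ratio=0.5)
    print(kept)
    print(fused)
```

In this sketch a reserved token that no pruned token selects passes through unchanged, while a reserved token selected by several pruned tokens absorbs all of them. This is the sense in which the matching is unidirectional and only part of the reserved set is modified.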
