Compressive learning (CL) is an emerging framework that integrates signal acquisition via compressed sensing (CS) with machine learning, so that inference tasks are performed directly on a small number of measurements. It is a promising alternative to classical image-domain methods and offers substantial savings in memory and computation. However, previous attempts at CL are limited to a fixed CS ratio, which lacks flexibility, and to MNIST/CIFAR-like datasets; they do not scale to complex real-world high-resolution (HR) data or vision tasks. In this paper, we propose TransCL, a novel transformer-based compressive learning framework for large-scale images with arbitrary CS ratios. Specifically, TransCL first adopts learnable block-based compressed sensing and proposes a flexible linear projection strategy, which enables CL to be performed on large-scale images efficiently, block by block, with arbitrary CS ratios. Then, treating the CS measurements from all blocks as a sequence, a pure transformer-based backbone is deployed to perform vision tasks with various task-oriented heads. Our analysis shows that TransCL exhibits strong resistance to interference and robust adaptability to arbitrary CS ratios. Extensive experiments on complex HR data demonstrate that TransCL achieves state-of-the-art performance in image classification and semantic segmentation. In particular, TransCL with a CS ratio of $10\%$ obtains almost the same performance as when operating directly on the original data, and still obtains satisfactory performance even at an extremely low CS ratio of $1\%$. The source code of TransCL is available at \url{https://github.com/MC-E/TransCL/}.
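To make the block-by-block measurement step concrete, the following is a minimal NumPy sketch, not the TransCL implementation. It assumes that a single full-size sampling matrix is shared by all blocks and that a given CS ratio is realized by keeping only its first rows; the abstract does not specify how the flexible projection works, so this row-truncation scheme is an illustrative assumption. A fixed random Gaussian matrix stands in for the learned one, and the function name `block_cs_measure` and its parameters are hypothetical.

```python
import numpy as np


def block_cs_measure(image, block=16, ratio=0.1, seed=0):
    """Measure a grayscale image block by block at a chosen CS ratio.

    Assumptions (not taken from the paper): one shared n x n sampling
    matrix, truncated to its first m rows; random Gaussian entries are a
    stand-in for learned weights.
    """
    h, w = image.shape
    if h % block or w % block:
        raise ValueError("image sides must be multiples of the block size")

    n = block * block                    # pixels per block
    m = max(1, int(round(ratio * n)))    # measurements per block

    rng = np.random.default_rng(seed)
    phi_full = rng.standard_normal((n, n)) / np.sqrt(n)
    phi = phi_full[:m]                   # (m, n): row truncation sets the ratio

    # Cut the image into non-overlapping blocks and flatten each one:
    # (h, w) -> (num_blocks, n)
    blocks = (
        image.reshape(h // block, block, w // block, block)
        .transpose(0, 2, 1, 3)
        .reshape(-1, n)
    )

    # One measurement vector per block: a sequence of shape (num_blocks, m)
    return blocks @ phi.T


image = np.random.default_rng(1).random((64, 64))
sequence = block_cs_measure(image, block=16, ratio=0.1)
print(sequence.shape)  # (16, 26): 16 blocks, 26 measurements each
```

Under this assumption, lowering the ratio only shortens each measurement vector, so the same matrix serves every ratio, and the resulting sequence of per-block vectors is the kind of input a transformer backbone could consume as tokens.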