Revisit Visual Representation in Analytics Taxonomy: A Compression Perspective

Visual analytics plays an increasingly critical role in the Internet of Things, where massive volumes of visual signals must be compressed and fed to machines. Given such large data volumes and constrained bandwidth, existing image/video compression methods yield very low-quality representations, while existing feature compression techniques fail to support diversified visual analytics applications and tasks with low-bit-rate representations. In this paper, we raise and study the novel problem of supporting multiple machine vision analytics tasks with the compressed visual representation, namely, the information compression problem in analytics taxonomy. By exploiting the intrinsic transferability among different tasks, our framework constructs compact and expressive representations at low bit-rates to support a diversified set of machine vision tasks, including both high-level semantic-related tasks and mid-level geometry analytic tasks. To impose compactness on the representations, we propose a codebook-based hyperprior, which helps map the representation onto a low-dimensional manifold. Because it fits the signal structure of deep visual features well, it facilitates more accurate entropy estimation and results in higher compression efficiency. With the proposed framework and the codebook-based hyperprior, we further investigate the relationship among task features with different levels of abstraction granularity. Experimental results demonstrate that, with the proposed scheme, a set of diversified tasks can be supported at a significantly lower bit-rate than with existing compression schemes.
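To make the codebook idea concrete, the sketch below shows plain nearest-neighbor vector quantization followed by an empirical entropy estimate over the resulting codeword indices. It is not the learned codebook-based hyperprior proposed in the paper, whose architecture and training are not described in this abstract. It only illustrates, under simple assumptions, why features that cluster around a small set of codewords can be described with few bits. All names (`quantize_to_codebook`, `empirical_bits_per_vector`), the codebook size, the feature dimension, and the synthetic data are hypothetical.

```python
import numpy as np


def quantize_to_codebook(features, codebook):
    """Assign each feature vector to its nearest codeword (squared L2 distance).

    features: array of shape (n, d); codebook: array of shape (k, d).
    Returns the codeword indices (n,) and the quantized features (n, d).
    """
    diffs = features[:, None, :] - codebook[None, :, :]  # (n, k, d)
    dists = np.sum(diffs ** 2, axis=-1)                  # (n, k)
    indices = np.argmin(dists, axis=1)
    return indices, codebook[indices]


def empirical_bits_per_vector(indices, codebook_size):
    """Shannon entropy, in bits, of the empirical codeword-index distribution."""
    counts = np.bincount(indices, minlength=codebook_size).astype(np.float64)
    probs = counts / counts.sum()
    nonzero = probs[probs > 0]
    return float(-np.sum(nonzero * np.log2(nonzero)))


# Synthetic stand-in for deep features (an assumption, not real data):
# each vector is a codeword plus a small perturbation.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 16))
true_idx = rng.integers(0, 8, size=200)
features = codebook[true_idx] + 0.01 * rng.normal(size=(200, 16))

indices, quantized = quantize_to_codebook(features, codebook)
bits = empirical_bits_per_vector(indices, codebook_size=8)
print(f"estimated rate: {bits:.3f} bits per feature vector")
```

With 8 codewords the estimate can never exceed log2(8) = 3 bits per vector, whatever the feature dimension. A learned entropy model would replace the empirical histogram with predicted probabilities.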
