Modern computer vision requires processing large amounts of data, both while training the model and during inference, once the model is deployed. Scenarios where images are captured and processed in physically separated locations are increasingly common (e.g. autonomous vehicles, cloud computing). In addition, many devices have limited resources to store or transmit data (e.g. storage space, channel capacity). In these scenarios, lossy image compression plays a crucial role in effectively increasing the number of images that can be collected under such constraints. However, lossy compression entails undesired degradation of the data that may harm the performance of the downstream analysis task, since important semantic information may be lost in the process. Moreover, we may have only compressed images at training time but original images at inference time, or vice versa; in such cases, the downstream model suffers from covariate shift. In this paper, we analyze this phenomenon, with a special focus on vision-based perception for autonomous driving as a paradigmatic scenario. We show that loss of semantic information and covariate shift do indeed occur, resulting in a drop in performance that depends on the compression rate. To address the problem, we propose dataset restoration, based on image restoration with generative adversarial networks (GANs). Our method is agnostic to both the particular image compression method and the downstream task, and it adds no extra cost to the deployed models, which is particularly important for resource-limited devices. The presented experiments focus on semantic segmentation as a challenging use case, cover a broad range of compression rates and diverse datasets, and show how our method significantly alleviates the negative effects of compression on the downstream visual task.
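The following is a minimal sketch, not the authors' code, of the offline "dataset restoration" idea described above: training images are lossily compressed (JPEG is used here only as an illustrative codec) and then restored before training, so the deployed downstream model is left unchanged at inference time. The class `GanRestorer`, the directory paths, and the function names are hypothetical placeholders introduced purely for illustration.

```python
# Sketch of offline dataset restoration (assumed interface, not the paper's implementation).
import io
from pathlib import Path
from PIL import Image


def jpeg_round_trip(img: Image.Image, quality: int) -> Image.Image:
    """Simulate lossy compression by encoding and decoding at a given JPEG quality."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")


class GanRestorer:
    """Placeholder for a GAN-based restoration model (hypothetical stand-in)."""

    def restore(self, img: Image.Image) -> Image.Image:
        # A real implementation would run the compressed image through the
        # trained generator network; here we simply return the input unchanged.
        return img


def restore_dataset(src_dir: str, dst_dir: str, quality: int = 10) -> None:
    """Compress and restore every image, writing a 'restored' copy of the training set."""
    restorer = GanRestorer()
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).glob("*.png")):
        degraded = jpeg_round_trip(Image.open(path).convert("RGB"), quality)
        restored = restorer.restore(degraded)
        restored.save(out / path.name)


if __name__ == "__main__":
    # Hypothetical paths: build a restored copy of a compressed training split.
    restore_dataset("data/train_images", "data/train_images_restored", quality=10)
```

Because restoration is applied to the dataset offline, any segmentation model can then be trained on the restored images without architectural changes or added inference cost, which is the property highlighted in the abstract.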