Latency-critical computer vision systems, such as autonomous driving or drone control, require fast image or video compression when offloading neural network inference to a remote computer. To ensure low latency on a near-sensor edge device, we propose using lightweight constant-bitrate encoders with pruned encoding configurations, namely ASTC and JPEG XS. Pruning introduces significant distortion, which we show can be recovered by retraining the neural network on compressed data after decompression. This approach modifies neither the network architecture nor the coding format. By retraining with compressed datasets, we reduced the degradation in classification accuracy and segmentation mean intersection over union (mIoU) caused by ASTC compression to 4.9-5.0 percentage points (pp) and 4.4-4.0 pp, respectively. With the same method, the mIoU loss due to JPEG XS compression at the main profile was reduced to 2.7-2.3 pp. In terms of encoding speed, our ASTC encoder implementation is 2.3x faster than JPEG. Although the JPEG XS reference encoder still requires optimization to reach low latency, we showed that disabling significance flag coding saves 22-23% of encoding time at a cost of 0.4-0.3 pp mIoU after retraining.
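The retraining strategy above amounts to a training-time transform: each image is passed through the lossy codec's encode/decode round trip so the network learns on the same artifacts it will see at inference. The following minimal Python sketch illustrates the idea; since ASTC and JPEG XS encoders are not available in the standard Python ecosystem, plain uniform quantization stands in here for the lossy codec, and the function name `lossy_roundtrip` is illustrative, not from the paper.

```python
def lossy_roundtrip(pixels, step=16):
    """Crude stand-in for a lossy encode/decode round trip.

    Uniform quantization: encoding maps each 8-bit value to a bin,
    decoding reconstructs the bin centre. The reconstruction error
    plays the role of the codec distortion the network is retrained on.
    """
    return [min(255, (p // step) * step + step // 2) for p in pixels]

# In a real pipeline, this transform would be applied to every
# training image before the usual augmentation and normalization.
original = [0, 7, 120, 255]
reconstructed = lossy_roundtrip(original)
```

In an actual setup the round trip would invoke the real ASTC or JPEG XS encoder and decoder, so the distortion statistics match deployment exactly; the quantizer above only conveys the structure of the transform.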