Although deep convolutional neural networks (CNNs) have achieved great success in computer vision tasks, their real-world application is still impeded by their voracious demand for computational resources. Current works mostly seek to compress a network by reducing its parameters or parameter-incurred computation, neglecting the influence of the input image on the system complexity. Based on the fact that the input images of a CNN contain substantial redundancy, in this paper we propose a unified framework, dubbed ThumbNet, to simultaneously accelerate and compress CNN models by enabling them to infer on one thumbnail image. We provide three effective strategies to train ThumbNet. In doing so, ThumbNet learns an inference network that performs as well on small images as the original-input network does on large images. With ThumbNet, not only do we obtain a thumbnail-input inference network that drastically reduces computation and memory requirements, but we also obtain an image downscaler that can generate thumbnail images for generic classification tasks. Extensive experiments show the effectiveness of ThumbNet and demonstrate that the thumbnail-input inference network it learns can adequately retain the accuracy of the original-input network even when the input images are downscaled 16 times.
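To make the idea concrete, below is a minimal sketch (not the authors' code) of the setup the abstract describes: a learnable downscaler produces a thumbnail, a thumbnail-input inference network classifies it, and training matches the frozen original-input network's outputs via a distillation-style loss. The abstract does not specify the three training strategies, so this sketch shows only the general teacher-student scheme; all module names, layer choices, and hyperparameters (e.g., `Downscaler`, `factor`, the temperature `T`) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Downscaler(nn.Module):
    """Hypothetical learnable image downscaler: a small conv body followed by
    resizing that shrinks each spatial side by `factor` (16x fewer pixels
    when factor=4, matching the 16x downscaling mentioned in the abstract)."""
    def __init__(self, factor=4):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 3, 3, padding=1),
        )
        self.factor = factor

    def forward(self, x):
        h, w = x.shape[-2] // self.factor, x.shape[-1] // self.factor
        return F.interpolate(self.body(x), size=(h, w),
                             mode='bilinear', align_corners=False)

def distillation_step(downscaler, student, teacher, images, labels, T=4.0):
    """One training step: the student infers on thumbnails while being
    supervised both by ground-truth labels and by the frozen teacher's
    soft predictions on the original large images."""
    with torch.no_grad():
        teacher_logits = teacher(images)        # original-input network
    thumbs = downscaler(images)                 # e.g., 224x224 -> 56x56
    student_logits = student(thumbs)            # thumbnail-input network
    loss_ce = F.cross_entropy(student_logits, labels)
    loss_kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                       F.softmax(teacher_logits / T, dim=1),
                       reduction='batchmean') * T * T
    return loss_ce + loss_kd
```

At test time only `downscaler` and `student` are kept, which is where the computation and memory savings come from; the trained downscaler can also be reused to generate thumbnails for other classification tasks, as the abstract notes.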