Image resolution has a significant effect on both the accuracy of computer vision model inference and its computational, storage, and bandwidth costs. These costs are exacerbated when models are scaled out to large inference serving systems, which makes image resolution an attractive target for optimization. However, the choice of resolution inherently introduces additional, tightly coupled choices, such as image crop size, image detail, and compute kernel implementation, that also impact computational, storage, and bandwidth costs. Further complicating this setting, the optimal choices with respect to these metrics are highly dependent on the dataset and problem scenario. We characterize this tradeoff space, quantitatively studying the accuracy and efficiency tradeoff via systematic and automated tuning of image resolution, image quality, and convolutional neural network operators. With the insights from this study, we propose a dynamic resolution mechanism that removes the need to statically choose a resolution ahead of time.
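To make the compute side of this tradeoff concrete, the following minimal sketch counts multiply-accumulate operations (MACs) for a single dense 2D convolution at several input resolutions. It is an illustration only and is not taken from the study: the function name `conv2d_macs` and the layer configuration (3 to 64 channels, 7x7 kernel, stride 2, padding 3) are assumptions chosen for the example.

```python
def conv2d_macs(height, width, c_in, c_out, kernel, stride=1, padding=0):
    """Multiply-accumulate count for one dense (non-grouped) 2D convolution."""
    # Standard output-size formula for a convolution without dilation.
    out_h = (height + 2 * padding - kernel) // stride + 1
    out_w = (width + 2 * padding - kernel) // stride + 1
    # Each output element needs c_in * kernel * kernel MACs per output channel.
    return out_h * out_w * c_in * c_out * kernel * kernel


# Hypothetical first layer, assumed for illustration only.
layer = dict(c_in=3, c_out=64, kernel=7, stride=2, padding=3)

resolutions = [128, 224, 448]
macs = {r: conv2d_macs(r, r, **layer) for r in resolutions}

for r in resolutions:
    ratio = macs[r] / macs[224]
    print(f"{r}x{r}: {macs[r] / 1e6:.1f} M MACs ({ratio:.2f}x relative to 224x224)")
```

Because the output feature map grows with the input area, the MAC count of a convolution scales roughly quadratically with the side length of the image. This is one reason resolution is a natural tuning knob for inference cost.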