This work addresses the problems of semantic segmentation and image super-resolution by jointly considering the performance of both when training a Generative Adversarial Network (GAN). We propose a novel architecture and domain-specific feature loss that allow super-resolution to operate as a pre-processing step to improve the performance of downstream computer vision tasks, specifically semantic segmentation. We demonstrate this approach using Nearmap's aerial imagery dataset, which covers hundreds of urban areas at 5-7 cm per pixel resolution. We show that the proposed approach improves perceived image quality as well as quantitative segmentation accuracy across all prediction classes, yielding average accuracy improvements of 11.8% and 108% at 4x and 32x super-resolution, respectively, compared with state-of-the-art single-network methods. This work demonstrates that jointly considering image-based and task-specific losses can improve the performance of both, and advances the state of the art in semantic-aware super-resolution of aerial imagery.