In most existing learning systems, images are typically viewed as 2D pixel arrays. However, in another paradigm gaining popularity, a 2D image is represented as an implicit neural representation (INR): an MLP that predicts an RGB pixel value given its (x, y) coordinate. In this paper, we propose two novel architectural techniques for building INR-based image decoders, factorized multiplicative modulation and multi-scale INRs, and use them to build a state-of-the-art continuous image GAN. Previous attempts to adapt INRs for image generation were limited to MNIST-like datasets and did not scale to complex real-world data. Our proposed INR-GAN architecture improves the performance of continuous image generators severalfold, greatly reducing the gap between continuous image GANs and pixel-based ones. In addition, we explore several exciting properties of INR-based decoders, such as out-of-the-box super-resolution, meaningful image-space interpolation, accelerated inference of low-resolution images, the ability to extrapolate outside of image boundaries, and a strong geometric prior. The project page is located at https://universome.github.io/inr-gan.
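To make the INR idea concrete, the following is a minimal sketch of an image stored as a function from coordinates to colors rather than as a pixel array. It is not the INR-GAN architecture: the weights are random and untrained, and the layer sizes, the `tanh` activation, the sigmoid output, and the helper names `init_mlp`, `inr`, and `render` are all assumptions made for illustration. It only shows why such a representation can be sampled at any resolution and queried outside the unit square.

```python
import math
import random


def init_mlp(sizes, seed=0):
    """Random, untrained weights for an MLP with the given layer sizes (illustrative only)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def inr(layers, x, y):
    """Map a single (x, y) coordinate to an (r, g, b) triple with channels in [0, 1]."""
    h = [x, y]
    for i, (weights, biases) in enumerate(layers):
        h = [sum(w * v for w, v in zip(row, h)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            h = [math.tanh(v) for v in h]  # hidden activation: an assumption
    return tuple(1.0 / (1.0 + math.exp(-v)) for v in h)  # sigmoid output: an assumption


def render(layers, height, width, lo=0.0, hi=1.0):
    """Evaluate the INR on a height x width grid spanning [lo, hi] (needs height, width >= 2)."""
    def coord(i, n):
        return lo + (hi - lo) * i / (n - 1)
    return [[inr(layers, coord(c, width), coord(r, height)) for c in range(width)]
            for r in range(height)]


layers = init_mlp([2, 16, 16, 3])
low = render(layers, 8, 8)                     # coarse sampling of the image
high = render(layers, 32, 32)                  # same image on a denser grid
wide = render(layers, 8, 8, lo=-0.5, hi=1.5)   # coordinates beyond the [0, 1] square
```

Because the image is a function, `low` and `high` are two samplings of the same underlying signal, and `wide` simply queries it at coordinates outside the original boundaries. Rendering fewer pixels also means fewer network evaluations, which is the sense in which low-resolution inference is cheaper in this sketch.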