While most existing image outpainting methods conduct horizontal extrapolation, we study the generalised image outpainting problem, which extrapolates visual context on all sides of a given image. To this end, we develop a novel transformer-based generative adversarial network, called U-Transformer, that can extend image borders with plausible structure and details, even for complicated scenery images. Specifically, we design the generator as an encoder-to-decoder structure embedded with the popular Swin Transformer blocks. As such, our novel framework can better cope with long-range dependencies in images, which are crucially important for generalised image outpainting. We additionally propose a U-shaped structure and a multi-view Temporal Spatial Predictor network to reinforce image self-reconstruction as well as smooth and realistic prediction of the unknown parts. We experimentally demonstrate that our proposed method produces visually appealing results for generalised image outpainting compared with state-of-the-art image outpainting approaches.
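To make the overall encoder-to-decoder idea concrete, the following is a minimal, hypothetical sketch of a transformer-based outpainting generator that maps a small input image to a larger all-side-extended canvas. Plain `nn.TransformerEncoderLayer` blocks stand in for the Swin Transformer blocks used in the paper, and all module names, layer counts, and sizes (`OutpaintingGenerator`, `dim`, `patch`, `in_size`, `out_size`) are illustrative assumptions rather than the authors' implementation.

```python
# Hypothetical sketch: encoder-to-decoder generator for all-side outpainting.
# Standard transformer encoder layers are used as a stand-in for Swin blocks.
import torch
import torch.nn as nn


class OutpaintingGenerator(nn.Module):
    def __init__(self, in_ch=3, dim=96, patch=8, in_size=128, out_size=192):
        super().__init__()
        self.out_size, self.patch = out_size, patch
        # Encoder: patch embedding followed by transformer blocks
        # (long-range context aggregation).
        self.embed = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)
        block = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                           dim_feedforward=4 * dim,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(block, num_layers=4)
        # Decoder: grow the token grid to cover the larger canvas,
        # then upsample back to RGB.
        n_in = (in_size // patch) ** 2
        n_out = (out_size // patch) ** 2
        self.expand = nn.Linear(n_in, n_out)  # extend tokens on all sides
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(dim, dim // 2, patch, stride=patch),
            nn.GELU(),
            nn.Conv2d(dim // 2, in_ch, kernel_size=3, padding=1),
            nn.Tanh(),
        )

    def forward(self, x):
        # x: (B, 3, in_size, in_size) -> (B, 3, out_size, out_size)
        tokens = self.embed(x).flatten(2).transpose(1, 2)  # (B, N_in, dim)
        tokens = self.encoder(tokens)                      # global context
        tokens = self.expand(tokens.transpose(1, 2))       # (B, dim, N_out)
        h = w = self.out_size // self.patch
        feat = tokens.reshape(x.size(0), -1, h, w)
        return self.decoder(feat)


if __name__ == "__main__":
    g = OutpaintingGenerator()
    out = g(torch.randn(1, 3, 128, 128))
    print(out.shape)  # torch.Size([1, 3, 192, 192])
```

In this sketch the border extension is handled by a simple linear token expansion; the paper's U-shaped structure and multi-view Temporal Spatial Predictor, which guide self-reconstruction and unknown-part prediction, are not modelled here.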