In this paper, we develop a novel transformer-based generative adversarial network, called U-Transformer, for the generalised image outpainting problem. Unlike most existing image outpainting methods, which only extrapolate horizontally, our generalised image outpainting extrapolates visual context on all sides of a given image with plausible structure and details, even for complicated scenery, building, and art images. Specifically, we design the generator as an encoder-to-decoder structure embedded with the popular Swin Transformer blocks. As such, our network can better cope with long-range dependencies in images, which are crucially important for generalised image outpainting. We additionally propose a U-shaped structure and a multi-view Temporal Spatial Predictor (TSP) module to reinforce image self-reconstruction as well as smooth and realistic prediction of the unknown regions. By adjusting the prediction step of the TSP module at the testing stage, we can generate outpainted images of arbitrary size from a given input sub-image. We experimentally demonstrate that our proposed method produces visually appealing results for generalised image outpainting compared with state-of-the-art image outpainting approaches.
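To make the described pipeline concrete, the following is a minimal, hypothetical PyTorch sketch of the encoder-to-predictor-to-decoder flow: the Swin Transformer blocks are replaced by a plain nn.TransformerEncoder stand-in, and the ToyUTransformer class, its feature-expansion step, and all layer shapes are illustrative assumptions rather than the paper's actual U-Transformer.

    # Minimal conceptual sketch (not the authors' implementation) of the
    # encoder -> multi-view predictor -> decoder pipeline described above.
    # Swin Transformer blocks are approximated by a plain nn.TransformerEncoder;
    # the TSP prediction step and all shapes are illustrative assumptions.
    import torch
    import torch.nn as nn

    class ToyUTransformer(nn.Module):
        def __init__(self, dim=96, num_heads=4, depth=2):
            super().__init__()
            # Patch embedding: 4x4 patches of an RGB image -> token features.
            self.patch_embed = nn.Conv2d(3, dim, kernel_size=4, stride=4)
            layer = nn.TransformerEncoderLayer(dim, num_heads, dim * 4, batch_first=True)
            # Stand-in for the Swin-based encoder blocks.
            self.encoder = nn.TransformerEncoder(layer, depth)
            # Stand-in for one "prediction step": it grows the known feature map
            # outward on all sides (outpainting in feature space).
            self.expand = nn.ConvTranspose2d(dim, dim, kernel_size=3, stride=1, padding=0)
            # Decoder: tokens back to RGB pixels (4x upsampling mirrors the 4x4 patches).
            self.decoder = nn.ConvTranspose2d(dim, 3, kernel_size=4, stride=4)

        def forward(self, x, steps=1):
            # x: (B, 3, H, W) sub-image; steps: how far to extrapolate outward.
            feat = self.patch_embed(x)                   # (B, C, H/4, W/4)
            b, c, h, w = feat.shape
            tokens = feat.flatten(2).transpose(1, 2)     # (B, N, C)
            tokens = self.encoder(tokens)
            feat = tokens.transpose(1, 2).reshape(b, c, h, w)
            # Repeating the prediction step grows the feature map on all sides,
            # which is how arbitrary outpainting sizes are obtained at test time.
            for _ in range(steps):
                feat = self.expand(feat)                 # each step adds a one-token border
            return self.decoder(feat)                    # (B, 3, H + 8*steps, W + 8*steps)

    if __name__ == "__main__":
        model = ToyUTransformer()
        out = model(torch.randn(1, 3, 128, 128), steps=2)
        print(out.shape)  # torch.Size([1, 3, 144, 144])

In this sketch, increasing the steps argument at inference time plays the role of adjusting the TSP prediction step: each pass enlarges the feature map before decoding, so the output resolution is not fixed by the training configuration.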