Neural painting refers to the procedure of producing a series of strokes for a given image and non-photorealistically recreating it using neural networks. While reinforcement learning (RL) agents can generate a stroke sequence step by step for this task, it is not easy to train a stable RL agent. On the other hand, stroke-optimization methods iteratively search for a set of stroke parameters in a large search space; such low efficiency significantly limits their prevalence and practicality. Unlike previous methods, in this paper we formulate the task as a set prediction problem and propose a novel Transformer-based framework, dubbed Paint Transformer, that predicts the parameters of a stroke set with a feed-forward network. This way, our model can generate a set of strokes in parallel and obtain the final painting of size 512 × 512 in near real time. More importantly, since no dataset is available for training the Paint Transformer, we devise a self-training pipeline so that it can be trained without any off-the-shelf dataset while still achieving excellent generalization capability. Experiments demonstrate that our method achieves better painting performance than previous ones with cheaper training and inference costs. Code and models are available.
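To make the stroke-set formulation concrete, below is a minimal NumPy sketch of what "rendering a predicted set of strokes" can look like. It is not the paper's renderer: the stroke parameterization (center, size, color), the axis-aligned rectangular brush, and the opaque paint-over compositing are simplifying assumptions for illustration; the actual method predicts richer stroke parameters with a Transformer and uses a differentiable renderer.

```python
import numpy as np

CANVAS = 512  # output resolution mentioned in the abstract


def render_strokes(strokes, size=CANVAS):
    """Composite a set of axis-aligned rectangular strokes onto a canvas.

    `strokes` is an (N, 7) array: cx, cy, w, h (all in [0, 1]),
    followed by an RGB color in [0, 1]. Rotation and soft brush
    textures used in the paper are omitted for brevity.
    """
    canvas = np.ones((size, size, 3))  # start from a white canvas
    for cx, cy, w, h, r, g, b in strokes:
        # Convert normalized stroke geometry to pixel coordinates.
        x0 = max(int((cx - w / 2) * size), 0)
        x1 = min(int((cx + w / 2) * size), size)
        y0 = max(int((cy - h / 2) * size), 0)
        y1 = min(int((cy + h / 2) * size), size)
        canvas[y0:y1, x0:x1] = (r, g, b)  # later strokes paint over earlier ones
    return canvas


# A toy "predicted" stroke set: a large red block, then a small blue one on top.
strokes = np.array([
    [0.5, 0.5, 0.6, 0.6, 0.8, 0.2, 0.2],
    [0.5, 0.5, 0.2, 0.2, 0.1, 0.1, 0.9],
])
out = render_strokes(strokes)
```

The key point the abstract makes is that the whole `strokes` array is produced in one feed-forward pass rather than one stroke per RL step, so rendering cost, not sequential decision-making, dominates inference.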