Font generation is a difficult and time-consuming task, especially for languages such as Chinese, whose ideograms have complicated structures and whose character sets are very large. To address this problem, few-shot and even one-shot font generation have attracted considerable attention. However, most existing font generation methods still struggle with (i) large cross-font gaps, (ii) subtle cross-font variations, and (iii) incorrect generation of complicated characters. In this paper, we propose a novel one-shot font generation method based on a diffusion model, named Diff-Font, which can be trained stably on large datasets. The proposed model aims to generate an entire font library given only one sample as the reference. Specifically, we construct a large stroke-wise dataset and propose a stroke-wise diffusion model to preserve the structure and completeness of each generated character. To the best of our knowledge, Diff-Font is the first work to apply diffusion models to the font generation task. The well-trained Diff-Font is not only robust to font gaps and font variations, but also achieves promising performance on the generation of difficult characters. Compared with previous font generation methods, our model reaches state-of-the-art performance both qualitatively and quantitatively.