We propose a deep generative model that performs typography analysis and font reconstruction by learning disentangled manifolds of both font style and character shape. Our approach enables us to massively scale up the number of character types we can effectively model compared to previous methods. Specifically, we infer separate latent variables representing character and font via a pair of inference networks which take as input sets of glyphs that either all share a character type, or belong to the same font. This design allows our model to generalize to characters that were not observed during training, an important capability in light of the relative sparsity of most fonts. We also put forward a new loss, adapted from prior work, that measures likelihood using an adaptive distribution in a projected space, resulting in more natural images without requiring a discriminator. We evaluate on the task of font reconstruction over various datasets representing character types of many languages, and compare favorably to modern style transfer systems according to both automatic and human-evaluated metrics.
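To make the dual-inference design concrete, here is a minimal sketch, not the paper's implementation: two permutation-invariant set encoders (sketched as a linear map followed by mean-pooling) produce a character latent from glyphs sharing a character type and a font latent from glyphs sharing a font, and a decoder reconstructs a glyph from the concatenated latents. All dimensions, weight shapes, and function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): glyphs are flattened
# to D pixels; both latents are K-dimensional.
D, K = 64, 8

# Each inference network is sketched as a linear map followed by
# mean-pooling over its input set, so the inferred latent is
# invariant to the order of the conditioning glyphs.
W_char = rng.normal(0, 0.1, (K, D))      # encoder for the character latent
W_font = rng.normal(0, 0.1, (K, D))      # encoder for the font latent
W_dec = rng.normal(0, 0.1, (D, 2 * K))   # decoder: (char, font) -> glyph

def encode(W, glyph_set):
    """Pool a set of glyphs (N x D) into one K-dimensional latent."""
    return (glyph_set @ W.T).mean(axis=0)

def reconstruct(char_set, font_set):
    """Infer both latents, then decode a glyph for this (char, font) pair."""
    z_char = encode(W_char, char_set)  # from glyphs sharing a character type
    z_font = encode(W_font, font_set)  # from glyphs sharing a font
    return W_dec @ np.concatenate([z_char, z_font])

# The same character rendered in 5 other fonts, plus 3 other glyphs from
# the target font: the target (character, font) cell itself need not have
# been observed, which is how the model handles sparse fonts.
char_set = rng.normal(size=(5, D))
font_set = rng.normal(size=(3, D))
glyph = reconstruct(char_set, font_set)
print(glyph.shape)  # (64,)
```

Because each encoder pools over a set, the same networks apply regardless of how many example glyphs are available, which is what lets the model condition on characters or fonts unseen during training.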