Generating synthetic images of handwritten text in a writer-specific style is a challenging task, especially for unseen styles and new words, and even more so when these words contain characters that are rarely encountered during training. While emulating a writer's style has recently been addressed by generative models, generalization towards rare characters has been disregarded. In this work, we devise a Transformer-based model for few-shot styled handwritten text generation and focus on obtaining a robust and informative representation of both the text and the style. In particular, we propose a novel representation of the textual content as a sequence of dense vectors obtained from images of symbols rendered as standard GNU Unifont glyphs, which can be considered their visual archetypes. This strategy is better suited to generating characters that, despite being rarely seen during training, may share visual details with frequently observed ones. As for the style, we obtain a robust representation of unseen writers' calligraphy by exploiting dedicated pre-training on a large synthetic dataset. Quantitative and qualitative results demonstrate the effectiveness of our proposal in generating words in unseen styles and with rare characters more faithfully than existing approaches that rely on independent one-hot encodings of the characters.
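To make the content representation concrete, the following is a minimal sketch, not the paper's actual pipeline, of how a string could be mapped to a sequence of dense vectors by rasterizing each character as a GNU Unifont-style glyph bitmap and flattening it. The font path, the 16x16 glyph size, and the function names are illustrative assumptions; it requires Pillow, NumPy, and a local copy of GNU Unifont.

# Sketch: encode a string as a sequence of dense "visual archetype" vectors
# by rasterizing each character with GNU Unifont and flattening the bitmap.
import numpy as np
from PIL import Image, ImageDraw, ImageFont

UNIFONT_PATH = "unifont.ttf"  # assumed local copy of GNU Unifont

def char_to_archetype(ch: str, size: int = 16) -> np.ndarray:
    """Rasterize one character as a size x size grayscale glyph image."""
    font = ImageFont.truetype(UNIFONT_PATH, size)
    img = Image.new("L", (size, size), color=0)
    ImageDraw.Draw(img).text((0, 0), ch, fill=255, font=font)
    return np.asarray(img, dtype=np.float32) / 255.0

def text_to_content_sequence(text: str) -> np.ndarray:
    """Map a string to a (len(text), 256) array: one flattened 16x16 glyph
    bitmap per character. Visually similar characters (e.g. accented
    variants of a frequent letter) thus receive similar encodings,
    unlike independent one-hot codes."""
    return np.stack([char_to_archetype(c).reshape(-1) for c in text])

if __name__ == "__main__":
    seq = text_to_content_sequence("naïve")
    print(seq.shape)  # (5, 256): one 16*16 = 256-dim vector per character

Under this scheme, a rare character such as "ï" is encoded by a bitmap that overlaps heavily with that of the common "i", which is the intuition behind why glyph-based content vectors can generalize where one-hot encodings cannot.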