RewriteNet:现实场景文字通过真实世界图像中的编辑文本生成图像 (RewriteNet: Realistic Scene Text Image Generation via Editing Text in Real-world Image) - 专知论文

会员服务 ·

0

INFORMS · 原点 · Performer · Better · Performance ·

2021 年 7 月 23 日

RewriteNet: Realistic Scene Text Image Generation via Editing Text in Real-world Image

翻译：RewriteNet:现实场景文字通过真实世界图像中的编辑文本生成图像

Junyeop Lee,Yoonsik Kim,Seonghyeon Kim,Moonbin Yim,Seung Shin,Gayoung Lee,Sungrae Park

Scene text editing (STE), which converts a text in a scene image into the desired text while preserving an original style, is a challenging task due to a complex intervention between text and style. To address this challenge, we propose a novel representational learning-based STE model, referred to as RewriteNet that employs textual information as well as visual information. We assume that the scene text image can be decomposed into content and style features where the former represents the text information and style represents scene text characteristics such as font, alignment, and background. Under this assumption, we propose a method to separately encode content and style features of the input image by introducing the scene text recognizer that is trained by text information. Then, a text-edited image is generated by combining the style feature from the original image and the content feature from the target text. Unlike previous works that are only able to use synthetic images in the training phase, we also exploit real-world images by proposing a self-supervised training scheme, which bridges the domain gap between synthetic and real data. Our experiments demonstrate that RewriteNet achieves better quantitative and qualitative performance than other comparisons. Moreover, we validate that the use of text information and the self-supervised training scheme improves text switching performance. The implementation and dataset will be publicly available.

翻译：场景文本编辑 (STE) 将场景图像中的文本转换成理想文本, 并同时保留原始风格, 这是一项艰巨的任务, 原因是文本和风格之间的复杂干预。为了应对这一挑战, 我们提出一个新的基于演示学习的STE模型, 称为RewriteNet, 使用文本信息以及视觉信息。我们假设场景文本图像可以分解成内容和风格特征, 前者代表文本信息, 风格代表字体、校正和背景等场景文本特性。在此假设下, 我们提出一种方法, 通过引入由文本信息培训的场景文本识别器, 来单独编码输入图像的内容和风格特征。然后, 通过将原始图像的样式特征与目标文本的内容合并来生成文本编辑图像。与以往只能在培训阶段使用合成图像的工程不同, 我们还利用真实世界图像, 提出一个自我监督的培训计划, 弥补合成数据与真实数据之间的域间差距。我们的实验表明, RewriteNet 实现更好的定量和定性性能比其他比较。此外, 我们验证文本的性变校正将使用数据。我们将使用数据将改进。

0

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

康奈尔大学「深度概率与生成模型」2021SP课程

专知会员服务

49+阅读 · 2021年4月24日

【ACL2020】对抗性文本生成，Improving Adversarial Text Generation

专知会员服务

52+阅读 · 2020年5月5日

【清华大学-腾讯】关系提取综述，Review and Outlook for Relation Extraction

【清华大学-腾讯】关系提取综述，Review and Outlook for Relation Extraction

专知会员服务

38+阅读 · 2020年4月8日

【香港中文大学-CVPR2020】Rotate-and-Render: Unsupervised Photorealistic Face Rotation from Single-View Images

【香港中文大学-CVPR2020】Rotate-and-Render: Unsupervised Photorealistic Face Rotation from Single-View Images

专知会员服务

22+阅读 · 2020年3月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【IJCAI 2019 | tutorial】文本生成中的艺术字 Creative and Artistic Writing via Text Generation，北京大学|严睿

【IJCAI 2019 | tutorial】文本生成中的艺术字 Creative and Artistic Writing via Text Generation，北京大学|严睿

专知会员服务

16+阅读 · 2019年8月12日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知

133+阅读 · 2020年3月18日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

gan生成图像at 1024² 的代码论文

gan生成图像at 1024² 的代码论文

CreateAMind

4+阅读 · 2017年10月31日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Generative Adversarial Text to Image Synthesis论文解读

Generative Adversarial Text to Image Synthesis论文解读

统计学习与视觉计算组

13+阅读 · 2017年6月9日

Scene Graph Generation for Better Image Captioning?

Scene Graph Generation for Better Image Captioning?

Arxiv

0+阅读 · 2021年9月23日

In-Domain GAN Inversion for Real Image Editing

Arxiv

3+阅读 · 2020年7月16日

SwapText: Image Based Texts Transfer in Scenes

SwapText: Image Based Texts Transfer in Scenes

Arxiv

4+阅读 · 2020年3月18日

Bridging Knowledge Graphs to Generate Scene Graphs

Bridging Knowledge Graphs to Generate Scene Graphs

Arxiv

5+阅读 · 2020年1月7日

Pluralistic Image Completion

Pluralistic Image Completion

Arxiv

8+阅读 · 2019年3月11日

Using Scene Graph Context to Improve Image Generation

Using Scene Graph Context to Improve Image Generation

Arxiv

3+阅读 · 2019年1月15日

Mask-aware Photorealistic Face Attribute Manipulation

Arxiv

5+阅读 · 2018年4月24日

Pose-Normalized Image Generation for Person Re-identification

Arxiv

5+阅读 · 2018年2月13日

Disentangled Person Image Generation

Arxiv

7+阅读 · 2018年1月21日

Semi-supervised FusedGAN for Conditional Image Generation

Arxiv

8+阅读 · 2018年1月17日

VIP会员

文章信息

相关主题

相关VIP内容

康奈尔大学「深度概率与生成模型」2021SP课程

专知会员服务

49+阅读 · 2021年4月24日

【ACL2020】对抗性文本生成，Improving Adversarial Text Generation

专知会员服务

52+阅读 · 2020年5月5日

【清华大学-腾讯】关系提取综述，Review and Outlook for Relation Extraction

【清华大学-腾讯】关系提取综述，Review and Outlook for Relation Extraction

专知会员服务

38+阅读 · 2020年4月8日

【香港中文大学-CVPR2020】Rotate-and-Render: Unsupervised Photorealistic Face Rotation from Single-View Images

【香港中文大学-CVPR2020】Rotate-and-Render: Unsupervised Photorealistic Face Rotation from Single-View Images

专知会员服务

22+阅读 · 2020年3月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【IJCAI 2019 | tutorial】文本生成中的艺术字 Creative and Artistic Writing via Text Generation，北京大学|严睿

【IJCAI 2019 | tutorial】文本生成中的艺术字 Creative and Artistic Writing via Text Generation，北京大学|严睿

专知会员服务

16+阅读 · 2019年8月12日

热门VIP内容

开通专知VIP会员享更多权益服务

《美国太空军系统全生命周期建模、仿真与分析效能提升方案》最新84页报告

《商用大语言模型的升级风险管理：国家安全运用》

自主人工智能：未来战争是否将是自主化的？

《从装备到文化：美陆军技术素养建设启示录》最新报告

相关资讯

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知

133+阅读 · 2020年3月18日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

gan生成图像at 1024² 的代码论文

gan生成图像at 1024² 的代码论文

CreateAMind

4+阅读 · 2017年10月31日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Generative Adversarial Text to Image Synthesis论文解读

Generative Adversarial Text to Image Synthesis论文解读

统计学习与视觉计算组

13+阅读 · 2017年6月9日

相关论文

Scene Graph Generation for Better Image Captioning?

Scene Graph Generation for Better Image Captioning?

Arxiv

0+阅读 · 2021年9月23日

In-Domain GAN Inversion for Real Image Editing

Arxiv

3+阅读 · 2020年7月16日

SwapText: Image Based Texts Transfer in Scenes

SwapText: Image Based Texts Transfer in Scenes

Arxiv

4+阅读 · 2020年3月18日

Bridging Knowledge Graphs to Generate Scene Graphs

Bridging Knowledge Graphs to Generate Scene Graphs

Arxiv

5+阅读 · 2020年1月7日

Pluralistic Image Completion

Pluralistic Image Completion

Arxiv

8+阅读 · 2019年3月11日

Using Scene Graph Context to Improve Image Generation

Using Scene Graph Context to Improve Image Generation

Arxiv

3+阅读 · 2019年1月15日

Mask-aware Photorealistic Face Attribute Manipulation

Arxiv

5+阅读 · 2018年4月24日

Pose-Normalized Image Generation for Person Re-identification

Arxiv

5+阅读 · 2018年2月13日

Disentangled Person Image Generation

Arxiv

7+阅读 · 2018年1月21日

Semi-supervised FusedGAN for Conditional Image Generation

Arxiv

8+阅读 · 2018年1月17日

微信扫码咨询专知VIP会员