Scene text editing (STE) aims to replace the text in a scene image with desired new text while preserving the background and the style of the original text. However, because of complicated background textures and diverse text styles, existing methods fall short of generating clear, legible edited text images. In this study, we attribute the poor editing performance to two problems: 1) Implicit decoupling structure. Previous methods edit the whole image and therefore have to learn the different translation rules of background and text regions simultaneously. 2) Domain gap. Because edited real scene text images are lacking, the network can only be trained well on synthetic pairs and performs poorly on real-world images. To handle these problems, we propose a novel network that works by MOdifying Scene Text image at strokE Level (MOSTEL). First, we generate stroke guidance maps that explicitly indicate the regions to be edited. Unlike implicit approaches that directly modify all pixels at the image level, these explicit instructions filter out distractions from the background and guide the network to focus on the editing rules of text regions. Second, we propose Semi-supervised Hybrid Learning to train the network with both labeled synthetic images and unpaired real scene text images, so that the STE model adapts to the distributions of real-world datasets. Moreover, we propose two new datasets (Tamper-Syn2k and Tamper-Scene) to fill the gap in public evaluation datasets. Extensive experiments demonstrate that MOSTEL outperforms previous methods both qualitatively and quantitatively. Datasets and code will be available at https://github.com/qqqyd/MOSTEL.
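To make the idea of stroke-level guidance concrete, the following is a minimal sketch of how a stroke map can restrict an edit to text pixels. It is not the MOSTEL architecture: the per-pixel blending rule, the function name composite_with_stroke_mask, and the toy grayscale values are all assumptions chosen for illustration. The sketch only shows that pixels outside the stroke region are copied from the background, so a model guided this way never has to relearn them.

```python
def composite_with_stroke_mask(background, edited, stroke_mask):
    """Blend two grayscale images with a soft stroke mask.

    Hypothetical helper (not from the paper). Each output pixel is
    mask * edited + (1 - mask) * background, so mask = 0 keeps the
    background untouched and mask = 1 takes the edited text pixel.
    All inputs are equally sized lists of rows of floats in [0, 1].
    """
    if not (len(background) == len(edited) == len(stroke_mask)):
        raise ValueError("inputs must have the same number of rows")
    output = []
    for bg_row, ed_row, mask_row in zip(background, edited, stroke_mask):
        if not (len(bg_row) == len(ed_row) == len(mask_row)):
            raise ValueError("rows must have the same length")
        output.append(
            [m * e + (1.0 - m) * b for b, e, m in zip(bg_row, ed_row, mask_row)]
        )
    return output


# Toy 2x3 example: a flat dark background and a flat bright "edited" image.
background = [[0.2, 0.2, 0.2], [0.2, 0.2, 0.2]]
edited = [[0.9, 0.9, 0.9], [0.9, 0.9, 0.9]]
# Assumed stroke map: 1.0 on text strokes, 0.0 on background, 0.5 on a soft edge.
stroke_mask = [[0.0, 1.0, 0.0], [0.0, 0.5, 1.0]]

result = composite_with_stroke_mask(background, edited, stroke_mask)
```

In this toy case, pixels with a mask value of 0 keep the background intensity, pixels with a mask value of 1 take the edited intensity, and the soft-edge pixel lands halfway between the two.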