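This year's CVPR includes 22 papers related to STR (scene text recognition) or DAR (document image analysis and recognition), 5 more than last year (CVPR 2019, 17 papers), which suggests that research interest in this field continues to grow.
Full-text PDFs of the CVPR 2020 papers can already be downloaded from the official website: http://openaccess.thecvf.com/CVPR2020.py
This article sorts the 22 papers into ten categories, including scene text detection, scene text recognition, text data synthesis, handwriting analysis and recognition, document image layout analysis, and text VQA, and lists them below. Papers marked with * have open-sourced the code for their method: 9 papers have released code, and 1 further paper has released its dataset.
1 Scene Text Detection (2 papers)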
01 Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection*
02 ContourNet: Taking a Further Step Toward Accurate Arbitrary-Shaped Scene Text Detection*
2 Scene Text Recognition (4 papers)
03 SCATTER: Selective Context Attentional Scene Text Recognizer
04 Towards Accurate Scene Text Recognition With Semantic Reasoning Networks
05 SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition*
06 On Vocabulary Reliance in Scene Text Recognition
3 End-to-End Text Detection + Recognition (1 paper)
07 ABCNet: Real-Time Scene Text Spotting With Adaptive Bezier-Curve Network*
4 Adversarial Attacks on Scene Text Recognition (1 paper)
08 What Machines See Is Not What They Get: Fooling Scene Text Recognition Models With Adversarial Text Images
5 Text Data Synthesis / Data Augmentation / Style Transfer / Scene Text Editing (5 papers)
09 ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation
10 Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition*
11 UnrealText: Synthesizing Realistic Scene Text Images From the Unreal World*
12 SwapText: Image Based Texts Transfer in Scenes
13 STEFANN: Scene Text Editor Using Font Adaptive Neural Network*
6 Document Image Processing (Shadow Removal, Shredded Document Reconstruction) (2 papers)
14 BEDSR-Net: A Deep Shadow Removal Network From a Single Document Image (the paper states that its dataset and code will be open-sourced)
15 Fast(er) Reconstruction of Shredded Text Documents via Self-Supervised Deep Asymmetric Metric Learning
7 Handwriting Analysis and Recognition (2 papers)
16 Sequential Motif Profiles and Topological Plots for Offline Signature Verification
17 OrigamiNet: Weakly-Supervised, Segmentation-Free, One-Step, Full Page Text Recognition by learning to unfold*
8 Document Image Layout Analysis (1 paper)
18 Cross-Domain Document Object Detection: Benchmark Suite and Method
9 Text VQA (3 papers)
19 On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering (dataset publicly released)
20 Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text
21 Iterative Answer Prediction With Pointer-Augmented Multimodal Transformers for TextVQA
10 Other (1 paper)
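Strictly speaking, the following paper is not an OCR or DAR paper (it concerns fundamental computer vision and image processing techniques), but because MSER was once one of the most important methods in text detection, the editor has included it here.
22 Fast MSER*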