Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion - Zhuanzhi (专知) Papers

February 21, 2022

Minghui Liao, Zhisheng Zou, Zhaoyi Wan, Cong Yao, Xiang Bai

From arXiv. Accepted by TPAMI. arXiv admin note: substantial text overlap with arXiv:1911.08947

Recently, segmentation-based scene text detection methods have drawn extensive attention in the scene text detection field, because of their superiority in detecting the text instances of arbitrary shapes and extreme aspect ratios, profiting from the pixel-level descriptions. However, the vast majority of the existing segmentation-based approaches are limited to their complex post-processing algorithms and the scale robustness of their segmentation models, where the post-processing algorithms are not only isolated to the model optimization but also time-consuming and the scale robustness is usually strengthened by fusing multi-scale feature maps directly. In this paper, we propose a Differentiable Binarization (DB) module that integrates the binarization process, one of the most important steps in the post-processing procedure, into a segmentation network. Optimized along with the proposed DB module, the segmentation network can produce more accurate results, which enhances the accuracy of text detection with a simple pipeline. Furthermore, an efficient Adaptive Scale Fusion (ASF) module is proposed to improve the scale robustness by fusing features of different scales adaptively. By incorporating the proposed DB and ASF with the segmentation network, our proposed scene text detector consistently achieves state-of-the-art results, in terms of both detection accuracy and speed, on five standard benchmarks.
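
To make the idea of folding binarization into the network more concrete, here is a minimal sketch of how a hard threshold can be replaced by a smooth, differentiable approximation. It assumes the commonly used sigmoid form 1 / (1 + exp(-k (P - T))), where P is a per-pixel text probability and T is a per-pixel threshold. The function names, the steepness `k = 50`, and the sample values are illustrative assumptions, not details stated in the abstract above.

```python
import math


def hard_binarize(p, t=0.3):
    """Standard binarization: a step function at threshold t.

    Its derivative is zero almost everywhere, so it cannot be
    trained jointly with the segmentation network.
    """
    return 1.0 if p >= t else 0.0


def differentiable_binarize(p, t, k=50.0):
    """Smooth approximation of the step: 1 / (1 + exp(-k * (p - t))).

    k controls steepness (assumed value); larger k is closer to a hard step.
    """
    z = k * (p - t)
    # Numerically stable sigmoid for both signs of z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def db_gradient_wrt_p(p, t, k=50.0):
    """Derivative of the smooth binarization with respect to p: k * b * (1 - b)."""
    b = differentiable_binarize(p, t, k)
    return k * b * (1.0 - b)


if __name__ == "__main__":
    # Hypothetical probability values around a threshold of 0.3.
    for p in (0.10, 0.28, 0.30, 0.32, 0.90):
        print(p, hard_binarize(p), round(differentiable_binarize(p, 0.3), 4),
              round(db_gradient_wrt_p(p, 0.3), 4))
```

The gradient is largest for pixels whose probability is close to the threshold, which is where a binarization decision is most uncertain, and it is amplified by the factor `k`. A hard threshold provides no such training signal.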
