SPSS:单点文本显示 (SPTS: Single-Point Text Spotting) - 专知论文

会员服务 ·

0

端到端 · Performer · 离散化 · state-of-the-art · 查准率/准确率 ·

2022 年 4 月 19 日

SPTS: Single-Point Text Spotting

翻译：SPSS:单点文本显示

Dezhi Peng,Xinyu Wang,Yuliang Liu,Jiaxin Zhang,Mingxin Huang,Songxuan Lai,Shenggao Zhu,Jing Li,Dahua Lin,Chunhua Shen,Lianwen Jin,Xiang Bai

Existing scene text spotting (i.e., end-to-end text detection and recognition) methods rely on costly bounding box annotations (e.g., text-line, word-level, or character-level bounding boxes). For the first time, we demonstrate that training scene text spotting models can be achieved with an extremely low-cost annotation of a single-point for each instance. We propose an end-to-end scene text spotting method that tackles scene text spotting as a sequence prediction task. Given an image as input, we formulate the desired detection and recognition results as a sequence of discrete tokens and use an auto-regressive Transformer to predict the sequence. The proposed method is simple yet effective, which can achieve state-of-the-art results on widely used benchmarks. Most significantly, we show that the performance is not very sensitive to the positions of the point annotation, meaning that it can be much easier to be annotated or even be automatically generated than the bounding box that requires precise positions. We believe that such a pioneer attempt indicates a significant opportunity for scene text spotting applications of a much larger scale than previously possible. The code will be publicly available.

翻译：现有场景文本定位(即端到端的文本检测和识别)方法依赖于昂贵的捆绑框附加说明(例如,文本线、字级或字符级绑定框)方法。我们第一次证明,培训场景文本定位模型可以通过对每个实例的单点进行极低的成本批注来实现。我们提出了一个端到端的文本定位方法,将现场文本定位作为序列预测任务进行处理。图像作为输入,我们将预期的检测和识别结果作为离散符号序列,并使用自动递增变换器来预测序列。拟议的方法简单而有效,可以在广泛使用的基准上实现最新艺术效果。最重要的是,我们表明,性能对于点标识的位置并不十分敏感,这意味着比需要精确位置的绑定框更容易得到附加说明,甚至自动生成。我们相信,这样的先驱尝试将表明一个比以往可能提供的要大得多的场景文本定位应用的显著机会。

0

相关内容

端到端

“CVPR 2021 接受论文列表 1663篇论文都在这了

专知会员服务

32+阅读 · 2021年6月12日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】深度学习目标检测概览

【推荐】深度学习目标检测概览

机器学习研究会

10+阅读 · 2017年9月1日

Periostin-avβ3-FAK-PI3K通路在褐藻糖胶抗乳腺癌转移中的作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

长链非编码RNA-HOTAIR在前列腺癌恶性进展及肿瘤干细胞中的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

长链非编码RNA CAR intergenic 10在细胞衰老中的作用和机制

国家自然科学基金

1+阅读 · 2013年12月31日

Caspase-9受凋亡复合体激活起始凋亡的结构生物学研究

国家自然科学基金

0+阅读 · 2012年12月31日

Cocycle动力学和拟周期薛定谔算子的谱

国家自然科学基金

0+阅读 · 2012年12月31日

TRAIL协同IER3调节NF-κB信号通路介导肝癌细胞凋亡的相关机制研究

国家自然科学基金

1+阅读 · 2012年12月31日

HIV-1 Tat蛋白诱导血脑屏障破坏及其作用机制

国家自然科学基金

0+阅读 · 2011年12月31日

肝癌细胞膜蛋白cytokeratin-1用于肝癌在体分子显像和靶向治疗的相关研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于微波光子学信号处理的超快飞秒测距激光雷达

国家自然科学基金

0+阅读 · 2011年12月31日

Privacy Leakage in Text Classification: A Data Extraction Approach

Arxiv

0+阅读 · 2022年6月9日

Improved two-stage hate speech classification for twitter based on Deep Neural Networks

Arxiv

0+阅读 · 2022年6月8日

Point RCNN: An Angle-Free Framework for Rotated Object Detection

Arxiv

0+阅读 · 2022年6月8日

Fooling Explanations in Text Classifiers

Arxiv

0+阅读 · 2022年6月7日

Deep Learning for UAV-based Object Detection and Tracking: A Survey

Arxiv

64+阅读 · 2021年10月25日

Multi-Object Tracking with Deep Learning Ensemble for Unmanned Aerial System Applications

Arxiv

26+阅读 · 2021年10月5日

Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection

Arxiv

13+阅读 · 2020年12月3日

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

Arxiv

19+阅读 · 2020年11月18日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Order-Free RNN with Visual Attention for Multi-Label Classification

Arxiv

16+阅读 · 2017年12月20日

VIP会员

文章信息

相关主题

state-of-the-art

查准率/准确率

相关VIP内容

“CVPR 2021 接受论文列表 1663篇论文都在这了

专知会员服务

32+阅读 · 2021年6月12日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【普林斯顿博士论文】在线学习：优化、控制与学习理论

不确定环境下无人机三维路径规划研究 | 221页

【NeurIPS2025】《LeapFactual：基于条件流匹配的可靠视觉反事实解释》

大语言模型将如何改变军事指挥结构

相关资讯

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】深度学习目标检测概览

【推荐】深度学习目标检测概览

机器学习研究会

10+阅读 · 2017年9月1日

相关论文

Privacy Leakage in Text Classification: A Data Extraction Approach

Arxiv

0+阅读 · 2022年6月9日

Improved two-stage hate speech classification for twitter based on Deep Neural Networks

Arxiv

0+阅读 · 2022年6月8日

Point RCNN: An Angle-Free Framework for Rotated Object Detection

Arxiv

0+阅读 · 2022年6月8日

Fooling Explanations in Text Classifiers

Arxiv

0+阅读 · 2022年6月7日

Deep Learning for UAV-based Object Detection and Tracking: A Survey

Arxiv

64+阅读 · 2021年10月25日

Multi-Object Tracking with Deep Learning Ensemble for Unmanned Aerial System Applications

Arxiv

26+阅读 · 2021年10月5日

Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection

Arxiv

13+阅读 · 2020年12月3日

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

Arxiv

19+阅读 · 2020年11月18日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Order-Free RNN with Visual Attention for Multi-Label Classification

Arxiv

16+阅读 · 2017年12月20日

相关基金

Periostin-avβ3-FAK-PI3K通路在褐藻糖胶抗乳腺癌转移中的作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

长链非编码RNA-HOTAIR在前列腺癌恶性进展及肿瘤干细胞中的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

长链非编码RNA CAR intergenic 10在细胞衰老中的作用和机制

国家自然科学基金

1+阅读 · 2013年12月31日

Caspase-9受凋亡复合体激活起始凋亡的结构生物学研究

国家自然科学基金

0+阅读 · 2012年12月31日

Cocycle动力学和拟周期薛定谔算子的谱

国家自然科学基金

0+阅读 · 2012年12月31日

TRAIL协同IER3调节NF-κB信号通路介导肝癌细胞凋亡的相关机制研究

国家自然科学基金

1+阅读 · 2012年12月31日

HIV-1 Tat蛋白诱导血脑屏障破坏及其作用机制

国家自然科学基金

0+阅读 · 2011年12月31日

肝癌细胞膜蛋白cytokeratin-1用于肝癌在体分子显像和靶向治疗的相关研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于微波光子学信号处理的超快飞秒测距激光雷达

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员