成本效率高的终端到终端文本显示器动态低分辨率低分辨率蒸馏 (Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting) - 专知论文

会员服务 ·

0

蒸馏 · 端到端 · Extensibility · 模型评估 · 优化器 ·

2022 年 7 月 15 日

Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting

翻译：成本效率高的终端到终端文本显示器动态低分辨率低分辨率蒸馏

Ying Chen,Liang Qiao,Zhanzhan Cheng,Shiliang Pu,Yi Niu,Xi Li

from arxiv, Accept by ECCV2022

End-to-end text spotting has attached great attention recently due to its benefits on global optimization and high maintainability for real applications. However, the input scale has always been a tough trade-off since recognizing a small text instance usually requires enlarging the whole image, which brings high computational costs. In this paper, to address this problem, we propose a novel cost-efficient Dynamic Low-resolution Distillation (DLD) text spotting framework, which aims to infer images in different small but recognizable resolutions and achieve a better balance between accuracy and efficiency. Concretely, we adopt a resolution selector to dynamically decide the input resolutions for different images, which is constraint by both inference accuracy and computational cost. Another sequential knowledge distillation strategy is conducted on the text recognition branch, making the low-res input obtains comparable performance to a high-res image. The proposed method can be optimized end-to-end and adopted in any current text spotting framework to improve the practicability. Extensive experiments on several text spotting benchmarks show that the proposed method vastly improves the usability of low-res models. The code is available at https://github.com/hikopensource/DAVAR-Lab-OCR/.

翻译：最近,由于输入比例表对全球优化的好处和真实应用的高度维护性的好处,对端对端文本定位最近给予了极大关注。然而,输入比例表始终是一个艰难的权衡,因为承认一个小文本实例通常要求扩大整个图像,从而带来很高的计算成本。在本文件中,为了解决这一问题,我们提议了一个新的成本效率高的动态低分辨率蒸馏(DLD)文本定位框架,目的是从不同的小但可识别的分辨率中推断图像,并在准确性和效率之间实现更好的平衡。具体地说,我们通过一个分辨率选择器,以动态方式决定不同图像的输入分辨率,这是由推断准确性和计算成本两者制约的。另一个顺序知识蒸馏战略是在文本识别分支上进行,使低文本输入获得与高分辨率图像相似的性能。拟议方法可以优化端对端,并在任何现有文本定位框架中采用,以提高实用性。若干文本标识基准的广泛实验表明,拟议的方法大大改进了低可读性模型的易读性。该代码可在 http://L/CRUB/AURSouras/AVAR/CRIFF O上查阅。

0

相关内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

SAR影像古遗址自动检测方法研究

国家自然科学基金

4+阅读 · 2015年12月31日

基于相干自适应光学的深层双光子动态荧光寿命成像研究

国家自然科学基金

0+阅读 · 2015年12月31日

Intraflagellar Transport运输纤毛蛋白的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

用于太阳能电池的有序金属/介质纳米结构高效光吸收器的研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于地基激光雷达的全生育期玉米叶面积指数提取方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

函数域中的Vinogradov中值定理

国家自然科学基金

0+阅读 · 2012年12月31日

猪札幌病毒受体鉴定研究

国家自然科学基金

0+阅读 · 2012年12月31日

微合金钢连铸过程铸坯表层原始奥氏体晶粒细化研究

国家自然科学基金

0+阅读 · 2011年12月31日

周期素DNA甲基化调控对高同型半胱氨酸血症大鼠内皮祖细胞功能的影响

国家自然科学基金

0+阅读 · 2009年12月31日

Legumain在乳腺癌骨转移和破骨损伤过程中的作用机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

Enhancing Pre-trained Models with Text Structure Knowledge for Question Generation

Arxiv

0+阅读 · 2022年9月9日

Multi-Granularity Prediction for Scene Text Recognition

Arxiv

0+阅读 · 2022年9月8日

Multimodal Speech Enhancement Using Burst Propagation

Arxiv

0+阅读 · 2022年9月7日

Efficient Transformers: A Survey

Arxiv

35+阅读 · 2022年3月14日

A Survey of Quantization Methods for Efficient Neural Network Inference

Arxiv

22+阅读 · 2021年6月21日

Dynamic Neural Networks: A Survey

Arxiv

37+阅读 · 2021年2月10日

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Arxiv

13+阅读 · 2021年1月5日

Text Generation from Knowledge Graphs with Graph Transformers

Arxiv

35+阅读 · 2019年4月4日

Dynamic Zoom-in Network for Fast Object Detection in Large Images

Arxiv

20+阅读 · 2018年3月27日

VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions

Arxiv

17+阅读 · 2018年3月20日

VIP会员

文章信息

相关主题

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

大语言模型时代的文档智能：综述

蜂窝通信是否是无人机与无人地面战车主宰战场的关键？

文档视觉问答简述

最新新Agent综述！76页327篇论文梳理，北交大桑基韬教授团队发布《迈向模型原生智能体式人工智能的范式转变综述》

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

相关论文

Enhancing Pre-trained Models with Text Structure Knowledge for Question Generation

Arxiv

0+阅读 · 2022年9月9日

Multi-Granularity Prediction for Scene Text Recognition

Arxiv

0+阅读 · 2022年9月8日

Multimodal Speech Enhancement Using Burst Propagation

Arxiv

0+阅读 · 2022年9月7日

Efficient Transformers: A Survey

Arxiv

35+阅读 · 2022年3月14日

A Survey of Quantization Methods for Efficient Neural Network Inference

Arxiv

22+阅读 · 2021年6月21日

Dynamic Neural Networks: A Survey

Arxiv

37+阅读 · 2021年2月10日

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Arxiv

13+阅读 · 2021年1月5日

Text Generation from Knowledge Graphs with Graph Transformers

Arxiv

35+阅读 · 2019年4月4日

Dynamic Zoom-in Network for Fast Object Detection in Large Images

Arxiv

20+阅读 · 2018年3月27日

VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions

Arxiv

17+阅读 · 2018年3月20日

相关基金

SAR影像古遗址自动检测方法研究

国家自然科学基金

4+阅读 · 2015年12月31日

基于相干自适应光学的深层双光子动态荧光寿命成像研究

国家自然科学基金

0+阅读 · 2015年12月31日

Intraflagellar Transport运输纤毛蛋白的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

用于太阳能电池的有序金属/介质纳米结构高效光吸收器的研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于地基激光雷达的全生育期玉米叶面积指数提取方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

函数域中的Vinogradov中值定理

国家自然科学基金

0+阅读 · 2012年12月31日

猪札幌病毒受体鉴定研究

国家自然科学基金

0+阅读 · 2012年12月31日

微合金钢连铸过程铸坯表层原始奥氏体晶粒细化研究

国家自然科学基金

0+阅读 · 2011年12月31日

周期素DNA甲基化调控对高同型半胱氨酸血症大鼠内皮祖细胞功能的影响

国家自然科学基金

0+阅读 · 2009年12月31日

Legumain在乳腺癌骨转移和破骨损伤过程中的作用机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员