E^2VTS: Energy-Efficient Video Text Spotting from Unmanned Aerial Vehicles - 专知论文

June 5, 2022

E^2VTS: Energy-Efficient Video Text Spotting from Unmanned Aerial Vehicles

Zhenyu Hu, Zhenyu Wu, Pengcheng Pi, Yunhe Xue, Jiayi Shen, Jianchao Tan, Xiangru Lian, Zhangyang Wang, Ji Liu

Unmanned Aerial Vehicle (UAV) based video text spotting has been extensively used in civil and military domains. The limited battery capacity of UAVs motivates us to develop an energy-efficient video text spotting solution. In this paper, we first revisit RCNN's crop & resize training strategy and empirically find that it outperforms aligned RoI sampling on a real-world video text dataset captured by UAV. To reduce energy consumption, we further propose a multi-stage image processor that takes videos' redundancy, continuity, and mixed degradation into account. Lastly, the model is pruned and quantized before being deployed on a Raspberry Pi. Our proposed energy-efficient video text spotting solution, dubbed E^2VTS, outperforms all previous methods by achieving a competitive tradeoff between energy efficiency and performance. All our code and pre-trained models are available at https://github.com/wuzhenyusjtu/LPCVC20-VideoTextSpotting.
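The abstract does not say how the multi-stage image processor handles redundancy. As a minimal sketch of the general idea of using frame redundancy and continuity to save compute, the hypothetical filter below skips frames that barely differ from the last frame that was kept. The function names, the flattened-grayscale frame representation, and the threshold value are illustrative assumptions and are not taken from the E^2VTS code.

```python
from typing import Iterable, List, Optional, Sequence

# Assumed representation: one frame = flattened grayscale pixels in 0..255.
Frame = Sequence[int]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two equally sized frames."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_keyframes(frames: Iterable[Frame], threshold: float = 10.0) -> List[int]:
    """Return indices of frames worth passing to the (expensive) text spotter.

    The first frame is always kept. A later frame is kept only if it differs
    from the most recently kept frame by at least `threshold` on average;
    otherwise it is treated as redundant and skipped.
    """
    kept: List[int] = []
    last_kept: Optional[Frame] = None
    for index, frame in enumerate(frames):
        if last_kept is None or mean_abs_diff(frame, last_kept) >= threshold:
            kept.append(index)
            last_kept = frame
    return kept


if __name__ == "__main__":
    video = [
        [0, 0, 0, 0],
        [1, 0, 0, 0],          # nearly identical to frame 0
        [50, 50, 50, 50],      # scene changed
        [52, 50, 50, 50],      # nearly identical to frame 2
        [200, 200, 200, 200],  # scene changed again
    ]
    print(select_keyframes(video, threshold=10.0))  # [0, 2, 4]
```

Comparing against the last kept frame, rather than the immediately preceding one, prevents slow drift from accumulating unnoticed. A real pipeline would likely use a cheaper or more robust similarity measure and would also need to handle degraded frames, which this sketch ignores.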
