In this technical report, we introduce our updates to YOWO, a real-time method for spatio-temporal action detection. We make a handful of minor design changes to improve it. For the network structure, we use the same networks as the official YOWO implementation, namely 3D-ResNext-101 and YOLOv2, but we load a better pretrained weight from our reimplemented YOLOv2, which outperforms the official YOLOv2. We also optimize the label assignment used in YOWO. To detect action instances more accurately, we deploy the GIoU loss for box regression. With these incremental improvements, YOWO achieves 84.9\% frame mAP and 50.5\% video mAP on UCF101-24, significantly higher than the official YOWO. On AVA, our optimized YOWO achieves 20.6\% frame mAP with 16 input frames, also exceeding the official YOWO. With 32 input frames, our YOWO achieves 21.6\% frame mAP at 25 FPS on an RTX 3090 GPU. We name the optimized YOWO YOWO-Plus. Moreover, we replace 3D-ResNext-101 with the efficient 3D-ShuffleNet-v2 to design a lightweight action detector, YOWO-Nano. YOWO-Nano achieves 81.0\% frame mAP and 49.7\% video mAP at over 90 FPS on UCF101-24. It also achieves 18.4\% frame mAP at about 90 FPS on AVA. As far as we know, YOWO-Nano is the fastest state-of-the-art action detector. Our code is available at https://github.com/yjh0410/PyTorch_YOWO.
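For readers unfamiliar with the GIoU loss mentioned above, the following is a minimal PyTorch sketch of it, not code taken from the YOWO-Plus repository. It assumes axis-aligned boxes in $(x_1, y_1, x_2, y_2)$ format; the function name and signature are illustrative.

\begin{verbatim}
import torch

def giou_loss(pred, target, eps=1e-7):
    """GIoU loss for boxes in (x1, y1, x2, y2) format.

    pred, target: float tensors of shape [N, 4].
    Returns the per-box loss, shape [N].
    """
    # Intersection rectangle of each predicted/target pair.
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)

    # Union area of the two boxes.
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter
    iou = inter / (union + eps)

    # Smallest box enclosing both; GIoU penalizes its empty area,
    # so the loss still gives a gradient when the boxes do not overlap.
    ex1 = torch.min(pred[:, 0], target[:, 0])
    ey1 = torch.min(pred[:, 1], target[:, 1])
    ex2 = torch.max(pred[:, 2], target[:, 2])
    ey2 = torch.max(pred[:, 3], target[:, 3])
    enclose = (ex2 - ex1) * (ey2 - ey1)

    giou = iou - (enclose - union) / (enclose + eps)
    return 1.0 - giou
\end{verbatim}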