多数据集的变革者培训,以便采取强有力的行动,承认强力行动 (Multi-dataset Training of Transformers for Robust Action Recognition) - 专知论文

会员服务 ·

0

稳健性 · 损失 · INFORMS · AIM · Projection ·

2022 年 9 月 26 日

Multi-dataset Training of Transformers for Robust Action Recognition

翻译：多数据集的变革者培训,以便采取强有力的行动,承认强力行动

Junwei Liang,Enwei Zhang,Jun Zhang,Chunhua Shen

from arxiv, Accepted by NeurIPS 2022. Version 1. Code and models are available at https://github.com/JunweiLiang/MultiTrain

We study the task of robust feature representations, aiming to generalize well on multiple datasets for action recognition. We build our method on Transformers for its efficacy. Although we have witnessed great progress for video action recognition in the past decade, it remains challenging yet valuable how to train a single model that can perform well across multiple datasets. Here, we propose a novel multi-dataset training paradigm, MultiTrain, with the design of two new loss terms, namely informative loss and projection loss, aiming to learn robust representations for action recognition. In particular, the informative loss maximizes the expressiveness of the feature embedding while the projection loss for each dataset mines the intrinsic relations between classes across datasets. We verify the effectiveness of our method on five challenging datasets, Kinetics-400, Kinetics-700, Moments-in-Time, Activitynet and Something-something-v2 datasets. Extensive experimental results show that our method can consistently improve the state-of-the-art performance.

翻译：我们研究了强健的地貌表现任务,旨在全面推广多种数据集,以利行动识别。我们建立了以变异器为主的方法。虽然我们在过去十年里在视频动作识别方面取得了巨大进展,但是,它仍然具有挑战性,但却很有价值,如何训练一个能够跨越多个数据集很好地发挥作用的单一模型。在这里,我们提出了一个新的多数据集培训模式,即多功能培训模式,设计了两个新的损失术语,即信息化损失和投射损失,目的是为行动识别学习强有力的演示。特别是,信息化损失使特性嵌入的清晰度最大化,而每颗数据集的预测损失则使各数据集之间的内在关系跨数据集之间的内在关系最大化。我们核查了我们在五个具有挑战性的数据集、动因-400、动因-700、动因-700、运动-即时、活动网和某样-东西-V2数据集上的方法的有效性。广泛的实验结果表明,我们的方法可以不断改进最新性能。

0

相关内容

稳健性

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

专知会员服务

59+阅读 · 2020年1月25日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

ExBert — 可视化分析Transformer学到的表示

ExBert — 可视化分析Transformer学到的表示

专知会员服务

32+阅读 · 2019年10月16日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Multi-Task Learning的几篇综述文章

Multi-Task Learning的几篇综述文章

深度学习自然语言处理

15+阅读 · 2020年6月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

重载车辆ECAS/CTIS集成系统耦合机理及主动控制研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于柚子型光子晶体光纤长周期光栅的痕量TNT爆炸物探测原理与方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

IMPDH为靶点的小分子抑制剂的设计、合成及活性研究

国家自然科学基金

0+阅读 · 2012年12月31日

语音识别中的稀疏性深度学习

国家自然科学基金

11+阅读 · 2012年12月31日

LIMK1：罗格列酮抑制人胃癌细胞增殖、迁移及侵袭的作用靶点

国家自然科学基金

0+阅读 · 2012年12月31日

高速撞击流反应器内流动和微观混合特性的研究

国家自然科学基金

0+阅读 · 2012年12月31日

乙肝病毒诱导FoxC1表达在促进肝癌转移中的机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

野菊花中萜类和黄酮类化合物抗乙肝病毒活性协同作用机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

适应纳米尺度CMOS集成电路DFM的ULTRA模型完善和偏差模拟技术研究

国家自然科学基金

0+阅读 · 2009年12月31日

基于EcR/USP结构的新型IGRs先导化合物的生物合理设计

国家自然科学基金

0+阅读 · 2008年12月31日

State-of-the-art Models for Object Detection in Various Fields of Application

Arxiv

0+阅读 · 2022年11月1日

Speech-text based multi-modal training with bidirectional attention for improved speech recognition

Arxiv

0+阅读 · 2022年11月1日

HDNet: Hierarchical Dynamic Network for Gait Recognition using Millimeter-Wave Radar

Arxiv

0+阅读 · 2022年11月1日

Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition

Arxiv

0+阅读 · 2022年10月31日

Transformer-Based Video Front-Ends for Audio-Visual Speech Recognition for Single and Multi-Person Video

Arxiv

0+阅读 · 2022年10月31日

ViGAT: Bottom-up event recognition and explanation in video using factorized graph attention network

Arxiv

0+阅读 · 2022年10月31日

A Survey on Multi-Task Learning

Arxiv

31+阅读 · 2021年3月29日

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Arxiv

13+阅读 · 2021年1月5日

Pretrained Transformers for Text Ranking: BERT and Beyond

Arxiv

28+阅读 · 2020年10月13日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

VIP会员

文章信息

相关主题

相关VIP内容

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

专知会员服务

59+阅读 · 2020年1月25日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

ExBert — 可视化分析Transformer学到的表示

ExBert — 可视化分析Transformer学到的表示

专知会员服务

32+阅读 · 2019年10月16日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】基础模型训练中网络规模数据的负责任与高效使用

《俄乌战争背景下俄罗斯的战略性海军分析（2022-2025年）》最新100页报告

人工智能时代背景下的未来海战

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Multi-Task Learning的几篇综述文章

Multi-Task Learning的几篇综述文章

深度学习自然语言处理

15+阅读 · 2020年6月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

相关论文

State-of-the-art Models for Object Detection in Various Fields of Application

Arxiv

0+阅读 · 2022年11月1日

Speech-text based multi-modal training with bidirectional attention for improved speech recognition

Arxiv

0+阅读 · 2022年11月1日

HDNet: Hierarchical Dynamic Network for Gait Recognition using Millimeter-Wave Radar

Arxiv

0+阅读 · 2022年11月1日

Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition

Arxiv

0+阅读 · 2022年10月31日

Transformer-Based Video Front-Ends for Audio-Visual Speech Recognition for Single and Multi-Person Video

Arxiv

0+阅读 · 2022年10月31日

ViGAT: Bottom-up event recognition and explanation in video using factorized graph attention network

Arxiv

0+阅读 · 2022年10月31日

A Survey on Multi-Task Learning

Arxiv

31+阅读 · 2021年3月29日

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Arxiv

13+阅读 · 2021年1月5日

Pretrained Transformers for Text Ranking: BERT and Beyond

Arxiv

28+阅读 · 2020年10月13日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

相关基金

重载车辆ECAS/CTIS集成系统耦合机理及主动控制研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于柚子型光子晶体光纤长周期光栅的痕量TNT爆炸物探测原理与方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

IMPDH为靶点的小分子抑制剂的设计、合成及活性研究

国家自然科学基金

0+阅读 · 2012年12月31日

语音识别中的稀疏性深度学习

国家自然科学基金

11+阅读 · 2012年12月31日

LIMK1：罗格列酮抑制人胃癌细胞增殖、迁移及侵袭的作用靶点

国家自然科学基金

0+阅读 · 2012年12月31日

高速撞击流反应器内流动和微观混合特性的研究

国家自然科学基金

0+阅读 · 2012年12月31日

乙肝病毒诱导FoxC1表达在促进肝癌转移中的机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

野菊花中萜类和黄酮类化合物抗乙肝病毒活性协同作用机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

适应纳米尺度CMOS集成电路DFM的ULTRA模型完善和偏差模拟技术研究

国家自然科学基金

0+阅读 · 2009年12月31日

基于EcR/USP结构的新型IGRs先导化合物的生物合理设计

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员