18 November 2022

AVATAR submission to the Ego4D AV Transcription Challenge

Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid

In this report, we describe our submission to the Ego4D AudioVisual (AV) Speech Transcription Challenge 2022. Our pipeline is based on AVATAR, a state-of-the-art encoder-decoder model for audiovisual automatic speech recognition (AV-ASR) that performs early fusion of spectrograms and RGB images. We describe the datasets, experimental settings, and ablations. Our final method achieves a word error rate (WER) of 68.40 on the challenge test set, outperforming the baseline by 43.7% and winning the challenge.
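
For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a hypothesis, divided by the number of reference words and expressed as a percentage. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration. The challenge's official scoring script and any text normalization it applies are not described in the abstract and may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent, using whitespace tokenization (an assumption)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over six words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors but do not enlarge the denominator, a WER computed this way can exceed 100 when the hypothesis contains many spurious words.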
