It is now well established from a variety of studies that combining video and audio data yields a significant benefit in detecting active speakers. However, either modality can mislead audiovisual fusion by contributing unreliable or deceptive information. This paper formulates active speaker detection as a multi-objective learning problem that leverages the best of each modality through a novel self-attention, uncertainty-based multimodal fusion scheme. Results show that the proposed multi-objective learning architecture outperforms traditional approaches on both mAP and AUC scores. We further demonstrate that, for active speaker detection, our fusion strategy surpasses other modality fusion methods reported across various disciplines. Finally, we show that the proposed method significantly improves the state of the art on the AVA-ActiveSpeaker dataset.
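The core idea behind uncertainty-based fusion can be illustrated with a minimal sketch: each modality predicts an uncertainty alongside its embedding, and the fusion weights are derived so that the noisier modality contributes less. Note that the function name, the use of log-variance as the uncertainty measure, and the softmax weighting below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def uncertainty_weighted_fusion(audio_feat, video_feat,
                                audio_logvar, video_logvar):
    """Fuse two modality embeddings, down-weighting the more uncertain one.

    Illustrative sketch only: each modality supplies a scalar log-variance
    (its estimated uncertainty); fusion weights are the softmax over the
    negative uncertainties, so a noisy modality contributes less.
    """
    weights = softmax(np.stack([-audio_logvar, -video_logvar]), axis=0)
    return weights[0] * audio_feat + weights[1] * video_feat

# Toy example: video is confident (low log-variance), audio is noisy,
# so the fused embedding ends up closer to the video embedding.
audio = np.ones(4)
video = np.full(4, 2.0)
fused = uncertainty_weighted_fusion(audio, video,
                                    audio_logvar=2.0, video_logvar=0.0)
```

In this toy setting the audio weight is roughly 0.12 and the video weight roughly 0.88, so the fused vector lies much nearer the confident video embedding; a hard gate (picking one modality outright) would instead discard the audio signal entirely.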