基于传播网的中流噪音环境语音活动探测 (Voice Activity Detection for Transient Noisy Environment Based on Diffusion Nets) - 专知论文

会员服务 ·

0

Performer · 回合 · 模型评估 · Neural Networks · Networking ·

2021 年 6 月 25 日

Voice Activity Detection for Transient Noisy Environment Based on Diffusion Nets

翻译：基于传播网的中流噪音环境语音活动探测

Amir Ivry,Baruch Berdugo,Israel Cohen

from arxiv, Accepted to IEEE journal of selected topics in signal processing 2019

We address voice activity detection in acoustic environments of transients and stationary noises, which often occur in real life scenarios. We exploit unique spatial patterns of speech and non-speech audio frames by independently learning their underlying geometric structure. This process is done through a deep encoder-decoder based neural network architecture. This structure involves an encoder that maps spectral features with temporal information to their low-dimensional representations, which are generated by applying the diffusion maps method. The encoder feeds a decoder that maps the embedded data back into the high-dimensional space. A deep neural network, which is trained to separate speech from non-speech frames, is obtained by concatenating the decoder to the encoder, resembling the known Diffusion nets architecture. Experimental results show enhanced performance compared to competing voice activity detection methods. The improvement is achieved in both accuracy, robustness and generalization ability. Our model performs in a real-time manner and can be integrated into audio-based communication systems. We also present a batch algorithm which obtains an even higher accuracy for off-line applications.

翻译：我们在现实生活中经常出现的瞬态和静止噪音的声学环境中进行语音活动探测,我们通过独立地学习其基本几何结构,利用语言和非声音音框的独特空间空间模式。这一过程是通过一个基于神经网络结构的深编码器进行的。这个结构涉及一个编码器,该编码器将光谱特征与时间信息映射到其低维表示器上,这些显示器是应用扩散地图方法产生的。编码器输入一个解码器,将嵌入的数据映射回高维空间。一个深神经网络,经过培训,可以将语音和非声音框架分开,通过将解码器与已知的Difculp 网结构相融合而获得。实验结果显示,与相互竞争的语音活动探测方法相比,其性能有所提高。在精确度、稳健度和一般化能力方面都取得了改进。我们的模型以实时方式运行,并可以纳入基于声音的通信系统。我们还提出了一个批量算法,它获得了离线应用程序的更精确性。

0

相关内容

Performer

【DeepMind】强化学习教程，83页ppt

【DeepMind】强化学习教程，83页ppt

专知会员服务

158+阅读 · 2020年8月7日

【KDD2020】动态图的拉普拉斯变换点检测，Laplacian Change Point Detection for Dynamic Graphs

【KDD2020】动态图的拉普拉斯变换点检测，Laplacian Change Point Detection for Dynamic Graphs

专知会员服务

38+阅读 · 2020年7月3日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

专知会员服务

59+阅读 · 2020年1月25日

【Google】神经架构搜索（Neural Architecture Search and Beyond），Barret Zoph

【Google】神经架构搜索（Neural Architecture Search and Beyond），Barret Zoph

专知会员服务

31+阅读 · 2019年11月25日

【O'Reilly TensorFlow Conference 2019】恶意软件检测（Generative malware outbreak detection），Sean Park | Trend Micro

【O'Reilly TensorFlow Conference 2019】恶意软件检测（Generative malware outbreak detection），Sean Park | Trend Micro

专知会员服务

15+阅读 · 2019年11月13日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【资源】语音增强资源集锦

【资源】语音增强资源集锦

专知

8+阅读 · 2020年7月4日

视频目标检测：Flow-based

视频目标检测：Flow-based

极市平台

22+阅读 · 2019年5月27日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

时序数据异常检测工具/数据集大列表

时序数据异常检测工具/数据集大列表

极市平台

65+阅读 · 2019年2月23日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

(TensorFlow)实时语义分割比较研究

(TensorFlow)实时语义分割比较研究

机器学习研究会

9+阅读 · 2018年3月12日

【论文推荐】最新6篇生成式对抗网络（GAN）相关论文—半监督对抗学习、行人再识别、代表性特征、高分辨率深度卷积、自监督、超分辨

【论文推荐】最新6篇生成式对抗网络（GAN）相关论文—半监督对抗学习、行人再识别、代表性特征、高分辨率深度卷积、自监督、超分辨

专知

10+阅读 · 2018年2月1日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

【今日新增】IEEE Trans.专刊截稿信息8条

【今日新增】IEEE Trans.专刊截稿信息8条

Call4Papers

7+阅读 · 2017年6月29日

EarGate: Gait-based User Identification with In-ear Microphones

Arxiv

0+阅读 · 2021年8月27日

Passivity preserving model reduction via spectral factorization

Arxiv

0+阅读 · 2021年8月27日

Spatio-Temporal Dynamic Inference Network for Group Activity Recognition

Arxiv

0+阅读 · 2021年8月26日

Vision-based Autonomous Disinfection of High Touch Surfaces in Indoor Environments

Arxiv

0+阅读 · 2021年8月25日

Automated Environmental Monitoring Intelligent System Based on Compact Autonomous Robots for The Sevastopol Bay

Arxiv

1+阅读 · 2021年8月25日

Activity Recognition for Autism Diagnosis

Arxiv

0+阅读 · 2021年8月25日

Monocular Plan View Networks for Autonomous Driving

Monocular Plan View Networks for Autonomous Driving

Arxiv

6+阅读 · 2019年5月16日

Progressive Sparse Local Attention for Video object detection

Arxiv

4+阅读 · 2019年3月21日

Apple Flower Detection using Deep Convolutional Networks

Arxiv

3+阅读 · 2018年9月17日

Video Classification With CNNs: Using The Codec As A Spatio-Temporal Activity Sensor

Arxiv

4+阅读 · 2017年12月19日

VIP会员

文章信息

相关主题

Neural Networks

相关VIP内容

【DeepMind】强化学习教程，83页ppt

【DeepMind】强化学习教程，83页ppt

专知会员服务

158+阅读 · 2020年8月7日

【KDD2020】动态图的拉普拉斯变换点检测，Laplacian Change Point Detection for Dynamic Graphs

【KDD2020】动态图的拉普拉斯变换点检测，Laplacian Change Point Detection for Dynamic Graphs

专知会员服务

38+阅读 · 2020年7月3日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

专知会员服务

59+阅读 · 2020年1月25日

【Google】神经架构搜索（Neural Architecture Search and Beyond），Barret Zoph

【Google】神经架构搜索（Neural Architecture Search and Beyond），Barret Zoph

专知会员服务

31+阅读 · 2019年11月25日

【O'Reilly TensorFlow Conference 2019】恶意软件检测（Generative malware outbreak detection），Sean Park | Trend Micro

【O'Reilly TensorFlow Conference 2019】恶意软件检测（Generative malware outbreak detection），Sean Park | Trend Micro

专知会员服务

15+阅读 · 2019年11月13日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

面向具身智能的多模态数据存储与检索：综述

《算法战争研究计划全景评估》35页

【CMU博士论文】水下三维视觉感知与生成

智能体战争：自主人工智能军备竞赛全景透视

相关资讯

【资源】语音增强资源集锦

【资源】语音增强资源集锦

专知

8+阅读 · 2020年7月4日

视频目标检测：Flow-based

视频目标检测：Flow-based

极市平台

22+阅读 · 2019年5月27日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

时序数据异常检测工具/数据集大列表

时序数据异常检测工具/数据集大列表

极市平台

65+阅读 · 2019年2月23日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

(TensorFlow)实时语义分割比较研究

(TensorFlow)实时语义分割比较研究

机器学习研究会

9+阅读 · 2018年3月12日

【论文推荐】最新6篇生成式对抗网络（GAN）相关论文—半监督对抗学习、行人再识别、代表性特征、高分辨率深度卷积、自监督、超分辨

【论文推荐】最新6篇生成式对抗网络（GAN）相关论文—半监督对抗学习、行人再识别、代表性特征、高分辨率深度卷积、自监督、超分辨

专知

10+阅读 · 2018年2月1日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

【今日新增】IEEE Trans.专刊截稿信息8条

【今日新增】IEEE Trans.专刊截稿信息8条

Call4Papers

7+阅读 · 2017年6月29日

相关论文

EarGate: Gait-based User Identification with In-ear Microphones

Arxiv

0+阅读 · 2021年8月27日

Passivity preserving model reduction via spectral factorization

Arxiv

0+阅读 · 2021年8月27日

Spatio-Temporal Dynamic Inference Network for Group Activity Recognition

Arxiv

0+阅读 · 2021年8月26日

Vision-based Autonomous Disinfection of High Touch Surfaces in Indoor Environments

Arxiv

0+阅读 · 2021年8月25日

Automated Environmental Monitoring Intelligent System Based on Compact Autonomous Robots for The Sevastopol Bay

Arxiv

1+阅读 · 2021年8月25日

Activity Recognition for Autism Diagnosis

Arxiv

0+阅读 · 2021年8月25日

Monocular Plan View Networks for Autonomous Driving

Monocular Plan View Networks for Autonomous Driving

Arxiv

6+阅读 · 2019年5月16日

Progressive Sparse Local Attention for Video object detection

Arxiv

4+阅读 · 2019年3月21日

Apple Flower Detection using Deep Convolutional Networks

Arxiv

3+阅读 · 2018年9月17日

Video Classification With CNNs: Using The Codec As A Spatio-Temporal Activity Sensor

Arxiv

4+阅读 · 2017年12月19日

微信扫码咨询专知VIP会员