Streaming Noise Context Aware Enhancement For Automatic Speech Recognition in Multi-Talker Environments


May 17, 2022

Joe Caroselli,Arun Narayanan,Yiteng Huang

From arXiv; submitted to IWAENC 2022

One of the most challenging scenarios for smart speakers is the multi-talker case, in which target speech from the desired speaker is mixed with interfering speech from one or more other speakers. A smart assistant needs to determine which voice to recognize and which to ignore, and it needs to do so in a streaming, low-latency manner. This work presents two multi-microphone speech enhancement algorithms targeted at this scenario. Targeting on-device use cases, we assume that the algorithm has access to the signal before the hotword, which is referred to as the noise context. The first is the Context Aware Beamformer, which uses the noise context and the detected hotword to determine how to target the desired speaker. The second is an adaptive noise cancellation algorithm called Speech Cleaner, which trains a filter using the noise context. It is demonstrated that the two algorithms are complementary in the signal-to-noise ratio (SNR) conditions under which they work well. We also propose an algorithm to select which one to use based on the estimated SNR. When using 3 microphone channels, the final system achieves a relative word error rate reduction of 55% at -12 dB and 43% at 12 dB.
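The abstract gives no implementation details, so the following is only a toy sketch of the general ideas it names, not the authors' Context Aware Beamformer or Speech Cleaner. It assumes two time-domain channels, a one-tap least-squares noise canceller fitted on the pre-hotword noise context and then frozen, a crude power-based SNR estimate, and a threshold-based selector. All function names, the threshold value, and the labels `enhancer_a` and `enhancer_b` are hypothetical; the abstract does not say which algorithm suits which SNR regime.

```python
import math

def mean_power(x):
    """Average power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def estimate_snr_db(noise_context, query):
    """Crude SNR estimate. Assumes the noise context is noise only and
    that the noise power stays the same during the query."""
    p_noise = max(mean_power(noise_context), 1e-12)
    p_speech = max(mean_power(query) - p_noise, 1e-12)
    return 10.0 * math.log10(p_speech / p_noise)

def train_canceller(primary_ctx, reference_ctx):
    """One-tap least-squares filter w minimizing
    sum((primary - w * reference)^2) over the noise context."""
    num = sum(p * r for p, r in zip(primary_ctx, reference_ctx))
    den = sum(r * r for r in reference_ctx)
    return num / max(den, 1e-12)

def apply_canceller(primary, reference, w):
    """Subtract the predicted noise from the primary channel."""
    return [p - w * r for p, r in zip(primary, reference)]

def select_enhancer(snr_db, threshold_db, below, above):
    """Pick one enhancer label based on the estimated SNR.
    The threshold and the mapping are placeholders."""
    return below if snr_db < threshold_db else above

# Toy signals (hypothetical): the interferer reaches the primary
# microphone at half the amplitude seen by the reference microphone,
# and the target speech appears only in the primary channel.
N_CTX, N_QUERY = 200, 200
reference = [math.sin(0.3 * n) for n in range(N_CTX + N_QUERY)]
target = [0.2 * math.sin(0.05 * n) for n in range(N_CTX, N_CTX + N_QUERY)]
primary_ctx = [0.5 * r for r in reference[:N_CTX]]
primary_query = [0.5 * r + s for r, s in zip(reference[N_CTX:], target)]

# Fit the filter on the noise context only, then apply it to the query.
w = train_canceller(primary_ctx, reference[:N_CTX])
cleaned = apply_canceller(primary_query, reference[N_CTX:], w)

snr_db = estimate_snr_db(primary_ctx, primary_query)
choice = select_enhancer(snr_db, threshold_db=0.0,
                         below="enhancer_a", above="enhancer_b")
```

In this idealized setup the fitted weight recovers the 0.5 coupling and the cleaned signal matches the target, which real recordings would not allow: a practical system would likely use multi-tap, per-frequency filters and account for target speech leaking into every microphone.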
