多渠道目标多渠道目标议长的提炼和精炼:WavLab提交第二次提高清晰度的挑战 (Multi-Channel Target Speaker Extraction with Refinement: The WavLab Submission to the Second Clarity Enhancement Challenge) - 专知论文

会员服务 ·

0

开发集 · 语音增强 · 情景 · MoDELS · state-of-the-art ·

2023 年 2 月 15 日

Multi-Channel Target Speaker Extraction with Refinement: The WavLab Submission to the Second Clarity Enhancement Challenge

翻译：多渠道目标多渠道目标议长的提炼和精炼:WavLab提交第二次提高清晰度的挑战

Samuele Cornell,Zhong-Qiu Wang,Yoshiki Masuyama,Shinji Watanabe,Manuel Pariente,Nobutaka Ono

This paper describes our submission to the Second Clarity Enhancement Challenge (CEC2), which consists of target speech enhancement for hearing-aid (HA) devices in noisy-reverberant environments with multiple interferers such as music and competing speakers. Our approach builds upon the powerful iterative neural/beamforming enhancement (iNeuBe) framework introduced in our recent work, and this paper extends it for target speaker extraction. We therefore name the proposed approach as iNeuBe-X, where the X stands for extraction. To address the challenges encountered in the CEC2 setting, we introduce four major novelties: (1) we extend the state-of-the-art TF-GridNet model, originally designed for monaural speaker separation, for multi-channel, causal speech enhancement, and large improvements are observed by replacing the TCNDenseNet used in iNeuBe with this new architecture; (2) we leverage a recent dual window size approach with future-frame prediction to ensure that iNueBe-X satisfies the 5 ms constraint on algorithmic latency required by CEC2; (3) we introduce a novel speaker-conditioning branch for TF-GridNet to achieve target speaker extraction; (4) we propose a fine-tuning step, where we compute an additional loss with respect to the target speaker signal compensated with the listener audiogram. Without using external data, on the official development set our best model reaches a hearing-aid speech perception index (HASPI) score of 0.942 and a scale-invariant signal-to-distortion ratio improvement (SI-SDRi) of 18.8 dB. These results are promising given the fact that the CEC2 data is extremely challenging (e.g., on the development set the mixture SI-SDR is -12.3 dB). A demo of our submitted system is available at WAVLab CEC2 demo.

翻译：本文介绍我们提交第二提高清晰度挑战(CEC2)的情况,该挑战包括:为听力援助(HA)设备在噪音反动环境中增强语音装置的目标,包括音乐和相互竞争的演讲者等多个干扰器。我们的方法以我们最近工作中引入的强大的迭代神经/波形增强(iNeube)框架为基础,本文将其扩展为目标扬声器提取。因此,我们将拟议方法命名为iNeuBe-X,其中X代表提取。为了应对CEC2设置中遇到的挑战,我们推出了四大新颖:(1) 我们扩展了最先进的TF-GridNet模型,该模型最初设计用于音频分离,用于多频道、因果语音增强和大幅改进,通过替换iNeueube中使用的TCNDENet;(2) 我们利用最近的双窗口规模方法来进行未来框架预测,以确保 iNueB-X能够满足CEC2所要求的对算法变异度的5 ms 限制;(3) 我们引入了一个新的演讲者-CR-DR-DR 目标升级部门,其中显示我们最新的音频变换的SDRADR数据。

0

相关内容

开发集

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【干货书】机器学习设计模式，408页pdf，Machine Learning Design Patterns

【干货书】机器学习设计模式，408页pdf，Machine Learning Design Patterns

专知会员服务

138+阅读 · 2022年2月6日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

征稿 | International Joint Conference on Knowledge Graphs (IJCKG)

征稿 | International Joint Conference on Knowledge Graphs (IJCKG)

开放知识图谱

2+阅读 · 2022年5月20日

全球首个GNN为主的AI创业公司，募资$18.5 million！

全球首个GNN为主的AI创业公司，募资$18.5 million！

图与推荐

1+阅读 · 2022年4月16日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Reinforcement Learning: An Introduction 2018第二版 500页

Reinforcement Learning: An Introduction 2018第二版 500页

CreateAMind

14+阅读 · 2018年4月27日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

SIRT1介导的Resveratrol对糖尿病视网膜病变“代谢记忆”的作用及其机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

三维编织复合材料风扇叶片高速冲击下的力学行为研究

国家自然科学基金

0+阅读 · 2014年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

Actinophyllic Acid类含七元环的复杂多环活性天然产物全合成研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于BIM的建筑生命周期环境与经济评价及优化设计方法研究

国家自然科学基金

3+阅读 · 2014年12月31日

水稻亚种间新合成四倍体早期世代基因组变异

国家自然科学基金

0+阅读 · 2013年12月31日

HIV-1逆转录酶/整合酶双重抑制剂DKA-DAPYs的分子设计、合成及抗HIV活性研究

国家自然科学基金

0+阅读 · 2013年12月31日

Cocycle动力学和拟周期薛定谔算子的谱

国家自然科学基金

0+阅读 · 2012年12月31日

Cystatin B缺失与Prion疾病自噬作用机制的研究

国家自然科学基金

0+阅读 · 2011年12月31日

MAPK信号转导通路在PCB-153和p,p'-DDE致下丘脑-垂体-甲状腺轴功能紊乱中作用的分子机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

A Novel Channel Model for Reconfigurable Intelligent Surfaces with Consideration of Polarization and Switch Impairments

A Novel Channel Model for Reconfigurable Intelligent Surfaces with Consideration of Polarization and Switch Impairments

Arxiv

0+阅读 · 2023年4月7日

Large-Scale Regional Traffic Signal Control Using Dynamic Deep Reinforcement Learning

Arxiv

0+阅读 · 2023年4月7日

On the Limits of Cross-Authentication Checks for GNSS Signals

Arxiv

0+阅读 · 2023年4月6日

Both Efficiency and Effectiveness! A Large Scale Pre-ranking Framework in Search System

Arxiv

0+阅读 · 2023年4月5日

Relative Entropy-Based Waveform Optimization for Rician Target Detection with Dual-Function Radar Communication Systems

Arxiv

0+阅读 · 2023年4月5日

Adaptive Enhancement of Extreme Low-Light Images

Arxiv

0+阅读 · 2023年4月4日

Form-NLU: Dataset for the Form Language Understanding

Arxiv

0+阅读 · 2023年4月4日

Efficient human-in-loop deep learning model training with iterative refinement and statistical result validation

Arxiv

0+阅读 · 2023年4月3日

Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction

Arxiv

0+阅读 · 2023年4月3日

Event Extraction with Generative Adversarial Imitation Learning

Arxiv

13+阅读 · 2018年4月21日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【干货书】机器学习设计模式，408页pdf，Machine Learning Design Patterns

【干货书】机器学习设计模式，408页pdf，Machine Learning Design Patterns

专知会员服务

138+阅读 · 2022年2月6日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

美军小型无人机项目

无人机蜂群——作为执行非常规战争的创新工具 | 2025最新文献

不确定环境下无人机与无人地面车辆编队的地下勘探规划算法 | 122页

接纳无人机多样性：西方军事在无人机战争中适应的五个挑战 | 28页报告

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

征稿 | International Joint Conference on Knowledge Graphs (IJCKG)

征稿 | International Joint Conference on Knowledge Graphs (IJCKG)

开放知识图谱

2+阅读 · 2022年5月20日

全球首个GNN为主的AI创业公司，募资$18.5 million！

全球首个GNN为主的AI创业公司，募资$18.5 million！

图与推荐

1+阅读 · 2022年4月16日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Reinforcement Learning: An Introduction 2018第二版 500页

Reinforcement Learning: An Introduction 2018第二版 500页

CreateAMind

14+阅读 · 2018年4月27日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

相关论文

A Novel Channel Model for Reconfigurable Intelligent Surfaces with Consideration of Polarization and Switch Impairments

A Novel Channel Model for Reconfigurable Intelligent Surfaces with Consideration of Polarization and Switch Impairments

Arxiv

0+阅读 · 2023年4月7日

Large-Scale Regional Traffic Signal Control Using Dynamic Deep Reinforcement Learning

Arxiv

0+阅读 · 2023年4月7日

On the Limits of Cross-Authentication Checks for GNSS Signals

Arxiv

0+阅读 · 2023年4月6日

Both Efficiency and Effectiveness! A Large Scale Pre-ranking Framework in Search System

Arxiv

0+阅读 · 2023年4月5日

Relative Entropy-Based Waveform Optimization for Rician Target Detection with Dual-Function Radar Communication Systems

Arxiv

0+阅读 · 2023年4月5日

Adaptive Enhancement of Extreme Low-Light Images

Arxiv

0+阅读 · 2023年4月4日

Form-NLU: Dataset for the Form Language Understanding

Arxiv

0+阅读 · 2023年4月4日

Efficient human-in-loop deep learning model training with iterative refinement and statistical result validation

Arxiv

0+阅读 · 2023年4月3日

Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction

Arxiv

0+阅读 · 2023年4月3日

Event Extraction with Generative Adversarial Imitation Learning

Arxiv

13+阅读 · 2018年4月21日

相关基金

SIRT1介导的Resveratrol对糖尿病视网膜病变“代谢记忆”的作用及其机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

三维编织复合材料风扇叶片高速冲击下的力学行为研究

国家自然科学基金

0+阅读 · 2014年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

Actinophyllic Acid类含七元环的复杂多环活性天然产物全合成研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于BIM的建筑生命周期环境与经济评价及优化设计方法研究

国家自然科学基金

3+阅读 · 2014年12月31日

水稻亚种间新合成四倍体早期世代基因组变异

国家自然科学基金

0+阅读 · 2013年12月31日

HIV-1逆转录酶/整合酶双重抑制剂DKA-DAPYs的分子设计、合成及抗HIV活性研究

国家自然科学基金

0+阅读 · 2013年12月31日

Cocycle动力学和拟周期薛定谔算子的谱

国家自然科学基金

0+阅读 · 2012年12月31日

Cystatin B缺失与Prion疾病自噬作用机制的研究

国家自然科学基金

0+阅读 · 2011年12月31日

MAPK信号转导通路在PCB-153和p,p'-DDE致下丘脑-垂体-甲状腺轴功能紊乱中作用的分子机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员