提高目标演讲的力度的战略 (Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations) - 专知论文

会员服务 ·

0

稳健性 · Performer · Performance · CASES · 优化器 ·

2022 年 6 月 16 日

Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations

翻译：提高目标演讲的力度的战略

Hiroshi Sato,Tsubasa Ochiai,Marc Delcroix,Keisuke Kinoshita,Takafumi Moriya,Naoki Makishima,Mana Ihori,Tomohiro Tanaka,Ryo Masumura

from arxiv, 5 pages, 2 figures, 3 tables Submitted to Interspeech 2022

Target speech extraction is a technique to extract the target speaker's voice from mixture signals using a pre-recorded enrollment utterance that characterize the voice characteristics of the target speaker. One major difficulty of target speech extraction lies in handling variability in ``intra-speaker'' characteristics, i.e., characteristics mismatch between target speech and an enrollment utterance. While most conventional approaches focus on improving {\it average performance} given a set of enrollment utterances, here we propose to guarantee the {\it worst performance}, which we believe is of great practical importance. In this work, we propose an evaluation metric called worst-enrollment source-to-distortion ratio (SDR) to quantitatively measure the robustness towards enrollment variations. We also introduce a novel training scheme that aims at directly optimizing the worst-case performance by focusing on training with difficult enrollment cases where extraction does not perform well. In addition, we investigate the effectiveness of auxiliary speaker identification loss (SI-loss) as another way to improve robustness over enrollments. Experimental validation reveals the effectiveness of both worst-enrollment target training and SI-loss training to improve robustness against enrollment variations, by increasing speaker discriminability.

翻译：目标语音提取是一种技术,用预先录制的注册语句从混音信号中提取目标演讲者的声音,这是目标演讲者声音特点的特点。目标语音提取的一个主要困难在于处理“内播音者”特征的变异性,即目标演讲和录制语句之间的特征不匹配。虽然大多数常规方法侧重于改进平均性能,但考虑到一系列的注册语句,我们在这里建议保证最差的性能,我们认为这在实际中非常重要。在这项工作中,我们提议了一个评价指标,称为最差的加压源对扭曲率(SDR),以量化衡量对招录变化的稳健性。我们还采用了一个新的培训计划,目的是直接优化最坏的性能,重点是在难于录用的情况下进行培训。此外,我们调查辅助演讲者识别损失(SI-loss)的功效,作为提高招生能力的另一个方法。实验性验证显示最差的升级目标培训和SI-loss培训的有效性,以便通过提高演讲者对可接受性变的稳性。

0

相关内容

稳健性

【CVPR 2022】基于层次化视觉语言知识蒸馏的开放词汇单阶段检测，Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning

【CVPR 2022】基于层次化视觉语言知识蒸馏的开放词汇单阶段检测，Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning

专知会员服务

7+阅读 · 2022年3月19日

【IJCAI2020】统计相关模型，A Complete Characterization of Projectivity for Statistical Relational Models

【IJCAI2020】统计相关模型，A Complete Characterization of Projectivity for Statistical Relational Models

专知会员服务

20+阅读 · 2020年4月25日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

专知

20+阅读 · 2018年6月29日

基于天然产物Aspernigerin的新型几丁质合成抑制剂的设计、合成及生物活性研究

国家自然科学基金

0+阅读 · 2014年12月31日

多层积雪相干散射模型与SAR参数反演

国家自然科学基金

0+阅读 · 2014年12月31日

海洋天然产物Lamellarin D糖基化衍生物的合成与构效关系研究

国家自然科学基金

0+阅读 · 2013年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

具有抗肿瘤活性天然产物Marmycin A的全合成、衍生化及生物活性研究

国家自然科学基金

0+阅读 · 2012年12月31日

壳聚糖-二氧化硅药物缓释涂层的制备及其生物活性

国家自然科学基金

0+阅读 · 2011年12月31日

基于C-PolInSAR和PolInSAR的森林垂直结构参数反演

国家自然科学基金

0+阅读 · 2009年12月31日

X和Ku波段雷达观测在自然积雪场景下的建模与分析

国家自然科学基金

0+阅读 · 2009年12月31日

基于小波有限元的探地雷达正演模拟及偏移处理

国家自然科学基金

0+阅读 · 2008年12月31日

磁性Pickering乳液界面流变学研究

国家自然科学基金

0+阅读 · 2008年12月31日

The Role of Environmental Variations in Evolutionary Robotics: Maximizing Performance and Robustness

The Role of Environmental Variations in Evolutionary Robotics: Maximizing Performance and Robustness

Arxiv

0+阅读 · 2022年8月4日

Is current research on adversarial robustness addressing the right problem?

Arxiv

0+阅读 · 2022年8月4日

Convolutional Fine-Grained Classification with Self-Supervised Target Relation Regularization

Arxiv

0+阅读 · 2022年8月3日

The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift

Arxiv

0+阅读 · 2022年8月3日

Contrastive Triple Extraction with Generative Transformer

Arxiv

13+阅读 · 2021年2月4日

Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

Arxiv

15+阅读 · 2020年12月3日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

Adversarial Multimodal Representation Learning for Click-Through Rate Prediction

Arxiv

23+阅读 · 2020年3月7日

Diverse Image-to-Image Translation via Disentangled Representations

Diverse Image-to-Image Translation via Disentangled Representations

Arxiv

13+阅读 · 2018年8月2日

DSGAN: Generative Adversarial Training for Distant Supervision Relation Extraction

Arxiv

15+阅读 · 2018年5月24日

VIP会员

文章信息

相关主题

相关VIP内容

【CVPR 2022】基于层次化视觉语言知识蒸馏的开放词汇单阶段检测，Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning

【CVPR 2022】基于层次化视觉语言知识蒸馏的开放词汇单阶段检测，Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning

专知会员服务

7+阅读 · 2022年3月19日

【IJCAI2020】统计相关模型，A Complete Characterization of Projectivity for Statistical Relational Models

【IJCAI2020】统计相关模型，A Complete Characterization of Projectivity for Statistical Relational Models

专知会员服务

20+阅读 · 2020年4月25日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《战区安全决策课程体系》最新244页

《"无人机航母"原型平台》

任务规划与地形分析：现代复杂环境作战导航体系

《攻击场景描述形式化模型研究》

相关资讯

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

专知

20+阅读 · 2018年6月29日

相关论文

The Role of Environmental Variations in Evolutionary Robotics: Maximizing Performance and Robustness

The Role of Environmental Variations in Evolutionary Robotics: Maximizing Performance and Robustness

Arxiv

0+阅读 · 2022年8月4日

Is current research on adversarial robustness addressing the right problem?

Arxiv

0+阅读 · 2022年8月4日

Convolutional Fine-Grained Classification with Self-Supervised Target Relation Regularization

Arxiv

0+阅读 · 2022年8月3日

The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift

Arxiv

0+阅读 · 2022年8月3日

Contrastive Triple Extraction with Generative Transformer

Arxiv

13+阅读 · 2021年2月4日

Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

Arxiv

15+阅读 · 2020年12月3日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

Adversarial Multimodal Representation Learning for Click-Through Rate Prediction

Arxiv

23+阅读 · 2020年3月7日

Diverse Image-to-Image Translation via Disentangled Representations

Diverse Image-to-Image Translation via Disentangled Representations

Arxiv

13+阅读 · 2018年8月2日

DSGAN: Generative Adversarial Training for Distant Supervision Relation Extraction

Arxiv

15+阅读 · 2018年5月24日

相关基金

基于天然产物Aspernigerin的新型几丁质合成抑制剂的设计、合成及生物活性研究

国家自然科学基金

0+阅读 · 2014年12月31日

多层积雪相干散射模型与SAR参数反演

国家自然科学基金

0+阅读 · 2014年12月31日

海洋天然产物Lamellarin D糖基化衍生物的合成与构效关系研究

国家自然科学基金

0+阅读 · 2013年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

具有抗肿瘤活性天然产物Marmycin A的全合成、衍生化及生物活性研究

国家自然科学基金

0+阅读 · 2012年12月31日

壳聚糖-二氧化硅药物缓释涂层的制备及其生物活性

国家自然科学基金

0+阅读 · 2011年12月31日

基于C-PolInSAR和PolInSAR的森林垂直结构参数反演

国家自然科学基金

0+阅读 · 2009年12月31日

X和Ku波段雷达观测在自然积雪场景下的建模与分析

国家自然科学基金

0+阅读 · 2009年12月31日

基于小波有限元的探地雷达正演模拟及偏移处理

国家自然科学基金

0+阅读 · 2008年12月31日

磁性Pickering乳液界面流变学研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员