De-STT: De-entaglement of unwanted Nuisances and Biases in Speech to Text System using Adversarial Forgetting


January 30, 2021


Hemant Yadav, Janvijay Singh, Atul Anshuman Singh, Rachit Mittal, Rajiv Ratn Shah

From arXiv; 7 pages, 2 figures, 3 tables

Training a robust Speech to Text (STT) system requires tens of thousands of hours of data. Variability in the dataset, such as unwanted nuisances (environmental noise, etc.) and biases (accent, gender, age, etc.), is one reason large datasets are needed to learn general representations, and collecting data at that scale is often not feasible for low-resource languages. In many computer vision tasks, a recently proposed adversarial forgetting approach for removing unwanted features has produced good results. This motivates us to study the effect of disentangling accent information from the input speech signal while training STT systems. To this end, we use an information bottleneck architecture based on adversarial forgetting. This training scheme aims to force the model to learn general, accent-invariant speech representations. Two STT models trained on just 20 hours of audio, one with and one without adversarial forgetting, are tested on two unseen accents not present in the training set. The results favour the adversarial forgetting scheme, with an average absolute improvement of 6% over the standard training scheme. Furthermore, we observe an absolute improvement of 5.5% when testing on the seen accents present in the training set.
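The abstract does not spell out the architecture, so the following is only a minimal sketch of how an adversarial forgetting objective is commonly composed, not the authors' implementation. It assumes an encoder, a forget gate that produces a mask in (0, 1), a decoder that reconstructs the input from the unmasked embedding, a task predictor, and an accent discriminator that both read the masked embedding. All names (`make_params`, `forward`, `rho`, `delta`), the layer sizes, and the single-token classification loss standing in for a real STT loss are hypothetical.

```python
import math
import random


def linear(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def softmax(v):
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target):
    return -math.log(probs[target])


def init(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def make_params(rng, n_in, n_hid, n_tokens, n_accents):
    """Hypothetical single-layer stand-ins for each sub-network."""
    return {
        "encoder": init(n_hid, n_in, rng),
        "forget": init(n_hid, n_in, rng),
        "decoder": init(n_in, n_hid, rng),
        "predictor": init(n_tokens, n_hid, rng),
        "discriminator": init(n_accents, n_hid, rng),
    }


def forward(params, x, token, accent, rho=0.1, delta=0.5):
    """One forward pass; rho and delta are assumed loss weights."""
    z = linear(params["encoder"], x)                # full embedding
    mask = sigmoid(linear(params["forget"], x))     # forget gate in (0, 1)
    z_tilde = [a * m for a, m in zip(z, mask)]      # bottlenecked embedding
    x_hat = linear(params["decoder"], z)            # reconstruct from unmasked z

    # Stand-in task loss: classify one token (a real STT system would use
    # a sequence loss such as CTC instead).
    task_loss = cross_entropy(softmax(linear(params["predictor"], z_tilde)), token)
    recon_loss = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    # The discriminator tries to recover the accent from the masked embedding.
    disc_loss = cross_entropy(
        softmax(linear(params["discriminator"], z_tilde)), accent
    )

    # The main model is rewarded when the discriminator does badly, which
    # pushes the mask to drop accent information; the discriminator itself
    # is trained separately to minimise disc_loss.
    main_loss = task_loss + rho * recon_loss - delta * disc_loss
    return {
        "mask": mask,
        "task_loss": task_loss,
        "recon_loss": recon_loss,
        "disc_loss": disc_loss,
        "main_loss": main_loss,
    }


if __name__ == "__main__":
    rng = random.Random(0)
    params = make_params(rng, n_in=4, n_hid=3, n_tokens=5, n_accents=2)
    out = forward(params, [0.2, -0.1, 0.4, 0.3], token=2, accent=1)
    print({k: v for k, v in out.items() if k != "mask"})
```

In a full training loop the two objectives would be optimised in alternation, updating the discriminator on `disc_loss` and the remaining components on `main_loss`; the sketch only shows how the terms fit together.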

