TS-RIR: Translated synthetic room impulse responses for speech augmentation


November 10, 2021


Anton Ratnarajah, Zhenyu Tang, Dinesh Manocha

From arXiv. Accepted to IEEE ASRU 2021. Source code is available at https://github.com/GAMMA-UMD/TS-RIR

We present a method for improving the quality of synthetic room impulse responses (RIRs) for far-field speech recognition. We bridge the gap between the fidelity of synthetic RIRs and that of real RIRs using our novel TS-RIRGAN architecture. Given a synthetic RIR in the form of raw audio, we use TS-RIRGAN to translate it into a real RIR. We also perform real-world sub-band room equalization on the translated synthetic RIR. Our overall approach improves the quality of synthetic RIRs by compensating for low-frequency wave effects, similar to those in real RIRs. We evaluate the improved synthetic RIRs on a far-field speech dataset created by convolving the LibriSpeech clean speech dataset [1] with RIRs and adding background noise. We show that far-field speech augmented using our improved synthetic RIRs reduces the word error rate by up to 19.9% in the Kaldi far-field automatic speech recognition benchmark [2].
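The augmentation step described in the abstract (convolving clean speech with an RIR, then adding background noise) can be illustrated with a minimal sketch. This is not the authors' code or the Kaldi recipe: the function name `augment_far_field`, the toy exponentially decaying RIR, the 16 kHz sampling rate, and the 20 dB signal-to-noise ratio are all assumptions chosen for illustration only.

```python
import numpy as np

def augment_far_field(clean, rir, noise, snr_db):
    """Reverberate clean speech with an RIR, then add noise at a target SNR (dB).

    Hypothetical helper for illustration; not taken from the TS-RIR repository.
    """
    clean = np.asarray(clean, dtype=np.float64)
    rir = np.asarray(rir, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Far-field simulation: linear convolution of speech with the RIR,
    # truncated to the original utterance length.
    reverberant = np.convolve(clean, rir)[: len(clean)]

    # Repeat or trim the noise so it covers the whole utterance.
    reps = int(np.ceil(len(reverberant) / len(noise)))
    noise = np.tile(noise, reps)[: len(reverberant)]

    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0.0 or noise_power == 0.0:
        return reverberant  # nothing to scale against

    # Scale the noise so that 10*log10(speech_power / scaled_noise_power) == snr_db.
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return reverberant + gain * noise

# Toy usage with synthetic signals (assumed 16 kHz sampling rate).
rng = np.random.default_rng(0)
sample_rate = 16000
clean = rng.standard_normal(sample_rate)               # stand-in for 1 s of speech
decay = np.exp(-np.arange(2000) / 400.0)               # toy exponential decay envelope
rir = rng.standard_normal(2000) * decay                # stand-in for a synthetic RIR
noise = rng.standard_normal(sample_rate // 2)          # stand-in for background noise
far_field = augment_far_field(clean, rir, noise, snr_db=20.0)
```

In practice the RIR would come from a simulator or a measured dataset (or, in this paper's setting, from the translated and equalized synthetic RIRs), and the clean speech and noise would be real recordings rather than random signals.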

