分散的时事代表对非侵入性初级语言的感知性预测 (Non-Intrusive Binaural Speech Intelligibility Prediction from Discrete Latent Representations) - 专知论文

会员服务 ·

0

INFORMS · 各向同性 · 离散化 · 噪声 · 泛函 ·

2021 年 11 月 24 日

Non-Intrusive Binaural Speech Intelligibility Prediction from Discrete Latent Representations

翻译：分散的时事代表对非侵入性初级语言的感知性预测

Alex F. McKinney,Benjamin Cauchi

from arxiv, 4 pages + 1 refs; 1 figure; submitted to IEEE SPL (pending review)

Non-intrusive speech intelligibility (SI) prediction from binaural signals is useful in many applications. However, most existing signal-based measures are designed to be applied to single-channel signals. Measures specifically designed to take into account the binaural properties of the signal are often intrusive - characterised by requiring access to a clean speech signal - and typically rely on combining both channels into a single-channel signal before making predictions. This paper proposes a non-intrusive SI measure that computes features from a binaural input signal using a combination of vector quantization (VQ) and contrastive predictive coding (CPC) methods. VQ-CPC feature extraction does not rely on any model of the auditory system and is instead trained to maximise the mutual information between the input signal and output features. The computed VQ-CPC features are input to a predicting function parameterized by a neural network. Two predicting functions are considered in this paper. Both feature extractor and predicting functions are trained on simulated binaural signals with isotropic noise. They are tested on simulated signals with isotropic and real noise. For all signals, the ground truth scores are the (intrusive) deterministic binaural STOI. Results are presented in terms of correlations and MSE and demonstrate that VQ-CPC features are able to capture information relevant to modelling SI and outperform all the considered benchmarks - even when evaluating on data comprising of different noise field types.

翻译：从二进制信号中进行无侵扰性言语感知性(SI)预测在许多应用中是有用的。然而,大多数现有基于信号的措施设计成适用于单声道信号的信号。专门设计考虑到信号二进制特性的措施往往具有侵扰性,其特点是需要获得清洁言语信号,通常依靠将两个渠道结合成单一声道信号,然后才作出预测。本文件建议采用一种非侵扰性SI测量,用矢量度和对比性预测编码(CPC)方法组合计算二进制输入信号的特征。VQ-CPC特征提取不依赖任何听觉系统模型,而是经过培训,以最大限度地利用输入信号和输出输出特性之间的相互信息。计算出的VQPC特征是为了将两个渠道合并成单一声道信号,然后通过神经网络进行参数的预测。本文中考虑了两种预测功能。用异位感波测波测的信号都经过模拟,甚至以不同类型模拟的预测编码编码编码进行测试。在模拟性信号的模拟性信号上,采用异质和正态的实地测度数据是真实性、真实性数据。

0

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

Google-EfficientNet v2来了！更快，更小，更强！

Google-EfficientNet v2来了！更快，更小，更强！

专知会员服务

19+阅读 · 2021年4月4日

机器学习组合优化

机器学习组合优化

专知会员服务

110+阅读 · 2021年2月16日

【AAAI2021】层次图胶囊网络

【AAAI2021】层次图胶囊网络

专知会员服务

84+阅读 · 2020年12月18日

哥伦比亚大学最新《机器学习》课程，Fall-B 2020 (Machine Learning)

专知会员服务

39+阅读 · 2020年11月3日

【图神经网络遇上符号计算】Graph Neural Networks Meet Neural-Symbolic Computing: A Survey and Perspective

【图神经网络遇上符号计算】Graph Neural Networks Meet Neural-Symbolic Computing: A Survey and Perspective

专知会员服务

44+阅读 · 2020年3月3日

【CAAI 2019】基于知识智能的机器人技能学习，清华大学|孙富春

【CAAI 2019】基于知识智能的机器人技能学习，清华大学|孙富春

专知会员服务

43+阅读 · 2019年12月1日

【CIKM 2019论文】重力启发式图自编码器定向链路预测（Gravity-Inspired Graph Autoencoders for Directed Link Prediction），Guillaume Salha，Stratis Limnios

【CIKM 2019论文】重力启发式图自编码器定向链路预测（Gravity-Inspired Graph Autoencoders for Directed Link Prediction），Guillaume Salha，Stratis Limnios

专知会员服务

28+阅读 · 2019年11月20日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Successor representations 强化学习表示的生物学启发

Successor representations 强化学习表示的生物学启发

CreateAMind

6+阅读 · 2019年9月5日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

人工智能 | ICAPS 2019等国际会议信息3条

人工智能 | ICAPS 2019等国际会议信息3条

Call4Papers

3+阅读 · 2018年9月28日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

人工智能 | AAAI 2019等国际会议信息7条

人工智能 | AAAI 2019等国际会议信息7条

Call4Papers

5+阅读 · 2018年9月3日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

人工智能 | 国际会议截稿信息9条

人工智能 | 国际会议截稿信息9条

Call4Papers

4+阅读 · 2018年3月13日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Lossy Compression for Lossless Prediction

Arxiv

0+阅读 · 2022年1月28日

A two-step backward compatible fullband speech enhancement system

A two-step backward compatible fullband speech enhancement system

Arxiv

0+阅读 · 2022年1月27日

Stock2Vec: An Embedding to Improve Predictive Models for Companies

Arxiv

0+阅读 · 2022年1月27日

Neural Representations for Modeling Variation in Speech

Arxiv

0+阅读 · 2022年1月26日

Explicit construction of the minimum error variance estimator for stochastic LTI state-space systems

Arxiv

0+阅读 · 2022年1月26日

Learning Discriminative Model Prediction for Tracking

Learning Discriminative Model Prediction for Tracking

Arxiv

6+阅读 · 2019年4月15日

Representation Learning with Contrastive Predictive Coding

Arxiv

6+阅读 · 2019年1月22日

Improved Speech Enhancement with the Wave-U-Net

Arxiv

8+阅读 · 2018年11月27日

Diverse Image-to-Image Translation via Disentangled Representations

Diverse Image-to-Image Translation via Disentangled Representations

Arxiv

13+阅读 · 2018年8月2日

Deep CTR Prediction in Display Advertising

Arxiv

4+阅读 · 2016年9月20日

VIP会员

文章信息

相关主题

相关VIP内容

Google-EfficientNet v2来了！更快，更小，更强！

Google-EfficientNet v2来了！更快，更小，更强！

专知会员服务

19+阅读 · 2021年4月4日

机器学习组合优化

机器学习组合优化

专知会员服务

110+阅读 · 2021年2月16日

【AAAI2021】层次图胶囊网络

【AAAI2021】层次图胶囊网络

专知会员服务

84+阅读 · 2020年12月18日

哥伦比亚大学最新《机器学习》课程，Fall-B 2020 (Machine Learning)

专知会员服务

39+阅读 · 2020年11月3日

【图神经网络遇上符号计算】Graph Neural Networks Meet Neural-Symbolic Computing: A Survey and Perspective

【图神经网络遇上符号计算】Graph Neural Networks Meet Neural-Symbolic Computing: A Survey and Perspective

专知会员服务

44+阅读 · 2020年3月3日

【CAAI 2019】基于知识智能的机器人技能学习，清华大学|孙富春

【CAAI 2019】基于知识智能的机器人技能学习，清华大学|孙富春

专知会员服务

43+阅读 · 2019年12月1日

【CIKM 2019论文】重力启发式图自编码器定向链路预测（Gravity-Inspired Graph Autoencoders for Directed Link Prediction），Guillaume Salha，Stratis Limnios

【CIKM 2019论文】重力启发式图自编码器定向链路预测（Gravity-Inspired Graph Autoencoders for Directed Link Prediction），Guillaume Salha，Stratis Limnios

专知会员服务

28+阅读 · 2019年11月20日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

《敏捷作战部署中枢纽-辐条基地选址优化研究》80页

《大型语言模型能否有效生成基于博弈论的网络安全场景？》

《乌克兰无人机产业：志愿者与政策在构建新兴无人机产业中的协同作用》最新报告

《人工智能辅助决策中的数据可视化：系统性综述》

相关资讯

Successor representations 强化学习表示的生物学启发

Successor representations 强化学习表示的生物学启发

CreateAMind

6+阅读 · 2019年9月5日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

人工智能 | ICAPS 2019等国际会议信息3条

人工智能 | ICAPS 2019等国际会议信息3条

Call4Papers

3+阅读 · 2018年9月28日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

人工智能 | AAAI 2019等国际会议信息7条

人工智能 | AAAI 2019等国际会议信息7条

Call4Papers

5+阅读 · 2018年9月3日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

人工智能 | 国际会议截稿信息9条

人工智能 | 国际会议截稿信息9条

Call4Papers

4+阅读 · 2018年3月13日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Lossy Compression for Lossless Prediction

Arxiv

0+阅读 · 2022年1月28日

A two-step backward compatible fullband speech enhancement system

A two-step backward compatible fullband speech enhancement system

Arxiv

0+阅读 · 2022年1月27日

Stock2Vec: An Embedding to Improve Predictive Models for Companies

Arxiv

0+阅读 · 2022年1月27日

Neural Representations for Modeling Variation in Speech

Arxiv

0+阅读 · 2022年1月26日

Explicit construction of the minimum error variance estimator for stochastic LTI state-space systems

Arxiv

0+阅读 · 2022年1月26日

Learning Discriminative Model Prediction for Tracking

Learning Discriminative Model Prediction for Tracking

Arxiv

6+阅读 · 2019年4月15日

Representation Learning with Contrastive Predictive Coding

Arxiv

6+阅读 · 2019年1月22日

Improved Speech Enhancement with the Wave-U-Net

Arxiv

8+阅读 · 2018年11月27日

Diverse Image-to-Image Translation via Disentangled Representations

Diverse Image-to-Image Translation via Disentangled Representations

Arxiv

13+阅读 · 2018年8月2日

Deep CTR Prediction in Display Advertising

Arxiv

4+阅读 · 2016年9月20日

微信扫码咨询专知VIP会员