Directional MCLP Analysis and Reconstruction for Spatial Speech Communication

This paper considers spatial speech communication, i.e., the reconstruction of a speech signal together with the speaker's relative position in the enclosure (reverberation information). The directional component, the diffuse component, and the source position are estimated at the transmitter, and perceptually effective reproduction is considered at the receiver. Spatially distributed microphone arrays are used for signal acquisition, with node-specific estimation of the signal and of its direction of arrival (DoA). A multi-channel linear prediction (MCLP) approach in the short-time Fourier transform (STFT) domain models the diffuse component, and a relative acoustic transfer function models the direct signal component. A distortionless array response constraint and a time-varying complex Gaussian source model are used to jointly estimate the source DoA and the constituent signal components, separately at each node. The source position is computed from the intersection of the DoA directions obtained at the nodes. The signal components computed at the node nearest to the estimated source position are taken as the signals for transmission. At the receiver, a four-channel loudspeaker (LS) setup is used for spatial reproduction, in which the spatial image of the source is reproduced relative to a chosen virtual listener position in the transmitter enclosure. The direct component is reproduced over the LS setup using vector base amplitude panning (VBAP), and the diffuse component is reproduced equally from all loudspeakers after decorrelation. This scheme of spatial speech communication is shown to be effective and more natural for hands-free telecommunication, through either loudspeaker listening or binaural headphone listening with head-related transfer function (HRTF) based presentation.
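
The two geometric steps named in the abstract, locating the source from intersecting DoA directions and panning the direct component with VBAP, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a 2-D geometry, exactly two nodes whose DoAs have already been estimated, and a hypothetical square loudspeaker layout at 45, 135, 225 and 315 degrees. The function names `intersect_doa` and `vbap_2d` are illustrative only.

```python
import math


def intersect_doa(p1, theta1, p2, theta2, eps=1e-9):
    """Intersect two 2-D bearing lines given node positions and DoA angles (radians).

    Returns the (x, y) intersection, or None when the bearings are nearly parallel.
    """
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(cross) < eps:
        return None
    # Solve p1 + t1*d1 = p2 + t2*d2 for t1 using 2-D cross products.
    r = (p2[0] - p1[0], p2[1] - p1[1])
    t1 = (r[0] * d2[1] - r[1] * d2[0]) / cross
    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])


def vbap_2d(source_az, speaker_az, eps=1e-9):
    """Pairwise 2-D VBAP gains for one source azimuth (radians).

    Only the two loudspeakers adjacent to the source direction receive non-zero
    gain; the gains are normalised so that their squares sum to one.
    """
    p = (math.cos(source_az), math.sin(source_az))
    n = len(speaker_az)
    order = sorted(range(n), key=lambda i: speaker_az[i] % (2 * math.pi))
    gains = [0.0] * n
    for k in range(n):
        i, j = order[k], order[(k + 1) % n]
        li = (math.cos(speaker_az[i]), math.sin(speaker_az[i]))
        lj = (math.cos(speaker_az[j]), math.sin(speaker_az[j]))
        det = li[0] * lj[1] - li[1] * lj[0]
        if abs(det) < eps:
            continue
        # Solve g_i*l_i + g_j*l_j = p by Cramer's rule.
        gi = (p[0] * lj[1] - p[1] * lj[0]) / det
        gj = (li[0] * p[1] - li[1] * p[0]) / det
        if gi >= -eps and gj >= -eps:
            gi, gj = max(gi, 0.0), max(gj, 0.0)
            norm = math.hypot(gi, gj)
            gains[i], gains[j] = gi / norm, gj / norm
            return gains
    raise ValueError("no loudspeaker pair encloses the source direction")


# Assumed example geometry: two nodes, a virtual listener, four loudspeakers.
source = intersect_doa((0.0, 0.0), math.radians(45), (4.0, 0.0), math.radians(135))
listener = (2.0, 0.0)
azimuth = math.atan2(source[1] - listener[1], source[0] - listener[0])
speakers = [math.radians(a) for a in (45, 135, 225, 315)]
direct_gains = vbap_2d(azimuth, speakers)
```

With more than two nodes, or with noisy DoA estimates, the bearings generally do not meet in a single point, so a least-squares intersection would be a natural replacement for `intersect_doa`. The diffuse component is not panned in this sketch; per the abstract it is fed equally to all loudspeakers after decorrelation.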



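Speech Communication is an interdisciplinary journal whose primary objective is to meet the need for rapid dissemination and thorough discussion of basic and applied research results. To establish a framework for relating results across the various areas of the field, emphasis is placed on viewpoints and topics of an interdisciplinary nature. Official website: http://dblp.uni-trier.de/db/journals/speech/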
