Self-supervised language models are very effective at predicting high-level cortical responses during language comprehension. However, the best current models of lower-level auditory processing in the human brain rely on either hand-constructed acoustic filters or representations from supervised audio neural networks. In this work, we capitalize on the progress of self-supervised speech representation learning (SSL) to create new state-of-the-art models of the human auditory system. Compared against acoustic baselines, phonemic features, and supervised models, representations from the middle layers of self-supervised models (APC, wav2vec, wav2vec 2.0, and HuBERT) consistently yield the best prediction performance for fMRI recordings within the auditory cortex (AC). Brain areas involved in low-level auditory processing exhibit a preference for earlier SSL model layers, whereas higher-level semantic areas prefer later layers. We show that these trends are due to the models' ability to encode information at multiple linguistic levels (acoustic, phonetic, and lexical) along their representation depth. Overall, these results show that self-supervised models effectively capture the hierarchy of information relevant to different stages of speech processing in human cortex.
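Below is a minimal sketch of the layerwise encoding approach the abstract describes: extract hidden states from every layer of a self-supervised speech model, bin them to the fMRI sampling rate, and fit a regularized linear model per layer to predict voxel responses. This is not the authors' released code; the checkpoint name, the TR length, the simple within-TR averaging (standing in for the delayed/FIR features typical of fMRI encoding models), and the synthetic data are all illustrative assumptions.

```python
import numpy as np
import torch
from sklearn.linear_model import Ridge
from transformers import Wav2Vec2Model

# Illustrative checkpoint; any of the SSL models named in the abstract
# (APC, wav2vec, wav2vec 2.0, HuBERT) could be substituted here.
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")
model.eval()

def layer_features(waveform_16khz: np.ndarray, tr_frames: int):
    """Return one feature matrix per model layer, binned to the fMRI TR."""
    with torch.no_grad():
        out = model(torch.tensor(waveform_16khz)[None, :].float(),
                    output_hidden_states=True)
    feats = []
    for layer in out.hidden_states:           # each layer: (1, T, D)
        x = layer[0].numpy()
        n_trs = x.shape[0] // tr_frames
        # Average frames within each TR-length bin (a crude stand-in for
        # proper hemodynamic delay features).
        feats.append(x[: n_trs * tr_frames].reshape(n_trs, tr_frames, -1).mean(1))
    return feats

# Synthetic stand-ins: 60 s of audio and fake voxel responses.
wav = np.random.randn(16_000 * 60).astype(np.float32)
tr_frames = 100                               # ~2 s TR at wav2vec 2.0's ~50 Hz frame rate
feats = layer_features(wav, tr_frames)
voxels = np.random.randn(feats[0].shape[0], 500)   # (TRs, voxels)

# One encoding model per layer; comparing held-out scores across layers is
# what reveals the early-layer vs. late-layer preferences described above.
for i, X in enumerate(feats):
    r2 = Ridge(alpha=1.0).fit(X[:20], voxels[:20]).score(X[20:], voxels[20:])
    print(f"layer {i:2d}: held-out R^2 = {r2:.3f}")
```

With real stimuli and recordings, the per-layer held-out scores plotted against layer depth would trace out the acoustic-to-semantic gradient the abstract reports, with auditory-cortex voxels peaking at earlier layers than higher-level language areas.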