Do self-supervised speech models develop human-like perception biases?

Self-supervised models for speech processing form representational spaces without using any external labels. Increasingly, they appear to be a feasible way of at least partially eliminating costly manual annotation, a problem of particular concern for low-resource languages. But what kind of representational spaces do these models construct? Human perception specializes to the sounds of listeners' native languages. Does the same thing happen in self-supervised models? We examine the representational spaces of three kinds of state-of-the-art self-supervised models (wav2vec 2.0, HuBERT and contrastive predictive coding (CPC)) and compare them with the perceptual spaces of French-speaking and English-speaking human listeners, both globally and taking account of the behavioural differences between the two language groups. We find that the CPC model shows a small native language effect, but that wav2vec 2.0 and HuBERT seem to develop a universal speech perception space which is not language specific. A comparison against the predictions of supervised phone recognisers suggests that all three self-supervised models capture relatively fine-grained perceptual phenomena, while supervised models are better at capturing coarser, phone-level effects of listeners' native language on perception.

