This paper models unsupervised learning of an identity-based pattern (or copying) in speech, called reduplication, from raw continuous data with deep convolutional neural networks. We use the ciwGAN architecture of Begu\v{s} (2021a; arXiv:2006.02951), in which learning of meaningful representations in speech emerges from a requirement that the CNNs generate informative data. We propose a technique to wug-test CNNs trained on speech and, based on four generative tests, argue that the network learns to represent an identity-based pattern in its latent space. By manipulating only two categorical variables in the latent space, we can actively turn an unreduplicated form into a reduplicated form with no other substantial changes to the output in the majority of cases. We also argue that the network extends the identity-based pattern to unobserved data. Exploration of how meaningful representations of identity-based patterns emerge in CNNs, and of how latent space variables outside the training range correlate with identity-based patterns in the output, has general implications for neural network interpretability.
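To make the latent-space manipulation described above concrete, the following is a minimal sketch of flipping categorical latent codes while holding the continuous latent variables fixed. The toy generator, latent dimensions, and specific code values below are illustrative assumptions, not the actual trained ciwGAN (WaveGAN-based) model from Begu\v{s} (2021a); only the manipulation pattern itself is the point.

```python
# Illustrative sketch only: ToyGenerator, LATENT_DIM, CODE_DIM, AUDIO_LEN,
# and the particular one-hot code assignments are hypothetical stand-ins.
import torch
import torch.nn as nn

LATENT_DIM = 100      # assumed: continuous latent variables z
CODE_DIM = 2          # two categorical latent variables c (as in the abstract)
AUDIO_LEN = 16384     # assumed: output waveform length

class ToyGenerator(nn.Module):
    """Stand-in for a trained generator G(z, c) -> waveform."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + CODE_DIM, 256),
            nn.ReLU(),
            nn.Linear(256, AUDIO_LEN),
            nn.Tanh(),
        )

    def forward(self, z, c):
        return self.net(torch.cat([z, c], dim=-1))

G = ToyGenerator()  # in practice, load the trained generator weights here

# Hold the continuous latents fixed so other properties of the output are
# kept constant, and change only the two categorical variables.
z = torch.randn(1, LATENT_DIM)
c_unreduplicated = torch.tensor([[1.0, 0.0]])  # assumed code for bare forms
c_reduplicated   = torch.tensor([[0.0, 1.0]])  # assumed code for reduplicated forms

with torch.no_grad():
    wav_base  = G(z, c_unreduplicated)  # unreduplicated output
    wav_redup = G(z, c_reduplicated)    # same z, reduplication code active

# Probing extrapolation: setting code values outside the training range
# (e.g., well above 1) tests how the latent variables correlate with the
# identity-based pattern beyond observed data.
c_extrapolated = torch.tensor([[0.0, 5.0]])
with torch.no_grad():
    wav_extreme = G(z, c_extrapolated)
```

The design choice this illustrates is that only the categorical code dimensions are altered between generations; any resulting change in the output (e.g., the appearance of a reduplicated sequence) can then be attributed to those variables rather than to the continuous latent noise.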