Music comprises a set of complex simultaneous events organized in time. In this paper we introduce a novel framework, which we call Deep Musical Information Dynamics, that combines two parallel streams: a low-rate latent representation stream, assumed to capture the dynamics of a thought process, and a higher-rate information dynamics stream derived from the musical data itself. Motivated by rate-distortion theories of human cognition, we propose a framework for exploring possible relations between imaginary anticipations existing in the listener's mind and the information dynamics of the musical surface itself. The model is demonstrated for symbolic (MIDI) data, since accounting for the acoustic surface would require many more layers to capture instrument properties and expressive performance inflections. The mathematical framework is based on variational encoding: it first establishes a high-rate representation of the musical observations, which is then reduced by a bit-allocation method into a parallel low-rate data stream. The combined loss considered here includes both the information rate of each stream, measured over its evolution in time, and the fidelity of encoding, measured as the mutual information between the high-rate and low-rate representations. In the simulations presented in the paper we are able to juxtapose aspects of latent (imaginary) surprisal with the surprisal of the musical surface in a manner that is quantifiable and computationally tractable. The set of computational tools is discussed in the paper, suggesting that a trade-off between compression and prediction is an important factor in the analysis and design of time-based generative models of music.
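The abstract does not say which bit-allocation method reduces the high-rate representation to the low-rate stream. As an illustration only, the sketch below uses reverse water-filling, the textbook rate-distortion allocation for independent Gaussian components. It assumes the latent dimensions are independent Gaussians with known variances, as in a standard diagonal-Gaussian variational encoder. The function name `reverse_water_filling` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def reverse_water_filling(variances, total_rate_bits, iters=100):
    """Split a bit budget across independent Gaussian latent dimensions.

    Hypothetical helper for illustration; the paper's actual method may differ.
    Returns (rates, distortions), one entry per latent dimension.
    """
    variances = np.asarray(variances, dtype=float)

    def rate_at(theta):
        # A dimension with variance above the "water level" theta is coded
        # with 0.5 * log2(var / theta) bits; the others receive zero bits.
        return np.maximum(0.0, 0.5 * np.log2(variances / theta))

    # The total rate decreases as theta grows, so bisect on theta
    # (geometrically, because theta spans many orders of magnitude).
    lo, hi = 1e-12, variances.max()
    for _ in range(iters):
        theta = np.sqrt(lo * hi)
        if rate_at(theta).sum() > total_rate_bits:
            lo = theta  # over budget: raise the water level
        else:
            hi = theta  # within budget: try a lower water level

    rates = rate_at(hi)
    # Per-dimension mean squared error: theta where the dimension is coded,
    # the full variance where it is dropped.
    distortions = np.minimum(hi, variances)
    return rates, distortions


rates, distortions = reverse_water_filling([4.0, 1.0, 0.25], total_rate_bits=2.0)
```

With this toy input the lowest-variance dimension receives no bits, which is the sense in which a low-rate stream discards detail that the high-rate stream retains. The variances and the 2-bit budget are made-up values chosen for readability.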