Machine learning models of music typically break up the task of composition into a chronological process, composing a piece of music in a single pass from beginning to end. By contrast, human composers write music in a nonlinear fashion, scribbling motifs here and there and often revisiting choices made earlier. To better approximate this process, we train a convolutional neural network to complete partial musical scores, and explore the use of blocked Gibbs sampling as an analogue to rewriting. Neither the model nor the generative procedure is tied to a particular causal direction of composition. Our model is an instance of orderless NADE (Uria et al., 2014), which allows more direct ancestral sampling. However, we find that Gibbs sampling greatly improves sample quality, which we demonstrate is due to some conditional distributions being poorly modeled. Moreover, we show that even the cheap approximate blocked Gibbs procedure of Yao et al. (2014) yields better samples than ancestral sampling, based on both log-likelihood and human evaluation.
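The rewriting procedure described above can be sketched in a few lines. The following is a minimal toy illustration of approximate blocked Gibbs sampling, not the paper's implementation: `model_conditionals` is a hypothetical stand-in for the trained convolutional model (here just a smoothed histogram of the visible context), and each Gibbs step masks a random block of positions and resamples them all at once from the model's conditionals, mimicking a composer revising parts of a partial score.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a "score" is a sequence of T pitches, each one of P classes.
T, P = 16, 8

def model_conditionals(score, mask):
    # Stand-in for a trained model that completes partial scores: returns
    # a categorical distribution over pitches for every masked position,
    # conditioned on the unmasked context. Here, purely for illustration,
    # we use a smoothed histogram of the visible pitches.
    visible = score[~mask]
    counts = np.bincount(visible, minlength=P) + 1.0
    probs = counts / counts.sum()
    return np.tile(probs, (int(mask.sum()), 1))

def blocked_gibbs(score, steps=100, block_frac=0.25):
    # Approximate blocked Gibbs in the spirit of Yao et al. (2014):
    # repeatedly mask a random block of positions and resample them
    # jointly (independently given the context) from the conditionals.
    score = score.copy()
    for _ in range(steps):
        mask = rng.random(T) < block_frac
        if not mask.any():
            continue
        probs = model_conditionals(score, mask)
        # Draw each masked position from its categorical distribution.
        score[mask] = [rng.choice(P, p=p) for p in probs]
    return score

initial = rng.integers(0, P, size=T)  # random initialization
sample = blocked_gibbs(initial)
```

With a real model, the conditionals would come from the network's output distribution over the masked variables; the annealed masking schedule and multiple voices of the actual system are omitted here for brevity.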