High-quality datasets for learning-based modelling of polyphonic symbolic music remain less readily accessible at scale than in other domains, such as language modelling or image classification. In particular, datasets that record how human listeners respond to the music samples are rare. The issue of scale persists as a general hindrance to breakthroughs in the field, while the lack of listener evaluation is especially relevant to the generative modelling problem space, where clear objective metrics that correlate strongly with qualitative success remain elusive. We propose the JS Fake Chorales, a dataset of 500 pieces generated by a new learning-based algorithm and provided in MIDI form. We take consecutive outputs from the algorithm and avoid cherry-picking, in order to validate the potential to scale this dataset further on demand. We conduct an online experiment for human evaluation, designed to be as fair to the listener as possible, and find that respondents were on average only 7% better than random guessing at distinguishing JS Fake Chorales from real chorales composed by JS Bach. Furthermore, we release anonymised data collected from the experiment alongside the MIDI samples, including the respondents' musical experience and how long they took to submit their response for each sample. Finally, we conduct ablation studies to demonstrate the effectiveness of the synthetic pieces for research in polyphonic music modelling, and find that we can improve on the state-of-the-art validation set loss for the canonical JSB Chorales dataset, using a known algorithm, simply by augmenting the training set with the JS Fake Chorales.
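The augmentation described above can be illustrated with a minimal sketch. It assumes each piece is a list of chords, each chord a tuple of MIDI pitch numbers, and that the real data is split under the keys `train`, `valid` and `test`; these names, the function `augment_training_set` and the toy data are hypothetical and are not taken from the dataset's own tooling. The point it shows is that synthetic pieces are added to the training split only, so validation loss is still measured on real chorales.

```python
from typing import Dict, List, Tuple

Chord = Tuple[int, ...]   # MIDI pitches sounding at one time step (assumed format)
Piece = List[Chord]       # one chorale as a sequence of chords


def augment_training_set(
    real_splits: Dict[str, List[Piece]],
    synthetic_pieces: List[Piece],
) -> Dict[str, List[Piece]]:
    """Return new splits with synthetic pieces appended to 'train' only.

    The 'valid' and 'test' splits are left untouched, so any reported
    validation loss is still computed on real data.
    """
    augmented = dict(real_splits)  # shallow copy; the input dict is not mutated
    augmented["train"] = list(real_splits["train"]) + list(synthetic_pieces)
    return augmented


# Toy data standing in for real and synthetic chorales (hypothetical values).
real = {
    "train": [[(60, 64, 67)]],
    "valid": [[(62, 65, 69)]],
    "test": [[(59, 62, 67)]],
}
fake = [[(57, 60, 64)], [(55, 59, 62)]]

augmented = augment_training_set(real, fake)
print(len(augmented["train"]), len(augmented["valid"]))
```

Keeping the evaluation splits free of synthetic pieces is what makes a comparison against previously reported validation losses meaningful.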