Although deep learning has revolutionized music generation, existing methods for structured melody generation follow an end-to-end, left-to-right, note-by-note generative paradigm and treat every note as equally important. Here, we present WuYun, a knowledge-enhanced deep learning architecture for improving the structure of generated melodies. WuYun first generates the most structurally important notes to construct a melodic skeleton and then infills the skeleton with dynamic decorative notes to produce a full-fledged melody. Specifically, we use music domain knowledge to extract melodic skeletons and employ sequence learning to reconstruct them; the reconstructed skeletons serve as additional knowledge that guides the melody generation process. We demonstrate that WuYun generates melodies with better long-term structure and musicality, and that it outperforms other state-of-the-art methods by 0.51 on average across all subjective evaluation metrics. Our study provides a multidisciplinary lens for designing hierarchical melodic structures and for bridging the gap between data-driven and knowledge-based approaches in numerous music generation tasks.
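The two-stage idea described above (extract or generate a skeleton of structurally important notes, then infill decorative notes around it) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not WuYun's actual method: the `Note` type, the `extract_skeleton` and `infill` helpers, the "downbeat or long note" rule standing in for music domain knowledge, and the linear pitch interpolation standing in for the learned sequence models.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Note:
    onset: float     # position in beats from the start of the piece
    duration: float  # length in beats
    pitch: int       # MIDI pitch number


def extract_skeleton(melody: List[Note], beats_per_bar: int = 4,
                     long_note: float = 2.0) -> List[Note]:
    """Hypothetical rule: keep notes that start on a downbeat or are long."""
    return [n for n in melody
            if n.onset % beats_per_bar == 0 or n.duration >= long_note]


def infill(skeleton: List[Note], step: float = 1.0) -> List[Note]:
    """Hypothetical stage 2: keep every skeleton note and fill each gap
    between consecutive skeleton notes with linearly interpolated pitches.
    A real system would use a learned model here."""
    melody: List[Note] = []
    for cur, nxt in zip(skeleton, skeleton[1:]):
        melody.append(cur)
        start = cur.onset + cur.duration       # first free beat after `cur`
        n_fill = int((nxt.onset - start) // step)
        for i in range(n_fill):
            frac = (i + 1) / (n_fill + 1)      # relative position in the gap
            pitch = round(cur.pitch + frac * (nxt.pitch - cur.pitch))
            melody.append(Note(start + i * step, step, pitch))
    melody.extend(skeleton[-1:])               # last skeleton note, if any
    return melody


# A short made-up melody in 4/4 time.
melody = [
    Note(0.0, 1.0, 60), Note(1.0, 1.0, 62), Note(2.0, 1.0, 64),
    Note(3.0, 1.0, 65), Note(4.0, 2.0, 67), Note(6.0, 1.0, 65),
    Note(7.0, 1.0, 64), Note(8.0, 4.0, 60),
]
skeleton = extract_skeleton(melody)  # stage 1: structurally important notes
full = infill(skeleton)              # stage 2: decorate around the skeleton
```

The point of the sketch is the ordering of decisions: the skeleton notes are fixed first and are guaranteed to survive into the final melody, while the decorative notes are chosen afterwards, conditioned on the skeleton notes on either side of them.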