In music generation, the artistic factor plays a major role and poses a great challenge for AI. Previous work that used adversarial training to produce new music pieces, or that modeled the compatibility of musical elements (beats, tempo, musical stems), demonstrated that this task can be learned. However, that work was limited to generating mashups or to learning features from tempo and key distributions to produce similar patterns. The Compound Word Transformer recast music generation as a sequence generation problem over musical events defined by compound words. These musical events give a more accurate description of note progression, chord changes, harmony, and the artistic factor. The objective of this project is to implement a Multi-Genre Transformer that learns to produce music pieces through a more adaptive learning process on a more challenging task, in which the genre or form of the composition is also considered. We built a multi-genre compound word dataset and implemented a linear transformer trained on it. We call the resulting model the Multi-Genre Transformer; it generates full-length new musical pieces that are diverse and comparable to the original tracks. The model trains 2-5 times faster than the other models discussed.