Along with the evolution of music technology, a large number of styles, or "subgenres," of Electronic Dance Music (EDM) have emerged in recent years. While the task of distinguishing EDM from non-EDM has often been studied in the context of music genre classification, little work has been done on the more challenging task of EDM subgenre classification. The state-of-the-art model is based on extremely randomized trees and could be improved by deep learning methods. In this paper, we extend the state-of-the-art music auto-tagging model "short-chunkCNN+Resnet" to EDM subgenre classification, adding two mid-level tempo-related feature representations, the Fourier tempogram and the autocorrelation tempogram. We also explore two fusion strategies, early fusion and late fusion, to aggregate the two types of tempograms. We evaluate the proposed models on a large dataset of 75,000 songs covering 30 EDM subgenres, and show that adopting deep learning models and tempo features indeed leads to higher classification accuracy.
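To illustrate one of the two tempo features mentioned above, the following is a minimal NumPy sketch of an autocorrelation tempogram: a windowed autocorrelation of an onset-strength envelope, where each column is a time frame and each row a candidate tempo period (lag). The function name and parameter choices here are illustrative only and do not reproduce the paper's implementation; libraries such as librosa provide production versions of both tempogram types.

```python
import numpy as np

def autocorrelation_tempogram(onset_env, win_length=384, hop_length=1):
    """Sketch of a local autocorrelation tempogram.

    For each analysis window over the onset envelope, compute its
    autocorrelation over all non-negative lags up to win_length.
    Returns an array of shape (win_length, n_frames): rows index lag
    (tempo period in frames), columns index time.
    """
    n = len(onset_env)
    # Pad so every frame has a full, centered window.
    padded = np.pad(onset_env, (win_length // 2, win_length // 2))
    frames = []
    for start in range(0, n, hop_length):
        w = padded[start:start + win_length]
        # Full autocorrelation; keep non-negative lags only.
        ac = np.correlate(w, w, mode="full")[win_length - 1:]
        # Normalize by zero-lag energy so peaks are comparable over time.
        frames.append(ac / (ac[0] + 1e-9))
    return np.array(frames).T

# Toy usage: an onset envelope with a pulse every 8 frames should
# produce a strong tempogram ridge at lag 8.
env = np.zeros(200)
env[::8] = 1.0
T = autocorrelation_tempogram(env, win_length=64)
```

The Fourier tempogram is built analogously, but applies a short-time Fourier transform to the onset envelope instead of an autocorrelation, so its rows index tempo frequency rather than period.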