ByteCover: Cover Song Identification via Multi-Loss Training

This paper presents ByteCover, a new feature learning method for cover song identification (CSI). ByteCover builds on the classical ResNet model and adds two major improvements that strengthen it for CSI. First, we integrate instance normalization (IN) and batch normalization (BN) into IBN blocks, which are the main components of our ResNet-IBN model. With IBN blocks, the CSI model can learn features that are invariant to changes in musical attributes such as key, tempo, timbre, and genre while preserving version information. Second, we employ the BNNeck method to enable multi-loss training, so that the model jointly optimizes a classification loss and a triplet loss; this ensures inter-class discrimination and intra-class compactness of cover songs at the same time. Experiments on multiple datasets demonstrate the effectiveness and efficiency of ByteCover; on the Da-TACOS dataset, ByteCover outperformed the best competing system by 20.9%.
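The multi-loss idea can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. It assumes the usual BNNeck arrangement, in which the triplet loss is computed on the embedding before batch normalization and the classification loss on a linear classifier applied after it. The function names, the margin of 0.3, the equal weighting of the two losses, and the omission of learned BN scale and shift are all assumptions made for illustration.

```python
import math


def batch_norm(batch, eps=1e-5):
    """Normalize each feature dimension across the batch (no learned scale/shift)."""
    n, d = len(batch), len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(d)]
    variances = [sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(d)]
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(d)]
        for row in batch
    ]


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using the log-sum-exp trick."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin (margin is assumed)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def multi_loss(embeddings, labels, weights, triplets, margin=0.3):
    """Return (total, classification, triplet) losses for one batch.

    embeddings: list of feature vectors taken before the BN layer
    labels:     class index (song/clique id) per embedding
    weights:    bias-free linear classifier, one row of weights per class
    triplets:   list of (anchor, positive, negative) index tuples
    """
    # Triplet loss acts on the raw embeddings (intra-class compactness).
    tri = sum(
        triplet_loss(embeddings[a], embeddings[p], embeddings[n], margin)
        for a, p, n in triplets
    ) / len(triplets)

    # BNNeck: the classifier only sees batch-normalized embeddings
    # (inter-class discrimination).
    normed = batch_norm(embeddings)
    ce = 0.0
    for feat, label in zip(normed, labels):
        logits = [sum(w * x for w, x in zip(row, feat)) for row in weights]
        ce += cross_entropy(logits, label)
    ce /= len(labels)

    # Equal weighting of the two losses is an assumption of this sketch.
    return ce + tri, ce, tri


if __name__ == "__main__":
    # Toy batch: two versions of song 0 and two versions of song 1.
    emb = [[0.0, 0.0], [0.0, 1.0], [3.0, 4.0], [3.0, 5.0]]
    labels = [0, 0, 1, 1]
    weights = [[0.1, -0.2], [-0.1, 0.2]]  # hypothetical classifier weights
    triplets = [(0, 1, 2), (2, 3, 0)]
    print(multi_loss(emb, labels, weights, triplets))
```

Placing the normalization layer between the two losses lets the triplet loss shape distances in the raw embedding space while the classifier operates on normalized features, so the two objectives do not constrain the same representation directly.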

