Melody extraction is a vital music information retrieval task owing to its potential applications in music pedagogy and the music industry. It is a notoriously challenging task because of the presence of background instruments, and because the melodic source often exhibits characteristics similar to those of the accompanying instruments. The background accompaniment interfering with the vocals makes extracting the melody from the mixture signal considerably more difficult. Until recently, classical signal processing-based melody extraction methods were the most popular among researchers. The ability of deep learning models to scale to large datasets and to learn features automatically by exploiting spatial and temporal dependencies has inspired many researchers to adopt them for melody extraction. In this paper, an attempt has been made to review the up-to-date data-driven deep learning approaches for melody extraction from polyphonic music. The available deep models are categorized by the type of neural network used and by the output representation used to predict the melody. Further, the architectures of 25 melody extraction models are briefly presented. The loss functions used to optimize the model parameters are broadly grouped into four categories, and the loss functions adopted by the various melody extraction models are briefly described. The input representations adopted by the models and their parameter settings are also described in detail. A section on the explainability of black-box melody extraction deep neural networks is included. The performance of the 25 melody extraction methods is compared, and possible future directions to explore and improve melody extraction methods are presented.