Modeling the Compatibility of Stem Tracks to Generate Music Mashups

From arXiv. This is a preprint of the paper accepted by AAAI-21. Please cite the version included in the Proceedings of the 35th AAAI Conference on Artificial Intelligence.

A music mashup combines audio elements from two or more songs to create a new work. To reduce the time and effort required to make mashups, researchers have developed algorithms that predict the compatibility of audio elements. Prior work has focused on mixing unaltered excerpts, but advances in source separation enable the creation of mashups from isolated stems (e.g., vocals, drums, bass). In this work, we take advantage of separated stems not just for creating mashups, but for training a model that predicts the mutual compatibility of groups of excerpts, using self-supervised and semi-supervised methods. Specifically, we first build a random mashup creation pipeline that combines stem tracks obtained via source separation, with key and tempo automatically adjusted to match, since matching key and tempo are prerequisites for high-quality mashups. To train a model to predict compatibility, we use stem tracks obtained from the same song as positive examples, and random combinations of stems with key and/or tempo unadjusted as negative examples. To improve the model and use more data, we also train on "average" examples: random combinations with matching key and tempo, which we treat as unlabeled data because their true compatibility is unknown. To determine whether the combined signal or the set of stem signals is more indicative of the quality of the result, we experiment with two model architectures and train them using a semi-supervised learning technique. Finally, we conduct objective and subjective evaluations of the system, comparing it to a standard rule-based system.
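The three kinds of training examples described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Stem` and `Example` types, the function names, and the integer/float representation of key and tempo are all hypothetical, and the "adjustment" only rewrites metadata where a real system would pitch-shift and time-stretch audio.

```python
import random
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Stem:
    # Hypothetical stand-in for a separated stem; real stems carry audio.
    song_id: str
    part: str     # e.g. "vocals", "drums", "bass"
    key: int      # assumed encoding: pitch class 0-11
    tempo: float  # assumed unit: beats per minute


@dataclass(frozen=True)
class Example:
    stems: List[Stem]
    label: Optional[int]  # 1 = positive, 0 = negative, None = unlabeled


# Assumed layout: library[song_id][part] -> Stem
Library = Dict[str, Dict[str, Stem]]


def positive_example(library: Library, rng: random.Random) -> Example:
    """All stems come from one song, so they are labeled compatible."""
    song_id = rng.choice(sorted(library))
    return Example(list(library[song_id].values()), 1)


def random_mix(library: Library, parts: List[str], rng: random.Random,
               match_key_tempo: bool) -> Example:
    """Draw one stem per part, each from a different song.

    Unadjusted mixes are labeled negative. Mixes adjusted to a shared
    key and tempo are left unlabeled, since their compatibility is unknown.
    """
    song_ids = rng.sample(sorted(library), len(parts))
    stems = [library[s][p] for s, p in zip(song_ids, parts)]
    if not match_key_tempo:
        return Example(stems, 0)
    ref = stems[0]  # assumption: the first stem sets the target key/tempo
    adjusted = [replace(s, key=ref.key, tempo=ref.tempo) for s in stems]
    return Example(adjusted, None)
```

In this sketch a negative example may still share a key or tempo by coincidence; a fuller version would likely check for that and resample.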

