MTM Dataset for Joint Representation Learning among Sheet Music, Lyrics, and Musical Audio

We introduce the Music Ternary Modalities Dataset (MTM Dataset), created by our group to learn joint representations among three music modalities in music information retrieval (MIR), covering three types of cross-modal retrieval. Learning joint representations for cross-modal retrieval among three modalities has been constrained by the limited availability of large datasets that include three or more modalities. The MTM Dataset aims to overcome this constraint by extending music notes to sheet music and music audio, and by building a fine-grained alignment between music notes and syllables, so that the dataset can be used to learn joint representations across multimodal music data. The MTM Dataset provides three modalities (sheet music, lyrics, and music audio) together with their features extracted by pre-trained models. In this paper, we describe the dataset and how it was built, and evaluate some baselines for cross-modal retrieval tasks. The dataset and usage examples are available at https://github.com/MorningBooks/MTM-Dataset.
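As a rough illustration of what a cross-modal retrieval evaluation over such features could look like, the sketch below ranks items of one modality against queries from another by cosine similarity and reports recall@k. It is not the authors' baseline: the function names `l2_normalize` and `recall_at_k`, the random stand-in features, and the assumption that both modalities have already been projected into a shared space of equal dimension (with row i of each matrix describing the same song) are all hypothetical.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-12)


def recall_at_k(query_feats, gallery_feats, k=1):
    """Fraction of queries whose paired item (same row index) is in the top k.

    Assumption: row i of query_feats and row i of gallery_feats come from
    the same song, and both matrices share one embedding dimension.
    """
    q = l2_normalize(np.asarray(query_feats, dtype=float))
    g = l2_normalize(np.asarray(gallery_feats, dtype=float))
    sims = q @ g.T                              # (n_queries, n_gallery)
    topk = np.argsort(-sims, axis=1)[:, :k]     # best matches first
    targets = np.arange(q.shape[0])[:, None]
    return float(np.mean(np.any(topk == targets, axis=1)))


# Hypothetical stand-in features; real ones would come from the dataset files.
rng = np.random.default_rng(0)
audio = rng.normal(size=(5, 8))
lyrics = audio + 0.01 * rng.normal(size=(5, 8))  # noisy paired embeddings

print("audio -> lyrics recall@1:", recall_at_k(audio, lyrics, k=1))
```

The same function can be called for each pair of modalities (audio and lyrics, audio and sheet music, lyrics and sheet music) in both directions, which would cover the three types of retrieval the abstract mentions.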


