While there are many music datasets with emotion labels in the literature, they cannot be used for research on symbolic-domain music analysis or generation, as most of them provide audio files only. In this paper, we present the EMOPIA (pronounced `yee-m\`{o}-pi-uh') dataset, a shared multi-modal (audio and MIDI) database focusing on perceived emotion in pop piano music, to facilitate research on various tasks related to music emotion. The dataset contains 1,087 music clips from 387 songs, with clip-level emotion labels annotated by four dedicated annotators. Since the dataset is not restricted to one clip per song, it can also be used for song-level analysis. We present the methodology for building the dataset, covering the song list curation, clip selection, and emotion annotation processes. Moreover, we prototype use cases on clip-level music emotion classification and emotion-based symbolic music generation by training and evaluating corresponding models on the dataset. The results demonstrate the potential of EMOPIA for use in future exploration of emotion-related MIR tasks for piano music.