Artist similarity plays an important role in organizing and understanding large collections of music, and in facilitating discovery within them. In this paper, we present a hybrid approach to computing similarity between artists using graph neural networks trained with triplet loss. The novelty of the graph neural network architecture lies in combining the topology of a graph of artist connections with content features to embed artists into a vector space that encodes similarity. To evaluate the proposed method, we compile the new OLGA dataset, which contains artist similarities from AllMusic together with content features from AcousticBrainz. With 17,673 artists, it is, to date, the largest academic artist similarity dataset that includes content-based features. We also demonstrate the scalability of our approach by experimenting with a much larger proprietary dataset. Results show that the proposed approach outperforms current state-of-the-art methods for music similarity. Finally, we hope that the OLGA dataset will facilitate research on data-driven models for artist similarity.
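To make the core idea concrete, the following is a minimal sketch, not the model described in this paper. It assumes a single round of unweighted neighbor averaging in place of learned graph neural network layers, Euclidean distance, and a margin of 0.2. The artist keys, feature values, and function names are hypothetical and chosen only for illustration.

```python
import math


def mean_aggregate(features, neighbors):
    """One round of neighbor averaging (an assumed stand-in for a learned GNN layer).

    Each artist's new vector is the mean of its own content features and
    those of the artists it is connected to in the graph.
    """
    out = {}
    for node, vec in features.items():
        group = [vec] + [features[n] for n in neighbors.get(node, [])]
        out[node] = [sum(col) / len(group) for col in zip(*group)]
    return out


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss; the margin value is an assumption."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical toy data: content features per artist and graph connections.
features = {"a": [1.0, 0.0], "b": [0.8, 0.2], "c": [0.0, 1.0]}
neighbors = {"a": ["b"], "b": ["a"], "c": []}

emb = mean_aggregate(features, neighbors)

# Connected artists "a" and "b" are pulled together, so using "b" as the
# positive and the unconnected "c" as the negative already satisfies the margin.
loss_good = triplet_loss(emb["a"], emb["b"], emb["c"])

# Swapping positive and negative violates the margin and yields a positive loss.
loss_bad = triplet_loss(emb["a"], emb["c"], emb["b"])
```

In this toy setting the loss is zero when the positive example is the connected artist and positive when the roles are swapped. During training, that difference is the signal that moves similar artists closer together in the embedding space.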