Version identification (VI) systems now offer accurate and scalable solutions for detecting different renditions of a musical composition, allowing the use of these systems in industrial applications and throughout the wider music ecosystem. Such use can have an important impact on various stakeholders regarding recognition and financial benefits, including how royalties are circulated for digital rights management. In this work, we take a step toward acknowledging this impact and consider VI systems as socio-technical systems rather than isolated technologies. We propose a framework for quantifying performance disparities across 5 systems and 6 relevant side attributes: gender, popularity, country, language, year, and prevalence. We also consider 3 main stakeholders for this particular information retrieval use case: the performing artists of query tracks, those of reference (original) tracks, and the composers. By categorizing the recordings in our dataset using such attributes and stakeholders, we analyze whether the considered VI systems show any implicit biases. We find signs of disparities in identification performance for most of the groups we include in our analyses. Moreover, we also find that learning- and rule-based systems behave differently for some attributes, which suggests an additional dimension to consider along with accuracy and scalability when evaluating VI systems. Lastly, we share our dataset with attribute annotations to encourage VI researchers to take these aspects into account while building new systems.
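As a rough illustration of what quantifying a performance disparity for one side attribute could look like, the sketch below groups per-query scores by an attribute value and reports the gap between the best- and worst-performing groups. It is a minimal sketch under stated assumptions, not the framework proposed in this work: the function names, the `average_precision` field, the max-minus-min gap measure, and the toy values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def group_performance(records, attribute, metric="average_precision"):
    """Return the mean per-query metric for each value of a side attribute.

    `records` is a list of dicts, one per query track (hypothetical layout).
    """
    buckets = defaultdict(list)
    for rec in records:
        buckets[rec[attribute]].append(rec[metric])
    return {group: mean(values) for group, values in buckets.items()}


def disparity(group_scores):
    """Gap between the best- and worst-performing groups (an assumed measure)."""
    return max(group_scores.values()) - min(group_scores.values())


# Toy, made-up per-query results for a single VI system (not real data).
records = [
    {"language": "en", "average_precision": 0.9},
    {"language": "en", "average_precision": 0.7},
    {"language": "es", "average_precision": 0.5},
    {"language": "es", "average_precision": 0.6},
]

scores = group_performance(records, "language")
gap = disparity(scores)
print(scores, gap)
```

The same grouping could be repeated for each attribute (gender, popularity, country, language, year, prevalence) and each stakeholder, with a statistical test in place of the raw gap if group sizes differ substantially.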