In this article, we review the key ideas and approaches proposed in 20 years of scientific literature on musical version identification (VI) and connect them to current practice. For more than a decade, VI systems suffered from an accuracy-scalability trade-off: attempts to increase accuracy typically resulted in cumbersome, non-scalable systems. Recent years, however, have witnessed the rise of deep learning-based approaches that take a step toward bridging the accuracy-scalability gap, yielding systems that can realistically be deployed in industrial applications. Although this trend has increased the number of researchers and institutions working on VI, it may also obscure the literature that predates the deep learning era. To appreciate two decades of novel ideas in VI research and to facilitate building better systems, we review some of the successful concepts and applications proposed in the literature and study their evolution over the years.