Music Structure Analysis: Symbolic Music Structure Analysis with Graph Representations and Changepoint Detection Methods

Music Structure Analysis is an open research task in Music Information Retrieval (MIR). Several previous works have attempted to segment music in the audio and symbolic domains; however, identifying and segmenting music structure at different levels remains an open research problem. In this work we propose three methods, two of which are novel graph-based algorithms, that aim to segment symbolic music by its form or structure: Norm, G-PELT and G-Window. We performed an ablation study on two public datasets with different forms or structures, comparing the methods while varying their parameter values and evaluating their performance across different music styles. We found that encoding symbolic music as graph representations and computing the novelty of the adjacency matrices obtained from those graphs represents the structure of symbolic music pieces well, without the need to extract features from them. We are able to detect boundaries with an online unsupervised changepoint detection method, reaching an F_1 of 0.5640 at a 1-bar tolerance on one of the public datasets used for testing. We also report the performance of the algorithms at different levels of structure (high, medium and low) to show how the parameters of the proposed methods have to be adjusted depending on the level. We added the best-performing method and its parameters for each structure level to musicaiz, an open-source Python package, to facilitate the reproducibility and usability of this work. We hope that these methods can be used to improve other MIR tasks such as music generation with structure, music classification or key change detection.
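
To make the idea of "novelty of an adjacency matrix" and "F_1 at a tolerance" more concrete, here is a minimal sketch in Python. It is not the paper's G-PELT or G-Window implementation and it does not use the musicaiz API. It assumes a toy graph in which two notes are connected when they share the same pitch, scores novelty with a checkerboard kernel slid along the diagonal of the adjacency matrix, and picks local maxima as boundaries. All function names (`adjacency_from_pitches`, `checkerboard_novelty`, `pick_boundaries`, `boundary_f1`) and parameters are hypothetical, and positions are note indices rather than bars.

```python
import numpy as np


def adjacency_from_pitches(pitches):
    """Toy graph (an assumption): notes i and j are linked if they share a pitch."""
    p = np.asarray(pitches)
    adj = (p[:, None] == p[None, :]).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return adj


def checkerboard_novelty(adj, half_width):
    """Slide a checkerboard kernel along the diagonal of the adjacency matrix.

    The score is high when each side of position t is well connected
    internally but poorly connected to the other side.
    """
    n = adj.shape[0]
    sign = np.concatenate([-np.ones(half_width), np.ones(half_width)])
    kernel = np.outer(sign, sign)  # +1 within each side, -1 across sides
    novelty = np.zeros(n)
    for t in range(half_width, n - half_width + 1):
        window = adj[t - half_width:t + half_width, t - half_width:t + half_width]
        novelty[t] = np.sum(kernel * window)
    return novelty


def pick_boundaries(novelty, threshold=0.0):
    """Return indices of local maxima of the novelty curve above `threshold`."""
    return [
        t
        for t in range(1, len(novelty) - 1)
        if novelty[t] > threshold
        and novelty[t] > novelty[t - 1]
        and novelty[t] >= novelty[t + 1]
    ]


def boundary_f1(predicted, reference, tolerance):
    """F1 score where a prediction counts as a hit if it lies within
    `tolerance` of a not-yet-matched reference boundary (greedy matching,
    a simplification of the optimal matching used by evaluation toolkits)."""
    unmatched = list(reference)
    hits = 0
    for p in predicted:
        match = next((r for r in unmatched if abs(p - r) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)
            hits += 1
    if hits == 0:
        return 0.0
    precision = hits / len(predicted)
    recall = hits / len(reference)
    return 2 * precision * recall / (precision + recall)


# Synthetic piece: a 12-note section on one pitch set, then 12 notes on another.
pitches = [60, 62, 64] * 4 + [70, 72, 74] * 4
novelty = checkerboard_novelty(adjacency_from_pitches(pitches), half_width=6)
boundaries = pick_boundaries(novelty)
print(boundaries, boundary_f1(boundaries, reference=[12], tolerance=1))
```

In this sketch the tolerance is expressed in the same units as the boundaries, so reproducing the 1-bar tolerance from the abstract would require converting note positions to bar indices first. The kernel half-width plays a role similar to the parameters the abstract says must be tuned per structure level: a wider kernel responds to higher-level sections, a narrower one to lower-level ones.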
