Music can be represented in multiple forms: as audio, in a recording of a performance; symbolically, in a computer-readable score; or as an image, in a scan of the sheet music. Music synchronisation provides a way to navigate among these representations in a unified manner by generating an accurate mapping between them, which makes it applicable to a myriad of domains such as music education, performance analysis, automatic accompaniment and music editing. Traditional synchronisation methods compute alignments using knowledge-driven and stochastic approaches, typically employing handcrafted features. These methods often fail to generalise to different instruments, acoustic environments and recording conditions, and normally assume complete structural agreement between the performances and the scores. This PhD furthers performance-score synchronisation research by proposing data-driven, context-aware alignment approaches on three fronts. Firstly, I replace the handcrafted features with a metric-learning-based approach that is adaptable to different acoustic settings and performs well in data-scarce conditions. Secondly, I address the handling of structural differences between the performances and scores, a common limitation of standard alignment methods. Finally, I eschew the reliance on both feature engineering and dynamic programming, and propose a completely data-driven synchronisation method that computes alignments using a neural framework whilst also being robust to structural differences between the performances and scores.