The main challenges of Optical Music Recognition (OMR) stem from the nature of written music: its complexity and the difficulty of finding an appropriate data representation. This paper provides a first look at DoReMi, an OMR dataset that addresses these challenges, and a baseline object detection model to assess its utility. Researchers often approach OMR as a sequence of small stages, because existing data rarely support broader research. We examine the possibility of changing this tendency by providing richer metadata. Our approach complements existing research; hence DoReMi allows harmonisation with two existing datasets, DeepScores and MUSCIMA++. DoReMi was generated using music notation software and includes over 6,400 printed sheet music images with accompanying metadata useful in OMR research. Our dataset provides OMR metadata, MIDI, MEI, MusicXML and PNG files, each aiding a different stage of OMR. We obtain 64% mean average precision (mAP) in object detection using half of the data. Further work includes re-iterating through the creation process to support custom OMR models. While we do not claim to have solved the main challenges of OMR, this dataset opens a new course of discussion that should ultimately aid that goal.