Studying under-represented music traditions within the scope of music information retrieval (MIR) is crucial, not only for developing novel analysis tools, but also for unveiling musical functions that might prove useful in studying world musics. This paper presents a dataset of Greek Traditional and Folk music comprising 1570 pieces, totaling around 80 hours of data. The dataset provides timestamped YouTube links for retrieving audio and video, along with rich metadata on instrumentation, geography and genre, among others. The content was collected from a Greek documentary series available online, in which academics present the music traditions of Greece through live music and dance performances, along with discussions of the social, cultural and musicological aspects of the presented music. As a result, the source offers a wealth of descriptions covering aspects such as musical genre, place of origin and musical instruments. In addition, the audio was recorded under strict production-level specifications for recording equipment, yielding very clean and homogeneous audio content. Apart from presenting the dataset in detail, we propose a baseline deep-learning classification approach for recognizing these musicological attributes. The dataset, the baseline classification methods and the models are provided in public repositories. Future directions for further refining the dataset are also discussed.