Despite the recent increase in research on artificial intelligence for music, the correlations between key components of lyrics and rhythm, such as keywords, stressed syllables, and strong beats, are rarely studied. This is likely due to challenges such as audio misalignment, inaccuracies in syllabic identification, and, most importantly, the need for cross-disciplinary knowledge. To address this gap, we propose a novel multimodal lyrics-rhythm matching approach that matches key components of lyrics and music with each other without any language limitations. We work from audio rather than from sheet music with its readily available metadata, which poses more challenges but makes the method more flexible to apply. Furthermore, our approach generates several patterns across multiple modalities, including musical strong beats, lyrical syllables, auditory changes in a singer's pronunciation, and especially lyrical keywords, which are used to match key lyrical elements with key rhythmic elements. The approach not only provides a way to study auditory lyrics-rhythm correlations, including efficient rhythm-based audio alignment algorithms, but also bridges computational linguistics with music and music cognition. Our experimental results show an average matching probability of 0.81; in around 30% of the songs, keywords land on strong beats with a probability of 0.9 or higher, including 12% of the songs with a perfect landing. We also use similarity metrics to evaluate the correlation between lyrics and rhythm, and they show that nearly 50% of the songs have a similarity of 0.70 or higher. In conclusion, our approach contributes to the study of the lyrics-rhythm relationship by computationally revealing these correlations.
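To make the reported "probability of keywords landing on strong beats" concrete, the following is a minimal sketch of one way such a score could be computed for a single song. It is not the method of the paper: the function name `match_probability`, the use of onset times in seconds, and the `tolerance` window are all assumptions introduced here for illustration, and the abstract does not specify how a "landing" is defined.

```python
from bisect import bisect_left


def match_probability(keyword_onsets, strong_beats, tolerance=0.1):
    """Fraction of keyword onsets within `tolerance` seconds of a strong beat.

    Hypothetical helper: the inputs (times in seconds) and the tolerance
    window are illustrative assumptions, not taken from the paper.
    """
    if not keyword_onsets:
        return 0.0
    beats = sorted(strong_beats)
    hits = 0
    for t in keyword_onsets:
        i = bisect_left(beats, t)
        # The nearest strong beat is either just before or just after the onset.
        neighbours = beats[max(i - 1, 0): i + 1]
        if any(abs(t - b) <= tolerance for b in neighbours):
            hits += 1
    return hits / len(keyword_onsets)


# Toy data: strong beats every 2 seconds, five keyword onsets.
strong_beats = [0.0, 2.0, 4.0, 6.0]
keyword_onsets = [0.05, 1.0, 2.02, 3.95, 5.0]
score = match_probability(keyword_onsets, strong_beats)
print(f"{score:.2f}")
```

In this toy example three of the five keywords fall within the window, so the score is 0.60. Averaging such per-song scores over a collection would give a figure comparable in kind to the 0.81 average reported above, again assuming a definition along these lines.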