HPPNet: Modeling the Harmonic Structure and Pitch Invariance in Piano Transcription

While neural network models are making significant progress in piano transcription, they are also becoming more resource-intensive, requiring larger models and more computing power. In this paper, we apply more prior knowledge about the piano to reduce model size and improve transcription performance. The sound of a piano note contains various overtones, and the pitch of a key does not change over time. To make full use of this latent information, we propose HPPNet, which uses Harmonic Dilated Convolution to capture harmonic structure and a Frequency Grouped Recurrent Neural Network to model pitch invariance over time. Experimental results on the MAESTRO dataset show that our piano transcription system achieves state-of-the-art performance in both frame and note scores (frame F1 93.15%, note F1 97.18%). Moreover, the model is much smaller than previous state-of-the-art deep learning models.
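To make the idea of a harmonic dilated convolution more concrete, the sketch below computes where the overtones of a note fall on a log-frequency axis. It is an illustration only, not the authors' implementation: it assumes a constant-Q style spectrogram with a fixed number of bins per octave, and the function name `harmonic_bin_offsets` and the value of 48 bins per octave are hypothetical choices made for this example. On such an axis the k-th harmonic sits roughly `bins_per_octave * log2(k)` bins above the fundamental, whatever the pitch, so a convolution whose taps are spaced by these offsets could see a note's whole harmonic series at once.

```python
import math


def harmonic_bin_offsets(bins_per_octave, num_harmonics):
    """Return the bin offset of each harmonic above the fundamental.

    Assumes a log-frequency axis with `bins_per_octave` bins per octave
    (an assumption of this sketch, not a detail given in the abstract).
    Harmonic k has frequency k * f0, which is log2(k) octaves above f0.
    """
    return [
        round(bins_per_octave * math.log2(k))
        for k in range(1, num_harmonics + 1)
    ]


# Hypothetical setting: 48 bins per octave, first 8 harmonics.
offsets = harmonic_bin_offsets(bins_per_octave=48, num_harmonics=8)
print(offsets)  # [0, 48, 76, 96, 111, 124, 135, 144]
```

Because the offsets do not depend on the fundamental, the same spacing applies to every key, which is one way to read the pitch-invariance prior mentioned in the abstract.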

