The Guzheng is a traditional Chinese instrument with diverse playing techniques. Instrument playing techniques (IPTs) play an important role in musical performance. However, most existing approaches to IPT detection are inefficient on variable-length audio and offer no assurance of generalization, because they rely on a single sound bank for both training and testing. In this study, we propose an end-to-end Guzheng playing technique detection system based on Fully Convolutional Networks that can be applied to variable-length audio. Because each Guzheng playing technique is applied to a note, a dedicated onset detector is trained to segment the audio into notes, and its predictions are fused with the frame-wise IPT predictions. During fusion, we sum the frame-wise IPT predictions within each note and take the IPT with the highest summed probability as the final output for that note. We create a new dataset, GZ_IsoTech, from multiple sound banks and real-world recordings for Guzheng performance analysis. Our approach achieves 87.97% frame-level accuracy and 80.76% note-level F1-score, outperforming existing works by a large margin, which indicates the effectiveness of the proposed method for IPT detection.
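The fusion step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `fuse_note_level`, the array layout (frames by IPT classes), and the rule that a note lasts from one onset to the next (or to the end of the audio) are assumptions made for illustration, and frames before the first onset are simply ignored here.

```python
import numpy as np

def fuse_note_level(ipt_probs, onset_frames):
    """Fuse frame-wise IPT probabilities with detected onsets (illustrative sketch).

    ipt_probs:    array of shape (n_frames, n_classes), one probability row per frame.
    onset_frames: frame indices at which notes begin (hypothetical detector output).
    Returns a list of (start_frame, end_frame, class_index), end_frame exclusive.
    """
    ipt_probs = np.asarray(ipt_probs, dtype=float)
    n_frames = ipt_probs.shape[0]

    # Assumption: each note spans from its onset to the next onset or the end of the audio.
    starts = sorted({int(f) for f in onset_frames if 0 <= int(f) < n_frames})
    bounds = starts + [n_frames]

    notes = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        # Sum the frame-wise predictions inside the note, then pick the top class.
        summed = ipt_probs[start:end].sum(axis=0)
        notes.append((start, end, int(np.argmax(summed))))
    return notes


# Toy example: 5 frames, 2 technique classes, onsets at frames 0 and 2.
probs = np.array([
    [0.9, 0.1],
    [0.6, 0.4],
    [0.2, 0.8],
    [0.3, 0.7],
    [0.4, 0.6],
])
print(fuse_note_level(probs, [0, 2]))  # [(0, 2, 0), (2, 5, 1)]
```

Summing over the note rather than taking a per-frame vote lets confident frames outweigh uncertain ones, which matches the description of adding predictions frame by frame within each note.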