Residual Attention Based Network for Automatic Classification of Phonation Modes

Phonation mode is an essential characteristic of singing style and an important means of expression in performance. It is commonly classified into four categories: neutral, breathy, pressed, and flow. Previous studies relied on voice quality features and feature engineering for classification. While deep learning has made significant progress in other areas of music information retrieval (MIR), there have been few attempts to apply it to the classification of phonation modes. In this study, a Residual Attention based network is proposed for automatic classification of phonation modes. The network consists of a convolutional network that performs feature processing and a soft mask branch that enables the network to focus on a specific area. In comparison experiments, models using the proposed network achieve better results than previous work on three of the four datasets; the highest classification accuracy is 94.58%, 2.29% higher than the baseline.
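The abstract does not spell out how the soft mask branch is combined with the convolutional features. As a minimal sketch, assuming the network follows the usual residual attention formulation (output = (1 + M) * T, where T is the trunk feature map and M is a sigmoid mask in [0, 1]), the combination step could look like the following. The function names, the toy feature map, and the mask logits are all hypothetical and are not taken from the paper.

```python
import math


def sigmoid(z):
    """Squash a mask logit into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def residual_attention(trunk, mask_logits):
    """Combine trunk features with a soft mask: out = (1 + M) * T.

    trunk       -- 2-D list of feature values (assumed: output of the conv branch)
    mask_logits -- 2-D list of the same shape (assumed: raw output of the mask branch)
    The "+ 1" keeps the original features even where the mask is close to 0.
    """
    if len(trunk) != len(mask_logits):
        raise ValueError("trunk and mask must have the same number of rows")
    out = []
    for t_row, m_row in zip(trunk, mask_logits):
        if len(t_row) != len(m_row):
            raise ValueError("trunk and mask rows must have the same length")
        out.append([(1.0 + sigmoid(m)) * t for t, m in zip(t_row, m_row)])
    return out


# Toy 2x3 feature map standing in for processed spectrogram features.
trunk = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
# Large positive logit: region emphasized; large negative: left almost unchanged.
mask_logits = [[0.0, 10.0, -10.0], [0.0, 0.0, 0.0]]
attended = residual_attention(trunk, mask_logits)
```

With this formulation a mask value near 1 roughly doubles a feature, while a mask value near 0 passes it through unchanged, so the mask can emphasize a region without erasing the rest of the input.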

