It has been established that Speech Affect Recognition for low-resource languages is a difficult task. Here we present a transfer-learning-based Speech Affect Recognition approach in which we pre-train a Deep Residual Network on an affect recognition task for a high-resource language and then fine-tune its parameters for a low-resource language. We use four standard datasets to demonstrate that transfer learning can address the data-scarcity problem in affect recognition. Our approach is effective, achieving 74.7 percent Unweighted Average Recall (UAR) with RAVDESS as the source and the Urdu dataset as the target. Through an ablation study, we find that the pre-trained model contributes most of the feature information, improves the results, and mitigates the shortage of training data. Using this knowledge, we also experiment with the SAVEE and EMO-DB datasets as sources, keeping Urdu as the target language, for which only 400 utterances are available. The approach achieves a higher UAR than existing algorithms.
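To make the recipe concrete, below is a minimal sketch (assumed PyTorch, not the authors' exact code) of the described transfer-learning setup: a small residual network is pre-trained on a high-resource source corpus such as RAVDESS, then its feature extractor is reused and only the classification head is re-trained on the small low-resource target set; the checkpoint path, layer sizes, and the 4-class target assumption are illustrative.

```python
# Sketch of the transfer-learning recipe described above (assumptions noted in comments).
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """A basic residual block over spectrogram feature maps."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + x)  # skip connection

class AffectResNet(nn.Module):
    """Small residual network ending in an affect-class classifier."""
    def __init__(self, num_classes):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU())
        self.blocks = nn.Sequential(ResidualBlock(32), ResidualBlock(32))
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        x = self.pool(self.blocks(self.stem(x))).flatten(1)
        return self.classifier(x)

# 1) Pre-train on the high-resource source corpus (e.g. RAVDESS, 8 affect classes) -- omitted here.
model = AffectResNet(num_classes=8)
# model.load_state_dict(torch.load("ravdess_pretrained.pt"))  # hypothetical checkpoint path

# 2) Fine-tune on the low-resource target (e.g. Urdu): freeze the learned
#    feature extractor and replace only the classification head.
for p in model.parameters():
    p.requires_grad = False
model.classifier = nn.Linear(32, 4)  # assumes 4 affect classes in the target set
optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative fine-tuning step on a dummy spectrogram batch.
spectrograms = torch.randn(8, 1, 64, 128)   # (batch, channel, mel bins, frames)
labels = torch.randint(0, 4, (8,))
loss = criterion(model(spectrograms), labels)
loss.backward()
optimizer.step()
```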