We propose an industry-level deep learning approach for the speech emotion recognition task. In industry, carefully designed deep transfer learning delivers practical results because training data is usually scarce, machine training is costly, and models must be specialized for dedicated AI tasks. The proposed speech emotion recognition framework, called DeepEMO, consists of two main pipelines: preprocessing, which extracts efficient main features, and a deep transfer learning model, which is trained to recognize emotions. The main source code is available in the https://github.com/enkhtogtokh/deepemo repository.
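The two-stage structure described in the abstract (feature preprocessing followed by a transfer learning model) can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration and is not taken from the DeepEMO repository: the features (log-energy and zero-crossing rate) stand in for whatever features the paper actually extracts, `BACKBONE_W` is a hypothetical set of "pretrained" weights, the logistic head stands in for the fine-tuned layers of a real deep network, and the sine-wave "recordings" are synthetic.

```python
import math

def extract_features(signal, frame_len=200):
    """Preprocessing stage (stand-in): mean log-energy and mean
    zero-crossing rate over non-overlapping frames."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    log_energy, zcr = 0.0, 0.0
    for frame in frames:
        energy = sum(x * x for x in frame) / frame_len
        log_energy += math.log(energy + 1e-10)
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        zcr += crossings / (frame_len - 1)
    return [log_energy / len(frames), zcr / len(frames)]

# Hypothetical "pretrained" weights. They are never updated below, which is
# what makes this a transfer learning setup: only the small head is trained.
BACKBONE_W = [[0.5, 2.0], [-0.3, 4.0], [0.1, -1.0]]

def frozen_backbone(features):
    """Frozen feature extractor: fixed linear map followed by tanh."""
    return [math.tanh(sum(w * f for w, f in zip(row, features)))
            for row in BACKBONE_W]

def train_head(embeddings, labels, epochs=500, lr=0.5):
    """Train a logistic-regression head on frozen embeddings with SGD."""
    weights, bias = [0.0] * len(embeddings[0]), 0.0
    for _ in range(epochs):
        for emb, y in zip(embeddings, labels):
            z = bias + sum(w * e for w, e in zip(weights, emb))
            grad = 1.0 / (1.0 + math.exp(-z)) - y  # dLoss/dz for log loss
            weights = [w - lr * grad * e for w, e in zip(weights, emb)]
            bias -= lr * grad
    return weights, bias

def predict(signal, head):
    """Full pipeline: preprocess, embed with the frozen backbone, classify."""
    weights, bias = head
    emb = frozen_backbone(extract_features(signal))
    return 1 if bias + sum(w * e for w, e in zip(weights, emb)) > 0 else 0

def sine(freq_hz, amplitude, n=1600, sample_rate=8000):
    """Synthetic stand-in for a speech recording."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n)]

# Toy labels: 0 = "calm" (quiet, low pitch), 1 = "excited" (loud, high pitch).
train_signals = [sine(100, 0.1), sine(120, 0.15), sine(400, 0.8), sine(450, 0.9)]
train_labels = [0, 0, 1, 1]
head = train_head(
    [frozen_backbone(extract_features(s)) for s in train_signals], train_labels
)
print(predict(sine(110, 0.12), head), predict(sine(420, 0.85), head))
```

Training only the small head on top of a frozen backbone is what makes this kind of setup attractive when labeled data is scarce and training cost matters, the two constraints the abstract names. A real system would replace the toy backbone with a network pretrained on a large audio or image corpus.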