This paper presents our efforts to build a robust ASR model for the shared task Automatic Speech Recognition for spontaneous and prepared speech & Speech Emotion Recognition in Portuguese (SE&R 2022). The goal of the challenge is to advance ASR research for the Portuguese language, considering prepared and spontaneous speech in different dialects. Our method consists of fine-tuning an ASR model in a domain-specific manner, applying gain normalization and selective noise insertion. The proposed method outperformed the provided strong baseline on the test set in 3 of the 4 available tracks.
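The two preprocessing steps named above, gain normalization and selective noise insertion, can be illustrated with a minimal sketch. The abstract does not specify how either step is implemented, so everything below is an assumption made for illustration: RMS-based normalization, the target level, the SNR value, the insertion probability, the `is_clean` flag used as the selection rule, and all function names are hypothetical and are not taken from the described system.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a list of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_gain(samples, target_rms=0.1, peak_limit=1.0):
    """Scale samples toward target_rms without letting the peak exceed peak_limit."""
    level = rms(samples)
    if level == 0.0:
        return list(samples)  # silence: nothing to scale
    gain = target_rms / level
    peak = max(abs(s) for s in samples)
    gain = min(gain, peak_limit / peak)  # avoid clipping
    return [s * gain for s in samples]


def add_noise_at_snr(samples, noise, snr_db):
    """Mix noise into samples at the requested signal-to-noise ratio (dB)."""
    sig, noi = rms(samples), rms(noise)
    if sig == 0.0 or noi == 0.0:
        return list(samples)
    scale = sig / (noi * 10 ** (snr_db / 20.0))
    # The noise clip is looped if it is shorter than the signal.
    return [s + scale * noise[i % len(noise)] for i, s in enumerate(samples)]


def augment(samples, noise, is_clean, snr_db=15.0, prob=0.5, rng=random):
    """Normalize gain, then add noise only to clips flagged as clean.

    The is_clean flag is a hypothetical selection rule, assumed here for
    illustration; the real criterion is not given in the abstract.
    """
    out = normalize_gain(samples)
    if is_clean and rng.random() < prob:
        out = add_noise_at_snr(out, noise, snr_db)
    return out
```

In this sketch, normalization runs first so that the SNR of any inserted noise is measured against a consistent signal level, and clips that are not flagged as clean pass through with gain normalization only.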