A Study of Gender Impact in Self-supervised Models for Speech-to-Text Systems

Self-supervised models have recently emerged as popular building blocks in speech processing pipelines. These models are pre-trained on unlabeled audio data and then used in downstream speech processing tasks such as automatic speech recognition (ASR) or speech translation (ST). Since these models are now used in research and industrial systems alike, it becomes necessary to understand how properties of the pre-training data, such as its gender distribution, affect them. Using French as our language of investigation, we train gender-specific wav2vec 2.0 models and compare them against models pre-trained on data with different degrees of gender balance. The comparison is performed by applying these models to two speech-to-text downstream tasks: ASR and ST. Results show that the type of downstream integration matters. We observe lower overall performance when gender-specific pre-training is used before fine-tuning an end-to-end ASR system. However, when self-supervised models are used as feature extractors, the overall ASR and ST results follow more complex patterns in which the balanced pre-trained model does not necessarily lead to the best results. Lastly, our crude 'fairness' metric, the relative performance difference measured between female and male test sets, does not vary strongly between balanced and gender-specific pre-trained wav2vec 2.0 models.
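
The 'fairness' metric mentioned in the abstract can be illustrated with a small sketch. The abstract only describes it as a relative performance difference between female and male test sets, so the exact formula below (the difference divided by the mean of the two scores) is an assumption made for illustration, as are the function name `relative_gap` and the example scores, which are not results from the paper.

```python
def relative_gap(female_score: float, male_score: float) -> float:
    """Relative difference between two group scores.

    Assumption: the difference is normalised by the mean of both scores.
    The abstract does not state the normalisation, so this is only one
    plausible reading of "relative performance difference".
    """
    mean = (female_score + male_score) / 2.0
    if mean == 0:
        # Both scores are zero: no gap to report.
        return 0.0
    return (female_score - male_score) / mean


# Hypothetical word error rates (in %), NOT taken from the paper.
gap = relative_gap(female_score=12.0, male_score=10.0)
print(f"{gap:.3f}")  # prints 0.182
```

With an error-based score such as WER, a positive value under this convention would mean the female test set is recognised less accurately than the male one, and a value near zero would mean the two sets are treated similarly.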

