For new participants - Executive summary: (1) The task is to develop a voice anonymization system for speech data that conceals the speaker's voice identity while preserving linguistic content, paralinguistic attributes, intelligibility and naturalness. (2) Training, development and evaluation datasets are provided, along with three baseline anonymization systems, evaluation scripts, and metrics. Participants apply their anonymization systems, run the evaluation scripts, and submit objective evaluation results and anonymized speech data to the organizers. (3) Results will be presented at a workshop held in conjunction with INTERSPEECH 2022, to which all participants are invited to present their challenge systems and to submit additional workshop papers.

For readers familiar with the VoicePrivacy Challenge - Changes w.r.t. 2020: (1) A stronger, semi-informed attack model in the form of an automatic speaker verification (ASV) system trained on anonymized (per-utterance) speech data. (2) Complementary metrics comprising the equal error rate (EER) as a privacy metric, the word error rate (WER) as the primary utility metric, and the pitch correlation and gain of voice distinctiveness as secondary utility metrics. (3) A new ranking policy based upon a set of minimum target privacy requirements.
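For readers new to the two headline metrics, the following is a minimal sketch of how EER and WER can be computed from raw scores and transcripts. It is an illustration only and not the challenge's evaluation scripts: the function names `equal_error_rate` and `word_error_rate` are hypothetical, the EER is approximated by a simple threshold sweep without interpolation, and the WER assumes whitespace-tokenized text. In the privacy setting, a higher EER for the attacker's ASV system indicates better anonymization, while a lower WER indicates better preserved linguistic content.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    target_scores: ASV scores for same-speaker trials.
    nontarget_scores: ASV scores for different-speaker trials.
    A trial is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False acceptance rate: different-speaker trials that are accepted.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        # False rejection rate: same-speaker trials that are rejected.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)
```

As a usage example, `equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.6, 0.4, 0.2, 0.1])` gives 0.25, since at a threshold of 0.6 one of four trials is misclassified on each side, and `word_error_rate("a b c d", "a x c")` gives 0.5 (one substitution and one deletion over four reference words).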