Synthetic Audio Forensics Evaluation (SAFE) Challenge

The increasing realism of synthetic speech generated by advanced text-to-speech (TTS) models, coupled with post-processing and laundering techniques, presents a significant challenge for audio forensic detection. In this paper, we introduce the SAFE (Synthetic Audio Forensics Evaluation) Challenge, a fully blind evaluation framework designed to benchmark detection models across progressively harder scenarios: raw synthetic speech, processed audio (e.g., compressed or resampled), and laundered audio intended to evade forensic analysis. The challenge data comprised 90 hours of audio in 21,000 samples, drawn from 21 real sources and 17 TTS models and organized into 3 tasks. We describe the challenge, its evaluation design and tasks, and the dataset, and we report initial insights into the strengths and limitations of current approaches, offering a foundation for advancing synthetic audio detection research. More information is available at \href{https://stresearch.github.io/SAFE/}{https://stresearch.github.io/SAFE/}.
