Audio analysis for forensic speaker verification poses unique challenges for system performance, in part because the data are collected in naturalistic field acoustic environments where uncertainty about location and scenario is common. Forensic speech data that may serve as evidence can be recorded in arbitrary naturalistic environments, resulting in variable data quality. Some speech samples include variability due to vocal effort, such as yelling during 911 emergency calls, whereas others may contain whispered or situationally stressed speech recorded in a field location or interview room. Such speech variability stems from both intrinsic and extrinsic characteristics and makes forensic speaker verification a complicated and daunting task. Extrinsic properties include recording equipment (such as microphone type and placement), ambient noise, room configuration including reverberation, and other scenario-dependent environmental factors. Some factors, such as noise and non-target speech, affect verification system performance by their mere presence. To investigate the impact of field acoustic environments, we performed a speaker verification study based on the CRSS-Forensic corpus, with audio collected from 8 field locations including police interviews. This investigation includes an analysis of the impact of seven unseen acoustic environments on speaker verification system performance using an x-Vector system.