Small-scale motion detection with privacy-preserving sensors has recently attracted growing research interest in remote sensing for speech recognition. These new modalities are used to enhance and restore speech information from speakers across multiple types of data. In this paper, we present a dataset containing 7.5 GHz channel impulse response (CIR) data from ultra-wideband (UWB) radars, 77 GHz frequency-modulated continuous-wave (FMCW) data from a millimetre-wave (mmWave) radar, and laser data. In addition, a depth camera is used to record the subject's lip landmarks and voice. Approximately 400 minutes of annotated speech profiles are provided, collected from 20 participants speaking 5 vowels, 15 words and 16 sentences. The dataset has been validated and has potential for research on lip reading and multimodal speech recognition.