Multilingual Audio-Visual Smartphone Dataset And Evaluation

Smartphones have been equipped with biometric verification systems to provide security in highly sensitive applications. Audio-visual biometrics are gaining popularity because they are usable and, owing to their multi-modal nature, difficult to spoof. In this work, we present an audio-visual smartphone dataset captured with five different recent smartphones. The new dataset contains 103 subjects captured in three different sessions that reflect different real-world scenarios. Three different languages are recorded in the dataset to address the problem of language dependency in speaker recognition systems. These unique characteristics will pave the way for implementing novel state-of-the-art unimodal or audio-visual speaker recognition systems. We also report the performance of benchmark biometric verification systems on our dataset. Through extensive experiments, we evaluate the robustness of the biometric algorithms to multiple dependencies such as signal noise, device, and language, and to presentation attacks such as replay and synthesized signals. The results raise many concerns about the generalization properties of state-of-the-art biometric methods on smartphones.
