Biometric authentication service providers often claim that it is not possible to reverse-engineer a user's raw biometric sample, such as a fingerprint or a face image, from its mathematical (feature-space) representation. In this paper, we investigate this claim for the specific case of deep neural network (DNN) embeddings. Inversion of DNN embeddings has previously been studied for explaining deep image representations or for synthesizing normalized images. Existing studies, however, assume full access to all layers of the original model, as well as all available information about the original dataset. For the biometric authentication use case, this question must instead be studied under adversarial settings, in which an attacker has access to a feature-space representation but no direct access to either the exact original dataset or the original learned model. Instead, we assume varying degrees of attacker background knowledge about the distribution of the dataset and about the original learned model (its architecture and training process). Under these settings, we show that the attacker can exploit off-the-shelf DNN models and public datasets to mimic the behaviour of the original learned model, with varying degrees of success, based only on the obtained representation and the attacker's prior knowledge. We propose a two-pronged attack that first infers the original DNN by exploiting its footprint on the embedding, and then reconstructs the raw data using the inferred model. We demonstrate the practicality of the attack on popular DNNs trained for two prominent biometric modalities: face and fingerprint recognition. The attack effectively infers the original recognition model (mean accuracy 83\% for faces, 86\% for fingerprints) and crafts effective biometric reconstructions that are successfully authenticated, with 1-vs-1 authentication accuracy of up to 92\% for some models.
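To make the two-pronged attack pipeline concrete, below is a minimal PyTorch sketch of the general idea, not the paper's actual implementation. The footprint feature used for model inference (embedding dimensionality plus norm statistics), the candidate-pool interface, the input shape, and the cosine-similarity inversion objective are all illustrative assumptions.

```python
# Illustrative sketch of a two-pronged embedding-inversion attack.
# All footprint features, shapes, and loss choices are assumptions,
# not the paper's exact method.
import torch
import torch.nn.functional as F

def infer_model(leaked_emb, candidates):
    """Prong 1 (illustrative): match the leaked embedding's 'footprint'
    (here just dimensionality and a norm statistic) against a pool of
    off-the-shelf candidate models evaluated on public probe data."""
    best, best_score = None, float("inf")
    for name, model, probe_data in candidates:
        with torch.no_grad():
            ref = model(probe_data)              # embeddings of public probe images
        if ref.shape[1] != leaked_emb.shape[0]:  # embedding dimensionality must match
            continue
        # crude footprint score: gap between mean embedding norms (placeholder feature)
        score = abs(ref.norm(dim=1).mean() - leaked_emb.norm()).item()
        if score < best_score:
            best, best_score = (name, model), score
    return best

def reconstruct(leaked_emb, model, shape=(1, 3, 112, 112), steps=500, lr=0.05):
    """Prong 2 (illustrative): gradient-based inversion -- optimize a raw
    input so the substitute model maps it close to the leaked embedding."""
    x = torch.rand(shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        emb = model(x)
        # cosine loss drives the reconstruction's embedding toward the target
        loss = 1 - F.cosine_similarity(emb, leaked_emb.unsqueeze(0)).mean()
        loss.backward()
        opt.step()
        x.data.clamp_(0, 1)                      # keep pixels in a valid range
    return x.detach()
```

Under this sketch, the reconstructed input would then be submitted to the authentication pipeline for the 1-vs-1 match that the abstract evaluates; the actual footprint features and reconstruction objective used in the paper may differ substantially.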