In this paper, we propose a privacy-preserving image classification method that combines encrypted images with the vision transformer (ViT). The proposed method allows images that carry no visual information to be used with ViT models for both training and testing, while also maintaining high classification accuracy. Because ViT applies patch embedding and position embedding to image patches, the architecture is shown to reduce the influence of block-wise image transformation. In an experiment, the proposed method is demonstrated to outperform state-of-the-art methods in terms of classification accuracy and robustness against various attacks.
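To make the interplay between block-wise transformation and patch embedding concrete, the following is a minimal NumPy sketch and not the implementation used in the paper. It assumes a hypothetical block-wise encryption made of keyed pixel shuffling inside each block plus keyed block scrambling, with the block size equal to the ViT patch size; the abstract does not specify the transformation, and the names `to_patches` and `encrypt_patches` are illustrative. Under these assumptions, a linear patch embedding with correspondingly permuted weight rows yields the same tokens as the plain image, only in a different order.

```python
import numpy as np


def to_patches(image, patch):
    """Split an (H, W, C) image into flattened patches of shape (N, patch*patch*C)."""
    h, w, c = image.shape
    grid = image.reshape(h // patch, patch, w // patch, patch, c)
    return grid.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * c)


def encrypt_patches(patches, key):
    """Hypothetical block-wise encryption (an assumption, not the paper's scheme).

    Applies one keyed permutation to the values inside every block
    (pixel shuffling) and one keyed permutation to the block order
    (block scrambling).
    """
    rng = np.random.default_rng(key)
    pixel_perm = rng.permutation(patches.shape[1])
    block_perm = rng.permutation(patches.shape[0])
    return patches[block_perm][:, pixel_perm], pixel_perm, block_perm


rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))          # toy image, not real data
patch, dim = 16, 8                       # block size == patch size (assumed)

plain = to_patches(image, patch)         # shape (4, 768)
encrypted, pixel_perm, block_perm = encrypt_patches(plain, key=1234)

# A linear patch embedding: tokens = patches @ weights.
weights = rng.standard_normal((plain.shape[1], dim))
plain_tokens = plain @ weights

# Permuting the weight rows with the same pixel permutation undoes the
# pixel shuffling; the remaining difference is only the token order.
encrypted_tokens = encrypted @ weights[pixel_perm]

print(np.allclose(encrypted_tokens, plain_tokens[block_perm]))
```

In this sketch the pixel shuffling is absorbed by the embedding weights, and the block scrambling only reorders the tokens, which is the kind of change a learned position embedding could in principle account for. Whether a trained model actually learns such weights is an empirical question that the sketch does not address.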