With the development of machine learning techniques, research attention has shifted from single-modal learning to multi-modal learning, as real-world data exist in different modalities. Multi-modal models, however, often carry more information than single-modal models and are typically applied in sensitive scenarios such as medical report generation or disease identification. In contrast to existing membership inference attacks against machine learning classifiers, we focus on multi-modal models whose input and output are in different modalities, as in image captioning. This work studies the privacy leakage of multi-modal models through the lens of membership inference attacks, which determine whether a data record was involved in the model's training process. To this end, we propose Multi-modal Models Membership Inference (M^4I) with two attack methods for inferring membership status: metric-based (MB) M^4I and feature-based (FB) M^4I. MB M^4I uses similarity metrics to infer the membership of target data. FB M^4I uses a pre-trained shadow multi-modal feature extractor and infers membership by comparing the similarity between the features extracted from the input and from the output. Extensive experimental results show that both attack methods achieve strong performance: under unrestricted scenarios, MB M^4I and FB M^4I obtain average attack success rates of 72.5% and 94.83%, respectively. Moreover, we evaluate multiple defense mechanisms against our attacks. The source code of the M^4I attacks is publicly available at https://github.com/MultimodalMI/Multimodal-membership-inference.git.
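To make the two attack ideas concrete, the following is a minimal Python sketch, not the released M^4I implementation. It assumes an image-captioning target. For the metric-based idea, a simple unigram-overlap F1 stands in for whatever similarity metrics the attack actually uses, and the membership threshold of 0.5 is an arbitrary placeholder. For the feature-based idea, only the final similarity comparison is shown, on feature vectors that a shadow feature extractor is assumed to have produced already. All function names and parameters here are hypothetical.

```python
from collections import Counter
import math


def unigram_f1(candidate: str, reference: str) -> float:
    """Token-overlap F1 between two captions (a simple stand-in metric)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most as often
    # as it appears in both captions.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def metric_based_guess(generated: str, ground_truth: str,
                       threshold: float = 0.5) -> bool:
    """Guess 'member' when the model's caption is close to the ground truth.

    The threshold is a hypothetical value; in practice it would have to be
    calibrated, for example on data whose membership is known.
    """
    return unigram_f1(generated, ground_truth) >= threshold


def cosine_similarity(u, v) -> float:
    """Cosine similarity between two feature vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    generated = "a dog runs on grass"
    ground_truth = "a dog runs on the grass"
    print(unigram_f1(generated, ground_truth))
    print(metric_based_guess(generated, ground_truth))
    # Hypothetical image and caption features from a shadow extractor.
    print(cosine_similarity([0.2, 0.9, 0.1], [0.25, 0.8, 0.05]))
```

The intuition in both cases is that a model tends to reproduce training records more faithfully than unseen ones, so a higher similarity score is treated as evidence of membership.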