Vertical federated learning (VFL) is considered, where an active party, having access to the true class labels, wishes to build a classification model by utilizing additional features from a passive party, which has no access to the labels, in order to improve model accuracy. In the prediction phase, with logistic regression as the classification model, several inference attack techniques are proposed that the adversary, i.e., the active party, can employ to reconstruct the passive party's features, which are regarded as sensitive information. These attacks, mainly based on a classical notion of the center of a set, namely the Chebyshev center, are shown to be superior to those proposed in the literature. Moreover, several theoretical performance guarantees are provided for these attacks. Subsequently, we consider the minimum amount of information that the adversary needs to fully reconstruct the passive party's features. In particular, it is shown that when the passive party holds a single feature and the adversary knows only the signs of the parameters involved, it can perfectly reconstruct that feature once the number of predictions is sufficiently large. Next, as a defense mechanism, a privacy-preserving scheme is proposed that worsens the adversary's reconstruction attacks while preserving the full benefits that VFL brings to the active party. Finally, experimental results demonstrate the effectiveness of the proposed attacks and of the privacy-preserving scheme.
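To illustrate the kind of reconstruction the abstract refers to, the following is a minimal sketch of the single-feature case. All parameter names and values are hypothetical, and the sketch makes a stronger assumption than the paper's sign-only attack: here the adversary is assumed to know the exact model parameters, so a single prediction suffices to invert the logistic function and isolate the passive party's feature.

```python
import math

# Hypothetical setup: logistic regression over one active-party feature x_a
# (known to the adversary) and one passive-party feature x_p (secret).
# Parameter values below are illustrative, not taken from the paper.
w_a, w_p, b = 0.8, -1.5, 0.3   # model parameters (assumed known here)
x_a, x_p = 2.0, 0.7            # x_p is what the adversary tries to recover

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Prediction released to the active party during the prediction phase.
p = sigmoid(w_a * x_a + w_p * x_p + b)

# The adversary applies the logit (inverse sigmoid) and subtracts its own
# known contribution, isolating the passive party's term w_p * x_p.
logit = math.log(p / (1.0 - p))
recovered_x_p = (logit - w_a * x_a - b) / w_p

assert abs(recovered_x_p - x_p) < 1e-9
```

The paper's setting is harder: with only the signs of the parameters known, a single prediction yields only a feasible set for the feature, and the attack needs many predictions (and, for multiple features, a set-center estimate such as the Chebyshev center) to pin the value down.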