Vertical federated learning (VFL) is considered, where an active party, having access to the true class labels, wishes to build a classification model by utilizing more features from a passive party, which has no access to the labels, in order to improve model accuracy. In the prediction phase, with logistic regression as the classification model, several inference attack techniques are proposed that the adversary, i.e., the active party, can employ to reconstruct the passive party's features, which are regarded as sensitive information. These attacks, mainly based on a classical notion of the center of a set, namely the Chebyshev center, are shown to be superior to those proposed in the literature. Moreover, several theoretical performance guarantees are provided for these attacks. Subsequently, we consider the minimum amount of information that the adversary needs to fully reconstruct the passive party's features. In particular, it is shown that when the passive party holds a single feature and the adversary knows only the signs of the parameters involved, the adversary can perfectly reconstruct that feature once the number of predictions is sufficiently large. Next, as a defense mechanism, two privacy-preserving schemes are proposed that degrade the adversary's reconstruction attacks while preserving the full benefits that VFL brings to the active party. Finally, experimental results demonstrate the effectiveness of both the proposed attacks and the privacy-preserving schemes.
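As a point of reference for the attacks mentioned above, the Chebyshev center of a set bounded by linear constraints (the center of the largest inscribed ball) can be computed with a linear program. The sketch below is a minimal, generic illustration of that classical computation, not the paper's attack; the polytope `{x : A x <= b}` and the function name `chebyshev_center` are illustrative assumptions.

```python
# Minimal sketch: Chebyshev center of a polytope {x : A x <= b} via an LP.
# This illustrates only the classical geometric notion the attacks build on;
# the feasible set, names, and data here are hypothetical examples.
import numpy as np
from scipy.optimize import linprog

def chebyshev_center(A, b):
    """Return (center, radius) of the largest ball inside {x : A x <= b}."""
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    n = A.shape[1]
    # Decision variables: [x (n entries), r]; maximize r, i.e., minimize -r.
    c = np.zeros(n + 1)
    c[-1] = -1.0
    # A ball of radius r centered at x lies inside the half-space
    # a_i^T y <= b_i iff a_i^T x + ||a_i|| r <= b_i.
    A_ub = np.hstack([A, norms])
    res = linprog(c, A_ub=A_ub, b_ub=b,
                  bounds=[(None, None)] * n + [(0, None)])
    return res.x[:n], res.x[-1]

# Example: the unit square has Chebyshev center (0.5, 0.5) and radius 0.5.
A = np.array([[-1.0, 0.0], [1.0, 0.0], [0.0, -1.0], [0.0, 1.0]])
b = np.array([0.0, 1.0, 0.0, 1.0])
center, radius = chebyshev_center(A, b)
```

The Chebyshev center is a natural reconstruction estimate because it minimizes the worst-case distance to any point of the feasible set, which is the sense in which the abstract's attacks use it.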