Vertical Federated Learning (VFL) enables multiple data owners, each holding a different subset of features about largely overlapping sets of data samples, to jointly train a useful global model. Feature selection (FS) is important to VFL, but it remains an open research problem: existing FS methods designed for VFL assume prior knowledge of either the number of noisy features or the post-training threshold for selecting useful features, which makes them unsuitable for practical applications. To bridge this gap, we propose the Federated Stochastic Dual-Gate based Feature Selection (FedSDG-FS) approach. It consists of a Gaussian stochastic dual-gate that efficiently approximates the probability of a feature being selected, with privacy protection provided by Partially Homomorphic Encryption without a trusted third party. To reduce overhead, we propose a feature importance initialization method based on Gini impurity, which requires only two parameter transmissions between the server and the clients. Extensive experiments on both synthetic and real-world datasets show that FedSDG-FS significantly outperforms existing approaches, both in selecting high-quality features accurately and in building global models with improved performance.
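To make the idea of a Gaussian stochastic gate concrete, the following is a minimal single-gate sketch, not the FedSDG-FS dual-gate itself. It assumes the common relaxation in which a gate value is a clipped Gaussian, z = clip(mu + sigma * eps, 0, 1) with eps ~ N(0, 1), so that the probability of the gate being open is Phi(mu / sigma). The function names, the value of sigma, and the 0.5 selection threshold are all illustrative assumptions; encryption and the federated protocol are omitted.

```python
import math
import random


def gate_sample(mu, sigma, rng):
    """Draw one relaxed gate value: clip(mu + sigma * eps, 0, 1), eps ~ N(0, 1)."""
    return min(1.0, max(0.0, mu + sigma * rng.gauss(0.0, 1.0)))


def gate_open_probability(mu, sigma):
    """P(mu + sigma * eps > 0) = Phi(mu / sigma), using the standard normal CDF."""
    return 0.5 * (1.0 + math.erf(mu / (sigma * math.sqrt(2.0))))


def select_features(mus, sigma, threshold=0.5):
    """Return indices whose gate-open probability exceeds the (assumed) threshold."""
    return [i for i, mu in enumerate(mus)
            if gate_open_probability(mu, sigma) > threshold]


if __name__ == "__main__":
    rng = random.Random(0)
    mus = [1.0, -1.0, 0.2]   # hypothetical learned gate parameters, one per feature
    sigma = 0.5              # assumed fixed noise scale
    print([round(gate_sample(mu, sigma, rng), 3) for mu in mus])
    print(select_features(mus, sigma))
```

In this sketch a feature is kept when its gate is more likely open than closed, so no prior count of noisy features is needed; in a real training loop the `mu` values would be learned jointly with the model.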