As a decentralized training approach, federated learning enables multiple organizations to jointly train a model without exposing their private data. This work investigates vertical federated learning (VFL), which addresses scenarios where collaborating organizations share the same set of users but hold different features, and only one party holds the labels. Although VFL performs well, practitioners often face uncertainty when preparing non-transparent internal and external features and samples for the VFL training phase. Moreover, to balance prediction accuracy against the resource consumption of model inference, practitioners need to know which subset of prediction instances genuinely requires invoking the VFL model. To this end, we co-design the VFL modeling process by proposing an interactive, real-time visualization system, VFLens, to help practitioners with feature engineering, sample selection, and inference. A usage scenario, a quantitative experiment, and expert feedback suggest that VFLens helps practitioners boost VFL efficiency at a lower cost with sufficient confidence.