The increasing volume of electronic health records (EHRs) across healthcare institutions presents an opportunity to enhance model accuracy and robustness in clinical prediction tasks. Federated learning enables training on data from multiple institutions while preserving patient privacy and complying with regulatory constraints. However, most federated learning research focuses on constructing a global model for multiple clients, overlooking the practical need for institution-specific models. In this work, we introduce EHRFL, a federated learning framework using EHRs designed to develop a model tailored to a single healthcare institution. Our framework addresses two key challenges: (1) enabling federated learning across institutions with heterogeneous EHR systems using text-based EHR modeling, and (2) reducing the costs associated with federated learning by selecting suitable participating clients using averaged patient embeddings, which allows the number of participants to be optimized without compromising model performance for the institution. Our experimental results on multiple open-source EHR datasets demonstrate the effectiveness of EHRFL in addressing these two challenges, establishing it as a practical solution for institution-specific model development in federated learning.