Federated learning, which shares the weights of a neural network across clients, is gaining attention in the healthcare sector because it enables training on a large corpus of decentralized data while maintaining data privacy. For example, it enables neural network training for COVID-19 diagnosis on chest X-ray (CXR) images without centralizing patient CXR data from multiple hospitals. Unfortunately, the exchange of weights quickly consumes the network bandwidth if a highly expressive network architecture is employed. So-called split learning partially solves this problem by dividing a neural network into a client part and a server part, so that the client side requires far less computation and bandwidth. However, it is not clear how to find the optimal split without sacrificing overall network performance. To amalgamate these methods and thereby maximize their distinct strengths, here we show that the Vision Transformer, a recently developed deep learning architecture with a straightforwardly decomposable configuration, is ideally suited for split learning without sacrificing performance. Even under a non-independent and identically distributed (non-IID) data distribution, which emulates a realistic collaboration between hospitals using CXR datasets from multiple sources, the proposed framework attained performance comparable to data-centralized training. In addition, with heterogeneous multi-task clients, the proposed framework also improves individual task performance, including the diagnosis of COVID-19, eliminating the need to share large weights with innumerable parameters. Our results affirm the suitability of the Transformer for collaborative learning in medical imaging and pave the way for future real-world implementations.
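The split-learning idea above, dividing one network into a shallow client part and a deep server part so that clients exchange intermediate features rather than full model weights, can be illustrated with a minimal sketch. This is not the paper's implementation: the block function, layer count, and split point below are hypothetical stand-ins chosen only to show that a split forward pass is numerically identical to the monolithic one.

```python
import numpy as np

rng = np.random.default_rng(0)

def block(x, w):
    """Stand-in for one Transformer encoder block: a linear map plus ReLU."""
    return np.maximum(x @ w, 0.0)

# Hypothetical 4-block network; dimensions are illustrative only.
dim = 8
weights = [rng.standard_normal((dim, dim)) * 0.1 for _ in range(4)]
split_at = 1  # the client keeps only the first block; the server holds the rest

def client_forward(x):
    # The client runs its shallow part and transmits the resulting
    # features, not the full set of model weights, saving bandwidth.
    for w in weights[:split_at]:
        x = block(x, w)
    return x

def server_forward(features):
    # The server completes the forward pass on the received features.
    for w in weights[split_at:]:
        features = block(features, w)
    return features

x = rng.standard_normal((2, dim))          # a toy batch of two "images"
full = x
for w in weights:
    full = block(full, w)                  # monolithic forward pass
split = server_forward(client_forward(x))  # client/server split forward pass
assert np.allclose(full, split)            # the split changes nothing numerically
```

Because the output is identical wherever the network is cut, the practical question, as the abstract notes, is choosing a split point that keeps the client's compute and the transmitted feature size small without hurting trainability.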