Ensembling neural networks is a long-standing technique for reducing the generalization error of neural networks by combining networks with orthogonal properties via a committee decision. We show that this technique is an ideal fit for machine learning on medical data: First, ensembles are amenable to parallel and asynchronous learning, thus enabling efficient training of patient-specific component neural networks. Second, building on the idea of minimizing generalization error by selecting uncorrelated patient-specific networks, we show that one can build an ensemble of a few selected patient-specific models that outperforms a single model trained on much larger pooled datasets. Third, the non-iterative ensemble combination step is an optimal low-dimensional entry point for applying output perturbation to guarantee the privacy of the patient-specific networks. We exemplify our framework of differentially private ensembles on the task of early prediction of sepsis, using real-life intensive care unit data labeled by clinical experts.
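To make the third point concrete, the following is a minimal sketch of output perturbation at the ensemble combination step: the patient-specific members' scores are averaged, and calibrated Gaussian noise is added once to the single low-dimensional combined output rather than to each member model. The function name, interface, and the use of the Gaussian mechanism with an externally supplied sensitivity are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def dp_ensemble_predict(member_scores, epsilon, delta, sensitivity, rng=None):
    """Combine patient-specific model scores by averaging, then apply
    output perturbation via the Gaussian mechanism.

    member_scores : 1-D array, one risk score per patient-specific model
                    (hypothetical interface).
    sensitivity   : assumed L2 sensitivity of the average with respect to
                    replacing one member's training data.
    """
    rng = np.random.default_rng() if rng is None else rng
    avg = float(np.mean(member_scores))
    # Standard Gaussian-mechanism noise scale for (epsilon, delta)-DP.
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    # Noise is added once, to the combined output only.
    return avg + rng.normal(0.0, sigma)
```

Because the combination output is a scalar (or low-dimensional vector), the noise needed for a given privacy budget is far smaller than what perturbing each high-dimensional member network would require, which is the intuition behind using this step as the privacy entry point.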