In the context of the current global pandemic and the limitations of the RT-PCR test, we propose a novel deep learning architecture, DFCN (Denoising Fully Connected Network). Since medical facilities around the world differ enormously in which laboratory tests or chest imaging may be available, DFCN is designed to be robust to missing input data. An ablation study extensively evaluates the performance benefits of DFCN as well as its robustness to missing inputs. Data from 1088 patients with confirmed RT-PCR results are obtained from two independent medical facilities. The data include results from 27 laboratory tests and a chest X-ray scored by a deep learning model. Training and test datasets are taken from different medical facilities. The data are made publicly available. The performance of DFCN in predicting the RT-PCR result is compared with three related architectures as well as a Random Forest baseline. All models are trained with varying levels of masked input data to encourage robustness to missing inputs. Missing data is simulated at test time by masking inputs randomly. DFCN outperforms all other models with statistical significance when using random subsets of input data with 2 to 27 available inputs. When all 28 inputs are available, DFCN obtains an AUC of 0.924, higher than any other model. Furthermore, with clinically meaningful subsets of parameters consisting of just 6 and 7 inputs respectively, DFCN achieves higher AUCs than any other model, with values of 0.909 and 0.919.
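The random masking used to simulate missing inputs can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mask_random_inputs`, the choice of `0.0` as the fill value for masked inputs, and the returned availability mask are all assumptions made for illustration, and the input row is synthetic rather than patient data.

```python
import random


def mask_random_inputs(row, n_available, rng, fill_value=0.0):
    """Keep `n_available` randomly chosen inputs and mask the rest.

    Hypothetical helper: the fill value and the returned boolean mask
    are illustrative assumptions, not details taken from the paper.
    """
    n_features = len(row)
    if not 0 <= n_available <= n_features:
        raise ValueError("n_available must be between 0 and the number of inputs")
    # Choose which input positions remain available.
    kept = set(rng.sample(range(n_features), n_available))
    mask = [i in kept for i in range(n_features)]
    # Replace every unavailable input with the fill value.
    masked = [value if keep else fill_value for value, keep in zip(row, mask)]
    return masked, mask


# Synthetic stand-in for one patient: 27 laboratory tests plus 1 X-ray score.
rng = random.Random(0)
row = [rng.gauss(0.0, 1.0) for _ in range(28)]
masked, mask = mask_random_inputs(row, n_available=6, rng=rng)
```

Calling the helper with different values of `n_available` (for example, every value from 2 to 27) would reproduce the kind of random-subset evaluation described above, under the stated assumptions.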