Reliably detecting diseases from relevant biological information is crucial for the real-world applicability of deep learning techniques in medical imaging. We debias deep learning models during training against unknown bias, without preprocessing or filtering the input beforehand and without assuming specific knowledge about the distribution or precise nature of the bias in the dataset. We use control regions as surrogates that carry information about the bias, employ the classifier model to extract features, and suppress biased intermediate features with our custom, modular DecorreLayer. We evaluate our method on a dataset of 952 lung computed tomography scans by introducing simulated biases with respect to reconstruction kernel and noise level, and we propose including an adversarial test set in evaluations of bias reduction techniques. In a moderately sized model architecture trained on data exhibiting a strong bias, the proposed method near-perfectly recovers the classification performance observed when training with the corresponding unbiased data.
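The idea of suppressing feature components that carry bias information can be illustrated with a minimal sketch. This is not the DecorreLayer itself, whose internals the abstract does not specify; it is a generic linear residualization, assuming a single scalar control feature per sample (for example, a noise estimate taken from a control region). The names `decorrelate`, `covariance`, and the synthetic data are hypothetical and exist only for illustration.

```python
import random
from statistics import fmean


def decorrelate(features, control):
    """Remove from each feature the part linearly predictable from `control`.

    features: list of per-sample feature vectors (list of lists of floats)
    control:  list of per-sample scalars from a control region (assumption)
    Returns centered features whose covariance with `control` is zero.
    """
    c_mean = fmean(control)
    c = [v - c_mean for v in control]
    c_var = sum(v * v for v in c)
    if c_var == 0.0:
        raise ValueError("control feature is constant; nothing to regress on")

    n_dims = len(features[0])
    cleaned = [[0.0] * n_dims for _ in features]
    for j in range(n_dims):
        col = [row[j] for row in features]
        col_mean = fmean(col)
        centered = [v - col_mean for v in col]
        # Least-squares slope of feature j on the control feature.
        slope = sum(a * b for a, b in zip(centered, c)) / c_var
        for i, (a, b) in enumerate(zip(centered, c)):
            cleaned[i][j] = a - slope * b  # keep only the residual
    return cleaned


def covariance(xs, ys):
    """Population covariance of two equally long sequences."""
    mx, my = fmean(xs), fmean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


# Synthetic data (assumption): every feature is contaminated by a shared bias.
random.seed(0)
bias = [random.gauss(0.0, 1.0) for _ in range(200)]
control = [b + random.gauss(0.0, 0.1) for b in bias]
features = [[random.gauss(0.0, 1.0) + 2.0 * b for _ in range(3)] for b in bias]
cleaned = decorrelate(features, control)
```

After the call, each column of `cleaned` is uncorrelated with `control` up to floating-point error, while the original columns are strongly correlated with it. A learned layer inside a network would act on intermediate activations per batch rather than on a fixed table, so this sketch only conveys the underlying linear-algebra idea.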