We apply deep learning (DL) to magnetic resonance spectroscopy (MRS) data for the task of brain tumor detection. Medical applications often suffer from data scarcity and corruption by noise, and both problems are prominent in our data set. Furthermore, the number of available spectra varies from patient to patient. We address these issues by treating the task as a multiple instance learning (MIL) problem. Specifically, we aggregate multiple spectra from the same patient into a "bag" for classification and apply data augmentation techniques. To make the bag representation invariant to the order of its spectra (permutation invariance), we propose two approaches: (1) applying min-, max-, and average-pooling to the features of all samples in one bag and (2) applying an attention mechanism. We test these two approaches on multiple neural network architectures. We demonstrate that classification performance is significantly improved when training on multiple instances rather than single spectra. We also propose a simple oversampling data augmentation method and show that it can further improve performance. Finally, we demonstrate that our proposed model outperforms manual classification by neuroradiologists according to most performance metrics.
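The two bag-aggregation strategies can be illustrated with a minimal NumPy sketch. It is not the implementation used in this work: the function names `pool_bag` and `attention_pool_bag`, the feature dimensions, and the tanh-based attention scoring with parameters `V` and `w` are assumptions chosen for illustration, and the random arrays stand in for per-spectrum features produced by a network.

```python
import numpy as np


def pool_bag(features):
    """Concatenate min-, max-, and mean-pooled features over a bag.

    features: array of shape (n_instances, n_features), one row per spectrum.
    Returns a vector of length 3 * n_features.
    """
    return np.concatenate(
        [features.min(axis=0), features.max(axis=0), features.mean(axis=0)]
    )


def attention_pool_bag(features, V, w):
    """Attention-weighted average of instance features (assumed scoring form).

    V: (n_features, n_hidden) and w: (n_hidden,) are hypothetical attention
    parameters; in practice they would be learned with the rest of the model.
    """
    scores = np.tanh(features @ V) @ w        # one score per instance
    scores = scores - scores.max()            # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ features                 # shape (n_features,)


rng = np.random.default_rng(0)
bag = rng.normal(size=(5, 8))                 # 5 spectra, 8 features each
V = rng.normal(size=(8, 4))
w = rng.normal(size=4)

# Reordering the spectra in the bag must not change the bag representation.
shuffled = bag[rng.permutation(len(bag))]
pooled = pool_bag(bag)
attended = attention_pool_bag(bag, V, w)
assert np.allclose(pooled, pool_bag(shuffled))
assert np.allclose(attended, attention_pool_bag(shuffled, V, w))
```

Both functions accept bags of any size, which matches the setting where the number of spectra differs between patients: the output length depends only on the feature dimension, not on the number of instances.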