Deep learning systems are typically designed to perform well across a wide range of test inputs. For example, deep learning systems in autonomous cars are expected to handle traffic situations for which they were not specifically trained. In general, the ability to cope with a broad spectrum of unseen test inputs is called generalization. Generalization is important in applications where the possible test inputs are either known but plentiful or simply unknown. However, there are also cases where the possible inputs are few and unlabeled but known beforehand. For example, medicine is currently interested in targeting treatments to individual patients: the number of patients at any given time is usually small (typically one), and their diagnoses, responses, and so forth are still unknown, but their general characteristics (such as genome information and protein levels in the blood) are known before the treatment. We propose to call deep learning in such applications targeted deep learning. In this paper, we introduce a framework for targeted deep learning, and we devise and test an approach for adapting standard pipelines to its requirements. The approach is general yet easy to use: it can be implemented as a simple data-preprocessing step. We demonstrate on a variety of real-world data that our approach can make standard deep learning faster and more accurate when the test inputs are known beforehand.