Pre-training is essential to deep learning model performance, especially in medical image analysis tasks where limited training data are available. However, existing pre-training methods are inflexible: the pre-trained weights of one model cannot be reused by other network architectures. In this paper, we propose an architecture-irrelevant hyper-initializer, which can initialize any given network architecture well after being pre-trained only once. The proposed initializer is a hypernetwork that takes a downstream architecture as an input graph and outputs the initialization parameters for that architecture. We show the effectiveness and efficiency of the hyper-initializer through extensive experiments on multiple medical imaging modalities, especially in data-limited scenarios. Moreover, we demonstrate that the proposed algorithm can be reused as a favorable plug-and-play initializer for any downstream architecture and task (both classification and segmentation) of the same modality.
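To make the core idea concrete, below is a minimal, hypothetical sketch (not the authors' implementation) of a hypernetwork that reads a downstream architecture as a graph of layer nodes and predicts an initialization tensor for each node. The node features, the single message-passing step, and the weight decoder are illustrative assumptions.

```python
import torch
import torch.nn as nn


class HyperInitializer(nn.Module):
    """Hypothetical sketch: predict per-layer initialization from an architecture graph."""

    def __init__(self, node_feat_dim=8, hidden_dim=64, max_fan_in=512):
        super().__init__()
        self.encode = nn.Linear(node_feat_dim, hidden_dim)   # embed layer descriptors
        self.message = nn.Linear(hidden_dim, hidden_dim)      # one graph message-passing step
        self.decode = nn.Linear(hidden_dim, max_fan_in)       # emit a weight row, cropped per layer

    def forward(self, node_feats, adjacency, target_shapes):
        # node_feats: (num_layers, node_feat_dim) describing each layer (type, kernel size, channels, ...)
        # adjacency:  (num_layers, num_layers) connectivity of the architecture graph
        h = torch.relu(self.encode(node_feats))
        h = torch.relu(self.message(adjacency @ h))            # aggregate neighboring layer embeddings
        weights = []
        for emb, (fan_out, fan_in) in zip(h, target_shapes):
            row = self.decode(emb)[:fan_in]                    # crop to the layer's fan-in
            weights.append(row.unsqueeze(0).expand(fan_out, fan_in).contiguous())
        return weights                                          # one initialization tensor per layer


# Usage: initialize a small downstream MLP with the predicted tensors.
hyper = HyperInitializer()
shapes = [(32, 16), (10, 32)]
feats = torch.randn(2, 8)                                      # assumed layer descriptors
adj = torch.eye(2)
init_weights = hyper(feats, adj, shapes)
downstream = nn.Sequential(nn.Linear(16, 32), nn.Linear(32, 10))
with torch.no_grad():
    for layer, w in zip(downstream, init_weights):
        layer.weight.copy_(w)
```

Because the hypernetwork conditions only on the architecture graph, the same pre-trained initializer can be applied to different downstream architectures without re-running pre-training, which is the plug-and-play property claimed above.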