In this paper, we explore a novel knowledge-transfer task, termed Deep Model Reassembly (DeRy), for general-purpose model reuse. Given a collection of heterogeneous models pre-trained from distinct sources and with diverse architectures, the goal of DeRy, as its name implies, is to first dissect each model into distinctive building blocks, and then selectively reassemble the derived blocks to produce customized networks under both hardware resource and performance constraints. The ambitious nature of DeRy inevitably imposes significant challenges, including, in the first place, the feasibility of its solution. We strive to show that, through a dedicated paradigm proposed in this paper, DeRy can be made not only possible but also practically efficient. Specifically, we partition all pre-trained networks jointly via a cover set optimization, and derive a number of equivalence sets, within each of which the network blocks are treated as functionally equivalent and hence interchangeable. The equivalence sets learned in this way, in turn, enable picking and assembling blocks to customize networks subject to certain constraints, which is achieved by solving an integer program backed by a training-free proxy that estimates the task performance. The reassembled models deliver strong performance while satisfying the user-specified constraints. We demonstrate that on ImageNet, the best reassembled model achieves 78.6% top-1 accuracy without fine-tuning, which can be further raised to 83.2% with end-to-end training. Our code is available at https://github.com/Adamdad/DeRy
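The block-selection step described in the abstract can be illustrated with a toy sketch. It assumes, purely for illustration, that each equivalence set contributes exactly one block, that the training-free proxy yields one additive score per block, and that the only hardware constraint is a parameter budget; the abstract does not specify any of these details. The names `Block` and `reassemble`, the block identifiers, and all numbers are hypothetical, and exhaustive search stands in for a real integer-program solver.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Block:
    name: str        # hypothetical identifier: source network + block index
    score: float     # stand-in for a training-free proxy score (higher is better)
    params_m: float  # parameter count in millions (made-up values)


def reassemble(equivalence_sets, budget_m):
    """Pick one block per equivalence set, maximizing the summed proxy score
    subject to a total parameter budget. Exhaustive search: fine for toy sizes,
    a real instance would be handed to an integer-program solver."""
    best, best_score = None, float("-inf")
    for combo in product(*equivalence_sets):
        cost = sum(b.params_m for b in combo)
        if cost > budget_m:
            continue  # violates the resource constraint
        total = sum(b.score for b in combo)
        if total > best_score:
            best, best_score = combo, total
    return best, best_score


# Toy equivalence sets: blocks in the same list are assumed interchangeable.
equivalence_sets = [
    [Block("net_a.block1", 0.60, 1.0), Block("net_b.block1", 0.70, 2.0)],
    [Block("net_a.block2", 0.50, 4.0), Block("net_b.block2", 0.80, 6.0)],
    [Block("net_a.block3", 0.90, 15.0), Block("net_b.block3", 0.95, 18.0)],
]

chosen, chosen_score = reassemble(equivalence_sets, budget_m=24.0)
if chosen is not None:
    print([b.name for b in chosen], round(chosen_score, 2))
```

With a 24M-parameter budget, the search mixes blocks from both toy networks rather than copying either one whole, which is the behavior the reassembly formulation is meant to allow. The additive objective is a simplification chosen to keep the sketch short.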