Conditional computation and modular networks have recently been proposed for multi-task learning and other problems as a way to decompose problem solving into multiple reusable computational blocks. We propose a new approach to learning modular networks based on the isometric version of ResNet, in which all residual blocks have the same configuration and the same number of parameters. This architectural choice allows residual blocks to be added, removed, and reordered. In our method, modules can be invoked repeatedly and enable knowledge transfer to novel tasks by adjusting the order of computation. This allows soft weight sharing between tasks with only a small increase in the number of parameters. We show that our method leads to interpretable self-organization of modules in multi-task learning, transfer learning, and domain adaptation, while achieving competitive results on those tasks. From a practical perspective, our approach allows us to: (a) reuse existing modules to learn a new task by adjusting the computation order; (b) perform unsupervised multi-source domain adaptation, illustrating that adaptation to unseen data can be achieved by only manipulating the order of pretrained modules; and (c) increase the accuracy of existing architectures on image classification tasks such as ImageNet, without any increase in parameters, by reusing the same block multiple times.
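A minimal sketch of the core idea, not the authors' implementation: a shared pool of identical (isometric) residual blocks, where each task specifies only an execution order over the pool. The block width, depth, and the example orders below are illustrative assumptions.

```python
import torch
import torch.nn as nn


class ResBlock(nn.Module):
    """One isometric residual block: input and output shapes match,
    so any block can follow any other block, or itself."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))


class ModularNet(nn.Module):
    """A pool of interchangeable residual blocks plus a per-task execution
    order. Because every block has the same configuration, a new task can
    reuse the pool and change only `order` (repeats allowed), adding no
    parameters."""

    def __init__(self, channels: int = 64, num_blocks: int = 8):
        super().__init__()
        self.blocks = nn.ModuleList(ResBlock(channels) for _ in range(num_blocks))

    def forward(self, x, order):
        # e.g. order = [2, 0, 0, 5] invokes block 0 twice in a row
        for i in order:
            x = self.blocks[i](x)
        return x


# Two tasks sharing one pool of weights, differing only in computation order.
net = ModularNet()
x = torch.randn(2, 64, 32, 32)
y_task_a = net(x, order=[0, 1, 2, 3])
y_task_b = net(x, order=[2, 0, 0, 5, 1])
```

In this sketch, transfer to a new task or domain amounts to searching over `order` rather than training new blocks, which mirrors the paper's claim that reordering pretrained modules alone can adapt the network to unseen data.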