Learning a generalizable deep model from a few examples in a short time remains a major challenge of machine learning, which has impeded its wide deployment in many scenarios. Recent advances reveal that a properly pre-trained model enjoys an important property: transferability. Higher transferability of the learned representations indicates better generalizability across domains of different distributions (domain transferability) or across tasks of different semantics (task transferability). Transferability has become the key to enabling data-efficient deep learning; however, existing pre-training methods focus only on domain transferability, while meta-training methods focus only on task transferability. This restricts their data efficiency in downstream scenarios with diverging domains and tasks. A key finding of this paper is that even a tight combination of pre-training and meta-training cannot achieve both kinds of transferability. This motivates the proposed Omni-Training framework for data-efficient deep learning. Our first contribution is Omni-Net, a tri-flow architecture. Besides the joint representation flow, Omni-Net introduces two parallel flows for pre-training and meta-training, responsible for learning representations with domain transferability and task transferability, respectively. Omni-Net coordinates the parallel flows by routing them via the joint flow, so that each gains the other kind of transferability. Our second contribution is Omni-Loss, in which a mean-teacher regularization is imposed to learn generalizable and stabilized representations. Omni-Training is a general framework that accommodates many existing pre-training and meta-training algorithms. A thorough evaluation on cross-task and cross-domain datasets in classification, regression, and reinforcement learning shows that Omni-Training consistently outperforms state-of-the-art methods.
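To make the tri-flow idea concrete, the following is a minimal, unofficial PyTorch sketch. The layer types (plain linear layers), the additive routing rule, and the names `OmniBlock`/`OmniNet` are illustrative assumptions for this sketch, not the paper's exact design; the key point it shows is that the pre-training and meta-training flows are always routed through the shared joint flow.

```python
import torch
import torch.nn as nn

class OmniBlock(nn.Module):
    """One layer of a tri-flow architecture: a shared joint flow plus two
    lightweight parallel flows (pre-flow / meta-flow) routed through it.
    Illustrative sketch; not the paper's official implementation."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.joint = nn.Linear(in_dim, out_dim)  # joint representation flow
        self.pre = nn.Linear(in_dim, out_dim)    # pre-training (domain) flow
        self.meta = nn.Linear(in_dim, out_dim)   # meta-training (task) flow
        self.act = nn.ReLU()

    def forward(self, x, flow="joint"):
        # Routing: each parallel flow is added on top of the joint flow, so
        # its representation also inherits the joint flow's transferability.
        h = self.joint(x)
        if flow == "pre":
            h = h + self.pre(x)
        elif flow == "meta":
            h = h + self.meta(x)
        return self.act(h)

class OmniNet(nn.Module):
    """Stack of OmniBlocks; `flow` selects which branch is active."""
    def __init__(self, dims=(784, 256, 128)):
        super().__init__()
        self.blocks = nn.ModuleList(
            OmniBlock(d_in, d_out) for d_in, d_out in zip(dims[:-1], dims[1:])
        )

    def forward(self, x, flow="joint"):
        for block in self.blocks:
            x = block(x, flow)
        return x
```

Under this reading, pre-training batches run with `flow="pre"`, meta-training episodes with `flow="meta"`, and both update the shared joint parameters, which is how each parallel flow gains the other kind of transferability.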
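The mean-teacher regularization in Omni-Loss can likewise be sketched. The snippet below assumes the standard mean-teacher construction: the teacher's weights are an exponential moving average (EMA) of the student's, and a consistency term penalizes the distance between student and teacher features. The function names `update_teacher`/`omni_loss`, the MSE consistency term, the omitted task head, and the hyperparameters `ema_decay`/`reg_weight` are assumptions made for illustration.

```python
import copy
import torch
import torch.nn.functional as F

def update_teacher(student, teacher, ema_decay=0.99):
    """Update the 'mean teacher' as an EMA of the student's weights."""
    with torch.no_grad():
        for p_s, p_t in zip(student.parameters(), teacher.parameters()):
            p_t.mul_(ema_decay).add_(p_s, alpha=1.0 - ema_decay)

def omni_loss(student, teacher, x, y, task_loss_fn, reg_weight=1.0, flow="pre"):
    """Task loss on one flow plus a mean-teacher consistency regularizer
    that keeps student features close to the slowly moving teacher's.
    Sketch only: the exact form of Omni-Loss is specified in the paper."""
    feat_s = student(x, flow=flow)
    with torch.no_grad():
        feat_t = teacher(x, flow=flow)
    task_loss = task_loss_fn(feat_s, y)        # task head omitted for brevity
    consistency = F.mse_loss(feat_s, feat_t)   # mean-teacher regularization
    return task_loss + reg_weight * consistency
```

In a typical training loop, the teacher is initialized as `teacher = copy.deepcopy(student)` with `requires_grad_(False)` on its parameters, and `update_teacher` is called after every optimizer step; because the EMA teacher averages the student over time, the consistency term stabilizes the learned representations.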