DIRL: Domain-Invariant Representation Learning for Sim-to-Real Transfer

Generating large-scale synthetic data in simulation is a feasible alternative to collecting and labelling real data for training vision-based deep learning models, although models trained on such data often fail to generalize to the physical world because of modelling inaccuracies. In this paper, we present a domain-invariant representation learning (DIRL) algorithm that adapts deep models to the physical environment with a small amount of real data. Existing approaches mitigate only the covariate shift by aligning the marginal distributions across domains, and assume the conditional distributions to be domain-invariant; this can lead to ambiguous transfer in real scenarios. We propose to jointly align the marginal (input domains) and the conditional (output labels) distributions with adversarial learning, which mitigates both the covariate and the conditional shift across domains, and to combine this with a triplet distribution loss that makes the conditional distributions disjoint in the shared feature space. Experiments on digit domains yield state-of-the-art performance on challenging benchmarks, while sim-to-real transfer of object recognition for vision-based decluttering with a mobile robot improves from 26.8% to 91.0%, resulting in 86.5% grasping accuracy on a wide variety of objects. Code and supplementary details are available at https://sites.google.com/view/dirl
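The triplet term mentioned in the abstract builds on the familiar triplet margin loss. The following is only a minimal sketch of that standard per-sample loss, assuming Euclidean distance and a hypothetical margin of 1.0; it is not the paper's distribution-level variant, whose exact form is not given here. The names `euclidean` and `triplet_margin_loss` are illustrative.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchors, positives, negatives, margin=1.0):
    """Mean of max(0, d(a, p) - d(a, n) + margin) over a batch.

    Each anchor shares a class with its positive and differs in class
    from its negative. The margin value is an assumption for illustration.
    """
    losses = [
        max(0.0, euclidean(a, p) - euclidean(a, n) + margin)
        for a, p, n in zip(anchors, positives, negatives)
    ]
    return sum(losses) / len(losses)


# Toy 2-D features (made up for illustration).
anchors = [[0.0, 0.0], [1.0, 1.0]]
positives = [[0.0, 1.0], [1.0, 1.0]]
negatives = [[3.0, 4.0], [1.0, 1.5]]
loss = triplet_margin_loss(anchors, positives, negatives)
```

The first triplet contributes zero because its negative is already farther than the positive by more than the margin; the second is penalized because its negative sits only 0.5 away from the anchor.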


Related content

Representation learning


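Representation learning uses training data to learn vector representations, which overcomes the limitations of hand-crafted features. It is usually divided into two broad categories: unsupervised and supervised representation learning. Most unsupervised methods use the latent variables of an autoencoder (for example a denoising or sparse autoencoder) as the representation. The more recent variational autoencoders tolerate noise and outliers better. However, inferring the latent structure of given data exactly is almost always infeasible, so approximate inference strategies are used instead. In addition, some unsupervised methods aim to approximate a specific similarity measure. One unsupervised similarity-preserving representation learning framework uses matrix factorization to preserve pairwise dynamic time warping (DTW) similarities. DTW-preserving shapelets can also be learned, so that Euclidean distance in the transformed space approximates the true DTW distance between the original data. Supervised representation learning methods can exploit label information to better capture the semantic structure of the data. Siamese networks and triplet networks are currently two popular models; their objective is to maximize the distance between classes and minimize the distance within each class.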
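The DTW distance that the similarity-preserving methods above try to approximate can be computed with the classic dynamic-programming recurrence. The following is a minimal sketch, assuming 1-D numeric sequences, absolute difference as the local cost, and no warping-window constraint; the name `dtw_distance` is illustrative and not taken from any particular library.

```python
def dtw_distance(x, y):
    """Dynamic time warping distance between two 1-D sequences.

    cost[i][j] holds the cheapest alignment cost of x[:i] with y[:j].
    """
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# A sequence and a time-stretched copy of it align perfectly.
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
# Two constant sequences offset by 1 pay the offset at every step.
shifted = dtw_distance([0, 0], [1, 1])
```

Because this recurrence costs O(nm) per pair, learning a representation whose Euclidean distance approximates DTW is attractive when many pairwise comparisons are needed.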
