We introduce a general approach, called Invariance through Inference, for improving the test-time performance of an agent deployed in environments with unknown perceptual variations. Rather than producing invariant visual features through interpolation, Invariance through Inference turns deployment-time adaptation into an unsupervised learning problem. In practice, this amounts to a straightforward algorithm that matches the distribution of latent features at deployment to the agent's prior experience, without relying on paired data. Although simple, we show that this idea yields surprising improvements across a variety of adaptation scenarios without access to deployment-time rewards, including changes in scene content, camera pose, and lighting conditions. We present results on challenging domains, including the Distracting Control Suite and sim-to-real transfer for image-based robot manipulation.
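The core idea of matching latent-feature distributions without paired data can be illustrated with a divergence between two batches of features. The sketch below uses a maximum mean discrepancy (MMD) estimate as the matching objective; this is only an illustrative stand-in, not the paper's actual algorithm, and the feature dimensions, batch sizes, and the RBF bandwidth `sigma` are all assumptions.

```python
import numpy as np

def rbf_kernel(x, y, sigma=4.0):
    # Pairwise RBF kernel between rows of x and rows of y.
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2(source, target, sigma=4.0):
    # Squared maximum mean discrepancy between two feature batches:
    # small when the batches are drawn from similar distributions.
    k_ss = rbf_kernel(source, source, sigma).mean()
    k_tt = rbf_kernel(target, target, sigma).mean()
    k_st = rbf_kernel(source, target, sigma).mean()
    return k_ss + k_tt - 2.0 * k_st

rng = np.random.default_rng(0)
# Hypothetical 8-d latent features: the agent's prior experience, a
# perceptually shifted deployment batch, and a batch after adaptation.
prior = rng.normal(0.0, 1.0, size=(256, 8))
shifted = rng.normal(0.5, 1.0, size=(256, 8))
matched = rng.normal(0.0, 1.0, size=(256, 8))

# The shifted deployment features show a larger discrepancy to the
# prior than the adapted (distribution-matched) features do.
print(mmd2(prior, shifted), mmd2(prior, matched))
```

In a deployment setting, a loss of this form (or an adversarial discriminator playing the same role) would be minimized with respect to the encoder applied to deployment observations, pulling their latent distribution toward the agent's training-time experience without requiring rewards or paired frames.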