Differentiable simulation is a promising toolkit for fast gradient-based policy optimization and system identification. However, existing approaches to differentiable simulation have largely tackled scenarios where smooth gradients are relatively easy to obtain, such as systems with mostly smooth dynamics. In this work, we study the challenges that differentiable simulation presents when a single descent run cannot be expected to reach a global optimum, which is often the case in contact-rich scenarios. We analyze the optimization landscapes of diverse scenarios that contain both rigid bodies and deformable objects. In dynamic environments with highly deformable objects and fluids, differentiable simulators produce rugged landscapes that nonetheless offer useful gradients in some parts of the space. We propose a method that combines Bayesian optimization with semi-local 'leaps' to obtain a global search method that uses gradients effectively while maintaining robust performance in regions with noisy gradients. We show that our approach outperforms several gradient-based and gradient-free baselines on an extensive set of experiments in simulation, and we also validate the method using experiments with a real robot and deformables. Videos and supplementary materials are available at https://tinyurl.com/globdiff
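To make the "single descent" problem concrete, the toy sketch below runs plain gradient descent on a hypothetical rugged 1D loss and compares one run against several runs started from a fixed grid of points. It is not the method proposed in this work: the grid of restarts is a deliberately crude stand-in for the Bayesian optimization component, and the loss `f`, the step size, and the start points are all assumptions chosen for illustration.

```python
import math

# Hypothetical rugged loss: a shallow bowl with sinusoidal ripples,
# so it has several local minima (an assumption, not from the paper).
def f(x):
    return 0.1 * x * x + math.sin(3.0 * x)

def grad_f(x):
    # Analytic derivative of f.
    return 0.2 * x + 3.0 * math.cos(3.0 * x)

def descend(x0, lr=0.05, steps=200):
    """Plain gradient descent from x0; returns the final point."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

# A single descent converges to whichever local minimum is nearest the start.
single_x = descend(3.0)
single_f = f(single_x)

# Crude global search: descend from a fixed grid of starts and keep the best.
starts = [-4.0 + i for i in range(9)]  # -4, -3, ..., 4
candidates = [descend(s) for s in starts]
best_x = min(candidates, key=f)
best_f = f(best_x)

print(f"single run : x={single_x:.3f}, f={single_f:.3f}")
print(f"multi-start: x={best_x:.3f}, f={best_f:.3f}")
```

In this toy setting the single run stalls in a local minimum, while the multi-start search reaches a noticeably lower loss. A fixed grid scales poorly with dimension, which is one reason a model-based global strategy such as Bayesian optimization is attractive for choosing where to start the local, gradient-driven steps.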