Offline planning often struggles with poor sampling efficiency as it tries to learn policies from scratch. Especially with diffusion models, such cold start practices mean that both training and sampling become very expensive. We hypothesize that certain environment constraint priors or cheaply available policies make it unnecessary to learn from scratch, and explore a way to incorporate such priors in the learning process. To achieve that, we borrow a variation of the Schr\"odinger bridge formulation from the image-to-image setting and apply it to planning tasks. We study the performance on some planning tasks and compare the performance against the DDPM formulation. The code for this work is available at https://github.com/adrshsrvstv/bridge_diffusion_planning.
翻译:暂无翻译