Accurate motion and depth recovery is important for many robot vision tasks, including autonomous driving. Most previous studies achieve cooperative multi-task interaction via either pre-defined loss functions or cross-domain prediction. This paper presents a multi-task scheme that achieves mutual assistance by means of our Flow to Depth (F2D), Depth to Flow (D2F), and Exponential Moving Average (EMA) mechanisms. F2D and D2F enable multi-scale information integration between the optical flow and depth domains based on differentiable shallow nets. A dual-head mechanism predicts optical flow for rigid and non-rigid motion in a divide-and-conquer manner, which significantly improves optical flow estimation. Furthermore, to make the predictions more robust and stable, EMA is used in our multi-task training. Experimental results on the KITTI dataset show that our multi-task scheme outperforms other multi-task schemes and provides marked improvements on the prediction results.
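The EMA stabilization mentioned above follows the standard exponential-moving-average update over model parameters. A minimal sketch of that update is below; the decay value and the dictionary-of-weights representation are illustrative assumptions, not the paper's exact training setup.

```python
# Hedged sketch: EMA over model parameters, as commonly used to
# stabilize training. The decay of 0.999 is a typical default,
# assumed here for illustration.

def ema_update(ema_params, model_params, decay=0.999):
    """Blend the current model weights into the EMA shadow copy in place."""
    for k in ema_params:
        ema_params[k] = decay * ema_params[k] + (1.0 - decay) * model_params[k]
    return ema_params

# Usage: keep a shadow copy of the weights and update it every step;
# the shadow weights are then used for evaluation.
ema = {"w": 1.0}
model = {"w": 0.0}
ema_update(ema, model, decay=0.9)  # ema["w"] -> 0.9
```

In practice the shadow (EMA) weights change more slowly than the raw weights, which smooths out noisy per-step updates across the jointly trained tasks.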