We propose an algorithm for real-time 6DOF pose tracking of rigid 3D objects using a monocular RGB camera. The key idea is to derive a region-based cost function using temporally consistent local color histograms. While such region-based cost functions are commonly optimized using first-order gradient descent techniques, we systematically derive a Gauss-Newton optimization scheme that yields drastically faster convergence and highly accurate, robust tracking performance. We furthermore propose a novel, complex dataset dedicated to the task of monocular object pose tracking and make it publicly available to the community. To our knowledge, it is the first to address the common and important scenario in which both the camera and the objects are moving simultaneously in cluttered scenes. In numerous experiments, including on our own proposed dataset, we demonstrate that the proposed Gauss-Newton approach outperforms existing approaches, in particular in the presence of cluttered backgrounds, heterogeneous objects and partial occlusions.
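To illustrate the contrast between first-order gradient descent and a Gauss-Newton scheme, the following is a minimal sketch on a toy nonlinear least-squares problem. It is not the region-based cost function or the pose parameterization of this work: the exponential model, the function names, the step size `lr` and the tolerances are all assumptions chosen purely for illustration.

```python
import numpy as np

# Toy model (an assumption for illustration): y = a * exp(b * x).
def residuals(p, x, y):
    a, b = p
    return a * np.exp(b * x) - y

def jacobian(p, x):
    # Derivatives of the residuals with respect to (a, b).
    a, b = p
    e = np.exp(b * x)
    return np.stack([e, a * x * e], axis=1)

def gauss_newton(p0, x, y, tol=1e-10, max_iter=50):
    """Gauss-Newton: solve the normal equations (J^T J) step = -J^T r."""
    p = np.asarray(p0, dtype=float)
    for it in range(1, max_iter + 1):
        r = residuals(p, x, y)
        J = jacobian(p, x)
        step = np.linalg.solve(J.T @ J, -J.T @ r)
        p = p + step
        if np.linalg.norm(step) < tol:
            return p, it
    return p, max_iter

def gradient_descent(p0, x, y, lr=0.02, tol=1e-10, max_iter=20000):
    """First-order descent on 0.5 * ||r||^2 with a fixed step size."""
    p = np.asarray(p0, dtype=float)
    for it in range(1, max_iter + 1):
        r = residuals(p, x, y)
        J = jacobian(p, x)
        step = -lr * (J.T @ r)
        p = p + step
        if np.linalg.norm(step) < tol:
            return p, it
    return p, max_iter

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 20)
    y = 2.0 * np.exp(-1.5 * x)  # noise-free synthetic data
    print("Gauss-Newton:    ", gauss_newton([1.0, -1.0], x, y))
    print("Gradient descent:", gradient_descent([1.0, -1.0], x, y))
```

Both routines use the same residuals and Jacobian; the only difference is that Gauss-Newton rescales the gradient `J.T @ r` by the approximate Hessian `J.T @ J`, which is what typically reduces the number of iterations on well-behaved least-squares problems such as this one.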