Object location priors have been shown to be critical for the standard 6D object pose estimation setting, where the training and testing objects are the same. Specifically, they can be used to initialize the 3D object translation and to facilitate 3D object rotation estimation. Unfortunately, the object detectors used for this purpose do not generalize to unseen objects, i.e., objects from new categories at test time. Therefore, existing 6D pose estimation methods for previously unseen objects either assume the ground-truth object location to be known or yield inaccurate results when it is unavailable. In this paper, we address this problem by developing LocPoseNet, a method able to robustly learn a location prior for unseen objects. Our method builds on a template matching strategy, in which we propose to distribute the reference kernels and convolve them with a query to efficiently compute multi-scale correlations. We then introduce a novel translation estimator that decouples scale-aware and scale-robust features to predict different object location parameters. Our method outperforms existing works by a large margin on LINEMOD and GenMOP. We further construct a challenging synthetic dataset, which allows us to highlight the superior robustness of our method to various noise sources.
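To make the multi-scale correlation step concrete, the following minimal PyTorch sketch convolves per-scale reference kernels with a query feature map. The function name, tensor shapes, and the per-kernel normalization are illustrative assumptions on our part, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def multi_scale_correlation(query_feat, ref_kernels):
    """Correlate reference kernels with a query feature map at several scales.

    query_feat:  (1, C, H, W) feature map of the query image.
    ref_kernels: list of (N, C, k, k) reference feature kernels, one entry
                 per scale (the kernel size k varies across scales).
    Returns a list of (1, N, H, W) correlation maps, one per scale.
    """
    corr_maps = []
    for kernels in ref_kernels:
        k = kernels.shape[-1]
        # Treat each reference kernel as a convolution filter; padding k // 2
        # (with odd k) preserves spatial resolution so the maps from different
        # scales stay aligned.
        corr = F.conv2d(query_feat, kernels, padding=k // 2)
        # Normalize by kernel magnitude so correlations are comparable
        # across scales (an illustrative choice, not from the paper).
        norm = kernels.flatten(1).norm(dim=1).view(1, -1, 1, 1) + 1e-6
        corr_maps.append(corr / norm)
    return corr_maps

# Usage with random tensors: three scales, eight reference kernels each.
query = torch.randn(1, 64, 32, 32)
refs = [torch.randn(8, 64, k, k) for k in (3, 5, 7)]
maps = multi_scale_correlation(query, refs)  # three (1, 8, 32, 32) maps
```

The point of distributing the reference kernels this way is that a single convolution pass over the query yields correlation scores for all references at a given scale at once, rather than matching each template separately.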