Category-level object pose and shape estimation from a single depth image has recently drawn research attention due to its potential utility for tasks such as robotic manipulation. The task is particularly challenging because three unknowns (object pose, object shape, and model-to-measurement correspondences) are coupled, yet only a single view of depth measurements is available. Most prior work relies heavily on data-driven approaches to solve for at least one of the unknowns, and typically two, which risks generalization failures if the models are not designed and trained carefully. The shape representations used in prior work are also mainly point clouds and signed distance fields (SDFs). In contrast, we approach the problem with an iterative estimation method that does not require learning from pose-annotated data. Moreover, we construct and adopt a novel mesh-based object active shape model (ASM), which, unlike the commonly used point-based object ASM, also maintains vertex connectivity. Our algorithm, ShapeICP, builds on the iterative closest point (ICP) algorithm and adds features tailored to the category-level pose and shape estimation task. Despite not using pose-annotated data, ShapeICP surpasses many data-driven approaches that rely on pose data for training, opening up a new solution space for researchers to consider.