Creating computer vision datasets requires careful planning and a great deal of time and effort. In robotics research, we often have to use standardized objects, such as the YCB object set, for tasks such as object tracking, pose estimation, grasping, and manipulation, because datasets and pre-trained methods are available for these objects. This limits the impact of our research, since learning-based computer vision methods can only be used in scenarios that existing datasets support. In this work, we present a complete object keypoint tracking toolkit that covers the entire process: data collection, labeling, model learning, and evaluation. We present a semi-automatic way of collecting and labeling datasets using a wrist-mounted camera on a standard robotic arm. Using our toolkit and method, we obtain a working 3D object keypoint detector and complete the whole process of data collection, annotation, and learning in just a couple of hours of active time.
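The abstract does not spell out how the semi-automatic labeling works. One plausible reason a wrist-mounted camera helps is that the camera pose in each frame can be read from the arm's kinematics, so a keypoint whose 3D position is known once can be reprojected into every frame instead of being clicked by hand. The sketch below illustrates only that reprojection step under a pinhole camera model. It is an assumption for illustration, not the toolkit's actual implementation; the names `invert_rigid` and `project_keypoint`, the pose convention `T_world_camera`, and the intrinsics `fx, fy, cx, cy` are all hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical convention: a row-major 4x4 homogeneous transform that maps
# points from the camera frame to the world (robot base) frame.
Matrix4 = Sequence[Sequence[float]]


def invert_rigid(T: Matrix4) -> List[List[float]]:
    """Invert a rigid transform [R | t] as [R^T | -R^T t]."""
    R_t = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    t_inv = [-sum(R_t[i][j] * t[j] for j in range(3)) for i in range(3)]
    return [R_t[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def project_keypoint(
    point_world: Sequence[float],
    T_world_camera: Matrix4,
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Optional[Tuple[float, float]]:
    """Project a 3D world point into pixel coordinates (pinhole model).

    Returns None when the point lies behind the camera.
    """
    T_camera_world = invert_rigid(T_world_camera)
    x, y, z = (
        sum(T_camera_world[i][j] * point_world[j] for j in range(3))
        + T_camera_world[i][3]
        for i in range(3)
    )
    if z <= 0.0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


# Assumed example: one keypoint 1 m in front of the first camera pose,
# then the camera moves 0.1 m along its x axis (pose from the arm).
keypoint_world = (0.0, 0.0, 1.0)
pose_a = [[1, 0, 0, 0.0], [0, 1, 0, 0.0], [0, 0, 1, 0.0], [0, 0, 0, 1]]
pose_b = [[1, 0, 0, 0.1], [0, 1, 0, 0.0], [0, 0, 1, 0.0], [0, 0, 0, 1]]

labels = [
    project_keypoint(keypoint_world, pose, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
    for pose in (pose_a, pose_b)
]
```

With these assumed values, the keypoint projects to the principal point in the first frame and shifts about 50 pixels to the left in the second, which is the kind of per-frame label that would otherwise need manual annotation.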