Object models are gradually progressing from predicting only category labels to providing detailed descriptions of object instances. This motivates the need for large datasets that go beyond traditional object masks and provide richer annotations such as part masks and attributes. Hence, we introduce PACO: Parts and Attributes of Common Objects. It spans 75 object categories, 456 object-part categories, and 55 attributes across image (LVIS) and video (Ego4D) datasets. We provide 641K part masks annotated across 260K object boxes, roughly half of which are also exhaustively annotated with attributes. We design evaluation metrics and provide benchmark results for three tasks on the dataset: part mask segmentation, object and part attribute prediction, and zero-shot instance detection. The dataset, models, and code are open-sourced at https://github.com/facebookresearch/paco.