Hand segmentation and detection in truly unconstrained RGB-based settings is important for many applications. However, existing datasets fall far short in both size and variety because manually annotating large amounts of segmentation and detection data is infeasible. As a result, current methods are limited by restrictive assumptions such as constrained environments, consistent skin color, and stable lighting. In this work, we present Ego2Hands, a large-scale, automatically annotated RGB-based egocentric hand segmentation/detection dataset, along with a color-invariant compositing-based data generation technique capable of creating unlimited, varied training data. For quantitative analysis, we manually annotated an evaluation set that significantly exceeds existing benchmarks in quantity, diversity, and annotation accuracy. We provide cross-dataset evaluation as well as a thorough analysis of the performance of state-of-the-art models on Ego2Hands, showing that our dataset and data generation technique can produce models that generalize to unseen environments without domain adaptation.
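To make the compositing-based data generation idea concrete, below is a minimal sketch of one plausible form of it: two segmented hand crops are color-jittered (for color invariance) and alpha-composited onto an arbitrary background, yielding an image/label pair for free. The function name, the two-class label layout, and the jitter ranges are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def composite_two_hands(left_img, left_mask, right_img, right_mask, bg_img):
    """Hypothetical sketch of compositing-based training data generation.

    left_img/right_img: HxWx3 uint8 segmented egocentric hand images;
    left_mask/right_mask: HxW uint8 binary masks from automatic annotation;
    bg_img: HxWx3 uint8 background image. All arrays are assumed to share
    the same spatial size for simplicity.
    """
    out = bg_img.astype(np.float32)

    for img, mask in ((left_img, left_mask), (right_img, right_mask)):
        # Color invariance: randomize per-channel gain so a trained model
        # cannot rely on a fixed skin tone or fixed lighting.
        gain = np.random.uniform(0.6, 1.4, size=3).astype(np.float32)
        hand = np.clip(img.astype(np.float32) * gain, 0, 255)

        # Alpha-composite the hand onto the current canvas.
        alpha = (mask > 0).astype(np.float32)[..., None]
        out = alpha * hand + (1.0 - alpha) * out

    # Segmentation labels come for free from the masks; the right hand is
    # composited last, so it occludes the left hand where they overlap.
    seg = np.zeros(bg_img.shape[:2], np.uint8)
    seg[left_mask > 0] = 1   # class 1: left hand
    seg[right_mask > 0] = 2  # class 2: right hand
    return np.clip(out, 0, 255).astype(np.uint8), seg
```

Because backgrounds, hand pairs, and color jitter can be resampled arbitrarily, a generator of this kind can in principle produce unlimited training combinations from a fixed pool of annotated hand crops.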