We propose Hi4D, a method and dataset for the automatic analysis of physically close human-human interaction under prolonged contact. Robustly disentangling several in-contact subjects is a challenging task due to occlusions and complex shapes. Hence, existing multi-view systems typically fuse the 3D surfaces of close subjects into a single, connected mesh. To address this issue, we leverage i) individually fitted neural implicit avatars and ii) an alternating optimization scheme that refines pose and surface through periods of close proximity, and thus segment the fused raw scans into individual instances. From these instances we compile the Hi4D dataset of 4D textured scans of 20 subject pairs, 100 sequences, and a total of more than 11K frames. Hi4D contains rich interaction-centric annotations in 2D and 3D alongside accurately registered parametric body models. We define varied human pose and shape estimation tasks on this dataset and provide results from state-of-the-art methods on these benchmarks.
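To make the alternating scheme concrete, here is a minimal, heavily simplified sketch of the idea: per-subject implicit avatars are alternately refined in pose (surfaces frozen) and in surface (poses frozen) so that the union of their signed distance fields explains the fused raw scan, after which each scan point is assigned to the nearest avatar. Everything below is an illustrative assumption, not the authors' implementation: the `ImplicitAvatar` MLP, the loss, and the use of a plain translation as "pose" (standing in for full articulated skinning) are all placeholders.

```python
# Hedged sketch of alternating pose/surface refinement and instance
# segmentation; all names and design choices here are assumptions.
import torch
import torch.nn as nn


class ImplicitAvatar(nn.Module):
    """Toy per-subject neural implicit avatar: an MLP mapping a 3D point
    in the subject's canonical frame to a signed distance."""

    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.Softplus(),
            nn.Linear(hidden, hidden), nn.Softplus(),
            nn.Linear(hidden, 1),
        )

    def forward(self, pts):
        return self.net(pts).squeeze(-1)  # signed distance per point


def sdf_in_world(avatar, pose, pts):
    # "Pose" is a simple translation here, a stand-in for articulation.
    return avatar(pts - pose)


def alternating_refinement(scan_pts, avatars, poses, outer=3, inner=25):
    """Alternate (a) pose refinement with surfaces fixed and (b) surface
    refinement with poses fixed, then segment the fused scan."""
    for _ in range(outer):
        # (a) pose step: every scan point should lie on SOME subject's surface
        pose_opt = torch.optim.Adam(poses, lr=1e-2)
        for _ in range(inner):
            pose_opt.zero_grad()
            d = torch.stack([sdf_in_world(a, p, scan_pts).abs()
                             for a, p in zip(avatars, poses)])
            d.min(dim=0).values.mean().backward()
            pose_opt.step()
        # (b) surface step: refine avatar weights with poses frozen
        surf_opt = torch.optim.Adam(
            [w for a in avatars for w in a.parameters()], lr=1e-3)
        for _ in range(inner):
            surf_opt.zero_grad()
            d = torch.stack([sdf_in_world(a, p.detach(), scan_pts).abs()
                             for a, p in zip(avatars, poses)])
            d.min(dim=0).values.mean().backward()
            surf_opt.step()
    # instance segmentation: label each point with its closest avatar
    with torch.no_grad():
        d = torch.stack([sdf_in_world(a, p, scan_pts).abs()
                         for a, p in zip(avatars, poses)])
        return d.argmin(dim=0)  # per-point subject label


# Usage on synthetic data (two subjects):
scan = torch.randn(2048, 3)
avatars = [ImplicitAvatar(), ImplicitAvatar()]
poses = [torch.zeros(3, requires_grad=True),
         torch.ones(3, requires_grad=True)]
labels = alternating_refinement(scan, avatars, poses)
```

The `min` over per-subject distances is what lets the two stages cooperate: the pose step moves each avatar toward the scan points it best explains, and the surface step then absorbs residual detail, which is what ultimately separates the fused geometry into individual instances.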