We present AROS, a one-shot learning approach that uses an explicit representation of interactions between highly articulated human poses and 3D scenes. The approach is one-shot in that no re-training is required to add new affordance instances; furthermore, only one or a small handful of examples of the target pose are needed to describe the interaction. Given a 3D mesh of a previously unseen scene, we predict affordance locations that support the interactions and generate corresponding articulated 3D human bodies around them. We evaluate on three public datasets of scans of real environments with varying degrees of noise. A rigorous statistical analysis of crowdsourced evaluations shows that our one-shot approach outperforms data-intensive baselines by up to 80\%.