We present a neural network approach to transfer the motion from a single image of an articulated object to a rest-state (i.e., unarticulated) 3D model. Our network learns to predict the object's pose, part segmentation, and corresponding motion parameters to reproduce the articulation shown in the input image. The network is composed of three distinct branches that operate on a shared joint image-shape embedding, and it is trained end-to-end. Unlike previous methods, our approach is independent of the topology of the object and can work with objects from arbitrary categories. Our method, trained with only synthetic data, can be used to automatically animate a mesh, infer motion from real images, and transfer articulation to functionally similar but geometrically distinct 3D models at test time.
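To make the three-branch design concrete, the following is a minimal PyTorch sketch of how such a network could be wired: an image feature and per-point shape features are fused into a shared joint embedding, from which separate heads predict object pose, per-point part segmentation, and per-part motion parameters. All names, feature dimensions, and the exact motion parameterization are assumptions for illustration only; the actual encoders, heads, and losses are not specified in this abstract.

```python
import torch
import torch.nn as nn


class ArticulationNet(nn.Module):
    """Hypothetical three-branch network over a shared joint image-shape embedding.

    Dimensions, encoders, and head parameterizations are placeholders, not the
    paper's architecture.
    """

    def __init__(self, img_dim=2048, pt_dim=128, embed_dim=256, num_parts=4):
        super().__init__()
        self.image_proj = nn.Linear(img_dim, embed_dim)   # global image feature -> embedding
        self.shape_proj = nn.Linear(pt_dim, embed_dim)    # per-point shape feature -> embedding
        # Branch 1: object pose (here a quaternion + translation = 7 values).
        self.pose_head = nn.Linear(2 * embed_dim, 7)
        # Branch 2: per-point part segmentation logits.
        self.seg_head = nn.Linear(2 * embed_dim, num_parts)
        # Branch 3: per-part motion parameters (e.g., axis, pivot, amount = 7 values).
        self.motion_head = nn.Linear(2 * embed_dim, num_parts * 7)
        self.num_parts = num_parts

    def forward(self, img_feat, point_feats):
        # img_feat: (B, img_dim) from an image encoder; point_feats: (B, N, pt_dim) from a shape encoder.
        img_emb = self.image_proj(img_feat)                    # (B, E)
        pt_emb = self.shape_proj(point_feats)                  # (B, N, E)
        shape_emb = pt_emb.max(dim=1).values                   # (B, E) global shape code
        joint = torch.cat([img_emb, shape_emb], dim=-1)        # shared joint image-shape embedding
        pose = self.pose_head(joint)                           # (B, 7)
        per_point = torch.cat(
            [pt_emb, img_emb.unsqueeze(1).expand_as(pt_emb)], dim=-1
        )
        seg_logits = self.seg_head(per_point)                  # (B, N, num_parts)
        motion = self.motion_head(joint).view(-1, self.num_parts, 7)
        return pose, seg_logits, motion
```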