We propose a bootstrapping framework to enhance human optical flow and pose estimation. We show that, for videos of humans in scenes, we can improve both the optical flow and the human pose estimation quality by considering the two tasks jointly. We enhance the optical flow estimates by fine-tuning them to fit the human pose estimates, and vice versa. More specifically, we optimize the pose and optical flow networks at inference time to agree with each other. We show that this yields state-of-the-art results on the Human 3.6M and 3D Poses in the Wild datasets, as well as on a human-related subset of the Sintel dataset, both in pose estimation accuracy and in optical flow accuracy at human joint locations. Code is available at https://github.com/ubc-vision/bootstrapping-human-optical-flow-and-pose
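As a rough illustration of the inference-time agreement objective described above, the following PyTorch sketch fine-tunes a generic optical flow network and a 2D pose network on a single frame pair so that the joints of the first frame, warped by the predicted flow, match the joints predicted in the second frame. The interfaces (`flow_net`, `pose_net`), the helper `sample_flow_at`, and the hyperparameters are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def sample_flow_at(flow, keypoints):
    """Bilinearly sample a dense flow field (1, 2, H, W) at 2D keypoint
    locations (J, 2) given in pixel (x, y) coordinates; returns (J, 2)."""
    _, _, h, w = flow.shape
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    x = 2.0 * keypoints[:, 0] / (w - 1) - 1.0
    y = 2.0 * keypoints[:, 1] / (h - 1) - 1.0
    grid = torch.stack([x, y], dim=-1).view(1, 1, -1, 2)
    sampled = F.grid_sample(flow, grid, align_corners=True)  # (1, 2, 1, J)
    return sampled.view(2, -1).t()                            # (J, 2)

def refine_at_test_time(flow_net, pose_net, frame1, frame2, steps=20, lr=1e-5):
    """Jointly fine-tune both networks on one frame pair so that flow-warped
    joints from frame1 agree with the joints predicted in frame2."""
    params = list(flow_net.parameters()) + list(pose_net.parameters())
    optim = torch.optim.Adam(params, lr=lr)
    for _ in range(steps):
        flow = flow_net(frame1, frame2)           # assumed output: (1, 2, H, W)
        joints1 = pose_net(frame1)                # assumed output: (J, 2)
        joints2 = pose_net(frame2)                # assumed output: (J, 2)
        warped = joints1 + sample_flow_at(flow, joints1)
        loss = F.smooth_l1_loss(warped, joints2)  # mutual-agreement objective
        optim.zero_grad()
        loss.backward()                           # gradients reach both networks
        optim.step()
    return flow_net, pose_net
```

Because the loss is differentiable with respect to both the flow field and the predicted joints, a few such gradient steps nudge each network toward the other's estimate, which is the bootstrapping effect the abstract refers to.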