Procedure learning involves identifying the key-steps of a task and determining the logical order in which to perform them. Existing approaches commonly learn the procedure from third-person videos, in which the manipulated object appears small and is often occluded by the actor, leading to significant errors. In contrast, we observe that videos obtained from first-person (egocentric) wearable cameras provide an unobstructed and clear view of the action. However, procedure learning from egocentric videos is challenging because (a) the camera view undergoes extreme changes due to the wearer's head motion, and (b) the videos contain unrelated frames owing to their unconstrained nature. As a result, the assumptions made by current state-of-the-art methods, namely that the actions occur at approximately the same time and have the same duration, do not hold. Instead, we propose to use the signal provided by the temporal correspondences between key-steps across videos. To this end, we present a novel self-supervised Correspond and Cut (CnC) framework for procedure learning. CnC identifies and utilizes the temporal correspondences between the key-steps across multiple videos to learn the procedure. Our experiments show that CnC outperforms the state-of-the-art on the benchmark ProceL and CrossTask datasets by 5.2% and 6.3%, respectively. Furthermore, for procedure learning using egocentric videos, we propose the EgoProceL dataset consisting of 62 hours of videos captured by 130 subjects performing 16 tasks. The source code and the dataset are available on the project page https://sid2697.github.io/egoprocel/.