The ability to associate touch with sight is essential for tasks that require physically interacting with objects in the world. We propose a dataset with paired visual and tactile data called Touch and Go, in which human data collectors probe objects in natural environments using tactile sensors, while simultaneously recording egocentric video. In contrast to previous efforts, which have largely been confined to lab settings or simulated environments, our dataset spans a large number of "in the wild" objects and scenes. To demonstrate our dataset's effectiveness, we successfully apply it to a variety of tasks: 1) self-supervised visuo-tactile feature learning, 2) tactile-driven image stylization, i.e., making the visual appearance of an object more consistent with a given tactile signal, and 3) predicting future frames of a tactile signal from visuo-tactile inputs.
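To make task 1 concrete, below is a minimal sketch of self-supervised visuo-tactile feature learning using a symmetric InfoNCE contrastive objective, where matched (image, touch) pairs from the dataset serve as positives and other in-batch pairs as negatives. The two-tower encoder, layer sizes, and temperature are illustrative assumptions, not the authors' actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VisuoTactileEncoder(nn.Module):
    """Hypothetical two-tower encoder: one small CNN per modality."""
    def __init__(self, dim=128):
        super().__init__()
        def tower():
            return nn.Sequential(
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(64, dim),
            )
        self.visual = tower()   # encodes egocentric video frames
        self.tactile = tower()  # encodes tactile sensor images

    def forward(self, img, touch):
        # L2-normalize so dot products are cosine similarities
        v = F.normalize(self.visual(img), dim=-1)
        t = F.normalize(self.tactile(touch), dim=-1)
        return v, t

def infonce_loss(v, t, temperature=0.07):
    """Symmetric InfoNCE: the diagonal of the similarity matrix holds
    the true (image, touch) pairs; off-diagonals are negatives."""
    logits = v @ t.T / temperature        # (B, B) similarity matrix
    targets = torch.arange(v.size(0))     # index of each positive pair
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.T, targets))

# Toy usage with random tensors standing in for dataset batches.
model = VisuoTactileEncoder()
img = torch.randn(8, 3, 128, 128)    # visual crops around the touch site
touch = torch.randn(8, 3, 128, 128)  # paired tactile readings
v, t = model(img, touch)
loss = infonce_loss(v, t)
loss.backward()
```

In this setup, the learned visual features can later be transferred to downstream tasks such as material recognition, while the shared embedding space links what an object looks like to how it feels.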