The connection between visual input and tactile sensing is critical for object manipulation tasks such as grasping and pushing. In this work, we introduce the challenging task of estimating a set of tactile physical properties from visual information. We aim to build a model that learns the complex mapping between visual information and tactile physical properties. We construct a first-of-its-kind image-tactile dataset with over 400 multiview image sequences and the corresponding tactile properties. A total of fifteen tactile physical properties, spanning categories including friction, compliance, adhesion, texture, and thermal conductance, are measured and then estimated by our models. We develop a cross-modal framework comprising an adversarial objective and a novel visuo-tactile joint classification loss. Additionally, we develop a neural architecture search framework capable of selecting optimal combinations of viewing angles for estimating a given physical property.
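To make the objective concrete, below is a minimal sketch of how an adversarial alignment term and a visuo-tactile joint classification loss could be combined with a property regressor. This is not the authors' implementation: the network sizes, the feature dimensions, the number of material classes, the loss weighting, and the helper `mlp` are all illustrative assumptions.

```python
# Hedged sketch of a cross-modal objective: adversarial embedding alignment plus a
# joint visuo-tactile classification loss. All dimensions and weights are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_PROPERTIES = 15   # measured tactile properties (friction, compliance, adhesion, ...)
NUM_CLASSES = 20      # hypothetical number of surface/material classes
EMBED_DIM = 128

def mlp(in_dim, out_dim, hidden=256):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim))

class CrossModalModel(nn.Module):
    def __init__(self, visual_feat_dim=2048):
        super().__init__()
        self.visual_enc = mlp(visual_feat_dim, EMBED_DIM)    # image features -> shared space
        self.tactile_enc = mlp(NUM_PROPERTIES, EMBED_DIM)    # tactile vector -> shared space
        self.regressor = mlp(EMBED_DIM, NUM_PROPERTIES)      # shared space -> property estimates
        self.classifier = nn.Linear(EMBED_DIM, NUM_CLASSES)  # shared space -> material class
        self.discriminator = mlp(EMBED_DIM, 1, hidden=64)    # adversary: visual vs. tactile embedding

def generator_losses(model, visual_feats, tactile_props, class_labels):
    """Encoder-side losses for one batch (the discriminator is trained separately)."""
    z_v = model.visual_enc(visual_feats)
    z_t = model.tactile_enc(tactile_props)

    # Regression loss: estimate the tactile property vector from the visual embedding.
    reg_loss = F.mse_loss(model.regressor(z_v), tactile_props)

    # Joint classification loss: both modalities must predict the same class label,
    # encouraging a semantically aligned shared embedding.
    cls_loss = (F.cross_entropy(model.classifier(z_v), class_labels)
                + F.cross_entropy(model.classifier(z_t), class_labels))

    # Adversarial loss: the visual encoder tries to fool the discriminator into
    # labeling its embeddings as tactile (generator side of a GAN-style objective).
    d_v = model.discriminator(z_v)
    adv_loss = F.binary_cross_entropy_with_logits(d_v, torch.ones_like(d_v))

    return reg_loss + cls_loss + 0.1 * adv_loss  # 0.1 is an assumed weighting
```

In such a setup, the discriminator would be updated in an alternating step to distinguish visual from tactile embeddings, while the encoders, regressor, and classifier are updated with the combined loss above.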