Despite decades of research, understanding human manipulation activities remains one of the most attractive and challenging topics in computer vision and robotics. Recognition and prediction of observed human manipulation actions are rooted in applications such as human-robot interaction and robot learning from demonstration. The current research trend relies heavily on advanced convolutional neural networks that process structured Euclidean data, such as RGB camera images. These networks, however, incur immense computational cost to process such high-dimensional raw data. In contrast to related work, we introduce a deep graph autoencoder that jointly learns to recognize and predict manipulation tasks from symbolic scene graphs instead of structured Euclidean data. Our network has a variational autoencoder structure with two branches: one identifies the type of the input graph, and the other predicts future graphs. The input to the proposed network is a set of semantic graphs that store the spatial relations between the subjects and objects in the scene. The network output is a label set representing the detected and predicted class types. We benchmark our model against state-of-the-art methods on two datasets, MANIAC and MSRC-9, and show that it achieves better performance. We also release our source code at https://github.com/gamzeakyol/GNet.
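To make the two-branch idea concrete, the sketch below shows a minimal variational graph autoencoder with a shared encoder and two heads, one for action classification and one for predicting the node features of a future graph. This is an illustrative toy in plain PyTorch under assumed dimensions, not the released GNet implementation; all class and layer names here are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoBranchGraphVAE(nn.Module):
    """Illustrative sketch (not the authors' code): encode a scene graph
    (node features X, adjacency A) into a latent code z, then
    (a) classify the observed manipulation and (b) decode a future graph."""

    def __init__(self, in_dim, hid_dim, lat_dim, n_classes):
        super().__init__()
        self.enc = nn.Linear(in_dim, hid_dim)          # shared graph encoder layer
        self.mu = nn.Linear(hid_dim, lat_dim)          # variational mean
        self.logvar = nn.Linear(hid_dim, lat_dim)      # variational log-variance
        self.cls_head = nn.Linear(lat_dim, n_classes)  # branch 1: recognition
        self.dec = nn.Linear(hid_dim + lat_dim, in_dim)  # branch 2: future graph

    def forward(self, x, adj):
        # One round of neighborhood aggregation: H = ReLU(A X W)
        h = F.relu(self.enc(adj @ x))
        g = h.mean(dim=0)                              # graph-level mean pooling
        mu, logvar = self.mu(g), self.logvar(g)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        logits = self.cls_head(z)                      # predicted action class scores
        # Predict future node features from per-node states plus the latent code
        future = self.dec(torch.cat([h, z.expand(h.size(0), -1)], dim=-1))
        return logits, future

# Usage with a toy 5-node scene graph (4-dim node features, 6 action classes)
model = TwoBranchGraphVAE(in_dim=4, hid_dim=8, lat_dim=3, n_classes=6)
x, adj = torch.randn(5, 4), torch.eye(5)
logits, future = model(x, adj)
```

In a real system the linear encoder would be replaced by stacked graph-convolution layers, and training would combine a classification loss, a graph reconstruction/prediction loss, and the usual KL regularizer on z.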