Action recognition is a fundamental capability for humanoid robots to interact and cooperate with humans. This application requires an action recognition system designed so that new actions can be added easily, while unknown actions are identified and ignored. In recent years, deep-learning approaches have been the principal solution to the action recognition problem. However, most such models require a large dataset of manually labeled samples. In this work we target One-Shot deep-learning models, because they can operate with just a single instance per class. Unfortunately, One-Shot models assume that, at inference time, the action to recognize belongs to the support set, and they fail when it lies outside the support set. Few-Shot Open-Set Recognition (FSOSR) solutions attempt to address this flaw, but current solutions consider only static images rather than image sequences. Static images are insufficient to discriminate between actions such as sitting down and standing up. In this paper we propose a novel model that addresses the FSOSR problem by augmenting a One-Shot model with a discriminator that rejects unknown actions. This model is useful for applications in humanoid robotics, because it makes it easy to add new classes and to determine whether an input sequence is among those known to the system. We show how to train the whole model in an end-to-end fashion, and we perform quantitative and qualitative analyses. Finally, we provide real-world examples.
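The open-set behaviour described above (match a query against one example per known action, and reject it when nothing matches well) can be illustrated with a minimal sketch. This is not the proposed model: the paper's discriminator is learned and trained end-to-end, whereas the sketch substitutes a fixed cosine-similarity threshold. The function names, the `threshold` value, and the three-dimensional toy embeddings are all assumptions made for illustration; in a real system the embeddings would come from a learned encoder applied to an image sequence.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recognize(
    query: List[float],
    support: Dict[str, List[float]],
    threshold: float = 0.8,  # assumed value; stands in for a learned discriminator
) -> Tuple[Optional[str], float]:
    """Return (label, score); label is None when the query is rejected as unknown."""
    best_label, best_score = None, -1.0
    for label, embedding in support.items():
        score = cosine_similarity(query, embedding)
        if score > best_score:
            best_label, best_score = label, score
    # Open-set step: reject when even the best match is not similar enough.
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


# One hypothetical embedding per known action (the single "shot" per class).
support_set = {
    "sitting_down": [1.0, 0.0, 0.0],
    "standing_up": [0.0, 1.0, 0.0],
}

# Adding a new action only requires one more support entry, with no retraining.
support_set["waving"] = [0.0, 0.0, 1.0]

known = recognize([0.9, 0.1, 0.0], support_set)    # close to "sitting_down"
unknown = recognize([0.6, 0.6, 0.6], support_set)  # equally far from every class
```

In this toy setup the first query is accepted as `sitting_down`, while the second is rejected because its best similarity falls below the threshold. A fixed threshold is the simplest possible rejection rule; a learned discriminator, as proposed in the paper, can adapt the decision to the structure of the embedding space.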