Prediction of human actions in social interactions has important applications in the design of social robots or artificial avatars. In this paper, we model human interaction generation as a discrete multi-sequence generation problem and present SocialInteractionGAN, a novel adversarial architecture for conditional interaction generation. Our model builds on a recurrent encoder-decoder generator network and a dual-stream discriminator. This architecture allows the discriminator to jointly assess the realism of interactions and that of individual action sequences. Within each stream, a recurrent network operating on short subsequences endows the output signal with local assessments, better guiding the forthcoming generation. Crucially, contextual information on interacting participants is shared among agents and reinjected in both the generation and the discriminator evaluation processes. We show that the proposed SocialInteractionGAN succeeds in producing high-realism action sequences of interacting people, and that it compares favorably with a range of recurrent and convolutional discriminator baselines. Evaluations are conducted using modified Inception Score and Fréchet Inception Distance metrics that we specifically design for discrete sequential generated data. The distribution of generated sequences is shown to closely approach that of real data. In particular, our model properly learns the dynamics of interaction sequences while exploiting the full range of actions.
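For readers unfamiliar with the evaluation metrics named above, the following is a minimal sketch of the standard Inception Score, IS = exp(E_x[KL(p(y|x) || p(y))]), computed from classifier output probabilities. It is not the modified metric designed in the paper, whose details are not given in this abstract; the function name `inception_score`, the `eps` smoothing constant, and the assumption that some pretrained classifier supplies the per-sample probabilities are all illustrative.

```python
import numpy as np


def inception_score(probs, eps=1e-12):
    """Standard Inception Score (illustrative, not the paper's modified metric).

    probs: array of shape (n_samples, n_classes); row i holds the class
    probabilities p(y | x_i) that a classifier assigns to generated sample x_i.
    """
    probs = np.asarray(probs, dtype=float)
    # Marginal class distribution p(y), averaged over all generated samples.
    marginal = probs.mean(axis=0, keepdims=True)
    # KL(p(y | x_i) || p(y)) for each sample; eps avoids log(0).
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)
    # The score is the exponential of the mean KL divergence.
    return float(np.exp(kl.mean()))
```

Under this definition the score is 1 when every sample receives the same prediction, and it approaches the number of classes when predictions are confident and evenly spread across classes, which is why it is commonly read as a joint measure of sample quality and diversity.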