Creating realistic characters that can react to a user's or another character's movements can greatly benefit computer graphics, games and virtual reality. However, synthesizing such reactive motions in human-human interactions is challenging because two humans can interact in many different ways. While generative adversarial networks (GANs) have been adapted successfully to synthesize single-human actions, very few studies model human-human interactions. In this paper, we propose a semi-supervised GAN system that synthesizes the reactive motion of a character given the active motion of another character. Our key insights are two-fold. First, to effectively encode the complicated spatio-temporal information of a human motion, we equip the generator with a part-based long short-term memory (LSTM) module, such that the temporal movement of different limbs can be modelled effectively. We further include an attention module such that the temporal significance of the interaction can be learned, which enhances the temporal alignment of the active-reactive motion pair. Second, as the reactive motions of different types of interactions can differ significantly, we introduce a discriminator that not only tells whether the generated movement is realistic, but also predicts the class label of the interaction. This allows such labels to be used in supervising the training of the generator. We experiment with the SBU and the HHOI datasets. The high quality of the synthetic motion demonstrates the effective design of our generator, and the discriminability of the synthesis demonstrates the strength of our discriminator.
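To make the second insight concrete, the following is a minimal sketch of a per-sample loss for a discriminator with two outputs: a real/fake score and a set of interaction-class scores. The abstract does not specify the loss, so the two-head layout, the additive combination of the terms, and every name here (`discriminator_loss`, `real_logit`, `class_logits`) are assumptions for illustration, not the authors' formulation. Unlabelled samples contribute only the adversarial term, which is one common way to obtain a semi-supervised objective.

```python
import math


def sigmoid(x):
    """Logistic function mapping a raw score to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discriminator_loss(real_logit, class_logits, is_real, class_label=None):
    """Hypothetical two-term loss for a single motion sample.

    real_logit   -- scalar score from the assumed real/fake head
    class_logits -- list of scores from the assumed class head
    is_real      -- True if the motion comes from the dataset
    class_label  -- interaction class index, or None if unlabelled
    """
    eps = 1e-12  # guards against log(0)
    p_real = sigmoid(real_logit)
    target = 1.0 if is_real else 0.0

    # Adversarial term: binary cross-entropy on the real/fake decision.
    adversarial = -(target * math.log(p_real + eps)
                    + (1.0 - target) * math.log(1.0 - p_real + eps))

    # Classification term: cross-entropy on the interaction label,
    # applied only when a label is available.
    classification = 0.0
    if class_label is not None:
        classification = -math.log(softmax(class_logits)[class_label] + eps)

    return adversarial + classification
```

In a full system the same class term could also be evaluated on generated motions, so that the generator is penalised when its output is not recognisable as the intended interaction type; that reading follows the abstract's statement that labels supervise the generator, but the exact mechanism is an assumption.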