As a unique and promising biometric, video-based gait recognition has broad applications. The key step of this methodology is to learn the walking pattern of individuals, but it is often difficult to extract behavioral features directly from a sequence. Most existing methods focus on either the appearance or the motion pattern alone. To overcome these limitations, we propose a sequential convolutional network (SCN) from a novel perspective, in which spatiotemporal features are learned by a basic convolutional backbone. In SCN, behavioral information extractors (BIE) are constructed to comprehend intermediate feature maps as time series through motion templates, with which the relationship between frames can be analyzed, thereby distilling the information of the walking pattern. Furthermore, a multi-frame aggregator in SCN performs feature integration on a sequence whose length is not fixed, via a mobile 3D convolutional layer. To demonstrate its effectiveness, experiments were conducted on two popular public benchmarks, CASIA-B and OU-MVLP, and our approach demonstrates superior performance compared with state-of-the-art methods.
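The idea of integrating features over a sequence of variable length with a 3D convolution that moves along the time axis can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the aggregator's design, so the single-channel input, the "valid" padding, the stride of 1, the max-over-time reduction, and the names `conv3d_valid` and `aggregate` are all assumptions made for illustration.

```python
from typing import List

Volume = List[List[List[float]]]  # indexed as [t][y][x]


def conv3d_valid(frames: Volume, kernel: Volume) -> Volume:
    """Single-channel 3D cross-correlation, 'valid' padding, stride 1."""
    T, H, W = len(frames), len(frames[0]), len(frames[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    if T < kt or H < kh or W < kw:
        raise ValueError("input is smaller than the kernel")
    out: Volume = []
    for t in range(T - kt + 1):          # slide along time
        plane = []
        for y in range(H - kh + 1):      # slide along height
            row = []
            for x in range(W - kw + 1):  # slide along width
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += (frames[t + dt][y + dy][x + dx]
                                    * kernel[dt][dy][dx])
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def aggregate(frames: Volume, kernel: Volume) -> List[List[float]]:
    """Assumed aggregator: 3D convolution, then element-wise max over time.

    The output size depends only on the frame size and kernel size,
    not on the number of frames in the sequence.
    """
    responses = conv3d_valid(frames, kernel)
    h, w = len(responses[0]), len(responses[0][0])
    return [[max(r[y][x] for r in responses) for x in range(w)]
            for y in range(h)]


def constant_volume(t: int, h: int, w: int, value: float = 1.0) -> Volume:
    """Hypothetical helper that builds a t x h x w volume of one value."""
    return [[[value] * w for _ in range(h)] for _ in range(t)]


if __name__ == "__main__":
    kernel = constant_volume(3, 3, 3)          # 3x3x3 kernel of ones
    short_seq = constant_volume(5, 4, 4)       # 5 frames of 4x4
    long_seq = constant_volume(9, 4, 4)        # 9 frames of 4x4
    print(aggregate(short_seq, kernel))        # [[27.0, 27.0], [27.0, 27.0]]
    print(aggregate(long_seq, kernel))         # [[27.0, 27.0], [27.0, 27.0]]
```

Because the reduction over time follows the convolution, sequences of 5 and 9 frames both yield a 2x2 output here; in a real network the kernel weights would be learned and the input would carry multiple channels.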