Background modeling is a promising research area in video analysis with a variety of video surveillance applications. Recent years have witnessed the proliferation of deep neural networks as effective learning-based approaches to motion analysis. However, these techniques provide only a limited description of the observed scenes' properties, since a single-valued mapping is learned to approximate the temporal conditional averages of the target background. On the other hand, statistical learning in the imagery domain, notably Gaussian Mixture Models combined with a foreground extraction step, has become one of the most prevalent approaches owing to its high adaptability to dynamic context transformation. In this work, we propose a novel two-stage change detection method built on two convolutional neural networks. The first network is grounded in unsupervised statistical learning of Gaussian mixtures to describe the scenes' salient features. The second implements a lightweight foreground detection pipeline. Our two-stage framework contains approximately 3.5K parameters in total, yet still converges rapidly to intricate motion patterns. Experiments on publicly available datasets show that the proposed networks not only generalize to regions of moving objects in unseen scenes with promising results, but are also competitive in both efficiency and effectiveness for foreground segmentation.
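To make the two-stage design concrete, the following is a minimal PyTorch sketch of how such a pipeline could be wired together: a first small CNN that predicts per-pixel Gaussian-mixture parameters for the background, and a second lightweight CNN that fuses those statistics with the current frame into a foreground mask. The class names, channel widths, and the number of mixture components K are illustrative assumptions, and the sketch does not reproduce the paper's exact architecture or its ~3.5K-parameter budget.

```python
# Sketch of a two-stage background-modeling / foreground-detection pipeline.
# All names, widths, and K are assumptions, not the authors' configuration.
import torch
import torch.nn as nn


class GMMBackgroundNet(nn.Module):
    """Stage 1: predict per-pixel Gaussian-mixture parameters
    (weight, mean, log-variance for each of K components) from a grayscale frame."""

    def __init__(self, k: int = 3):
        super().__init__()
        self.k = k
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(8, 3 * k, kernel_size=3, padding=1),  # pi, mu, log-var per component
        )

    def forward(self, frame: torch.Tensor):
        out = self.features(frame)
        pi, mu, log_var = torch.chunk(out, 3, dim=1)
        pi = torch.softmax(pi, dim=1)   # mixture weights sum to 1 across components
        var = torch.exp(log_var)        # keep variances positive
        return pi, mu, var


class ForegroundNet(nn.Module):
    """Stage 2: a lightweight CNN mapping the current frame plus the estimated
    background statistics to a per-pixel foreground probability."""

    def __init__(self, k: int = 3):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(1 + 3 * k, 8, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(8, 1, kernel_size=1),
        )

    def forward(self, frame, pi, mu, var):
        x = torch.cat([frame, pi, mu, var], dim=1)
        return torch.sigmoid(self.head(x))  # foreground probability mask


if __name__ == "__main__":
    frame = torch.rand(1, 1, 240, 320)      # one grayscale frame (batch, channel, H, W)
    stage1, stage2 = GMMBackgroundNet(), ForegroundNet()
    pi, mu, var = stage1(frame)
    mask = stage2(frame, pi, mu, var)
    n_params = sum(p.numel() for p in list(stage1.parameters()) + list(stage2.parameters()))
    print(mask.shape, n_params)             # mask has the same spatial size as the input
```

The split mirrors the abstract's description: the first stage plays the role of the unsupervised GMM-style background model, while the second stage is a small supervised segmentation head, which keeps the total parameter count low.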