Even after decades of research, dynamic scene background reconstruction and foreground object segmentation are still considered open problems due to various challenges such as illumination changes, camera movements, or background noise caused by air turbulence or moving trees. In this paper, we propose to model the background of a frame sequence as a low-dimensional manifold using an autoencoder, and to compare the background reconstructed by this autoencoder with the original image to compute the foreground/background segmentation masks. The main novelty of the proposed model is that the autoencoder is also trained to predict the background noise, which makes it possible to compute a pixel-dependent threshold for each frame and use it to perform the foreground segmentation. Although the proposed model does not use any temporal or motion information, it outperforms the state of the art for unsupervised background subtraction on the CDnet 2014 and LASIESTA datasets, with a significant improvement on videos where the camera is moving. It is also able to perform background reconstruction on some non-video image datasets.
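The thresholding step described above can be illustrated with a minimal sketch. It assumes that the autoencoder has already produced, for a given grayscale frame, a reconstructed background and a per-pixel noise estimate. The function name `segment_foreground`, the parameters `scale` and `floor`, and the linear form of the threshold are hypothetical choices made for illustration only; the abstract does not specify how the threshold is derived from the predicted noise.

```python
def segment_foreground(frame, background, noise, scale=2.0, floor=0.0):
    """Return a binary mask (1 = foreground) for one grayscale frame.

    frame, background, noise: equally sized 2D lists of floats.
    scale, floor: hypothetical tuning parameters, not taken from the paper.
    """
    mask = []
    for frame_row, bg_row, noise_row in zip(frame, background, noise):
        mask_row = []
        for pixel, bg, sigma in zip(frame_row, bg_row, noise_row):
            # Assumed threshold form: linear in the predicted noise level.
            threshold = scale * sigma + floor
            mask_row.append(1 if abs(pixel - bg) > threshold else 0)
        mask.append(mask_row)
    return mask


# Toy 2x3 example; these arrays stand in for the autoencoder's outputs.
frame = [[0.50, 0.52, 0.90], [0.10, 0.40, 0.12]]
background = [[0.50, 0.50, 0.50], [0.10, 0.10, 0.10]]
noise = [[0.01, 0.05, 0.05], [0.01, 0.20, 0.005]]
mask = segment_foreground(frame, background, noise)
```

In this toy input, a deviation of 0.3 at a pixel with high predicted noise stays in the background, while a deviation of 0.02 at a pixel with very low predicted noise is marked as foreground. A single global threshold could not produce both decisions, which is the motivation for making the threshold pixel-dependent.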