We consider the reconstruction problem of video compressive sensing (VCS) under the deep unfolding (unrolling) framework, with the aim of building a flexible and concise model using a minimum number of stages. Unlike existing deep unfolding networks for inverse problems, which stack more stages for higher performance at the cost of flexibility to different masks and scales, we show that a 2-stage deep unfolding network can achieve state-of-the-art (SOTA) results in VCS, with a 1.7 dB PSNR gain over the single-stage model RevSCI. Thanks to the advantages of deep unfolding, the proposed method adapts to new masks and scales to larger data without any additional training. Furthermore, we extend the proposed model to color VCS to perform joint reconstruction and demosaicing. Experimental results demonstrate that our 2-stage model also achieves SOTA performance on color VCS reconstruction, with a >2.3 dB PSNR gain over the previous SOTA algorithm based on the plug-and-play framework, while speeding up reconstruction by more than 17 times. In addition, our network remains flexible to the mask modulation and the spatial scale in color VCS reconstruction, so that a single trained network can be applied to different hardware systems. The code and models will be released to the public.
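For context, a minimal sketch of the standard VCS (snapshot compressive imaging) forward model and a generic two-stage unfolded update is given below. The symbols $B$, $\mathbf{C}_b$, and the learned denoisers $\mathcal{D}_{\theta_k}$ are illustrative assumptions, since the abstract does not specify the exact stage design. A snapshot measurement $\mathbf{y}$ compresses $B$ video frames $\{\mathbf{x}_b\}$ modulated by binary masks $\{\mathbf{C}_b\}$:

\[
\mathbf{y} \;=\; \sum_{b=1}^{B} \mathbf{C}_b \odot \mathbf{x}_b + \mathbf{n}
\;=\; \mathbf{\Phi}\,\mathrm{vec}(\mathbf{x}) + \mathbf{n},
\]

where $\odot$ denotes element-wise multiplication, $\mathbf{n}$ is measurement noise, and $\mathbf{\Phi}$ is the equivalent sensing matrix. A generic unfolded stage then alternates a linear projection onto the measurement constraint with a learned prior step:

\[
\hat{\mathbf{x}}^{(k)} \;=\; \mathbf{x}^{(k-1)} + \mathbf{\Phi}^{\top}\!\left(\mathbf{\Phi}\mathbf{\Phi}^{\top}\right)^{-1}\!\left(\mathbf{y} - \mathbf{\Phi}\mathbf{x}^{(k-1)}\right),
\qquad
\mathbf{x}^{(k)} \;=\; \mathcal{D}_{\theta_k}\!\left(\hat{\mathbf{x}}^{(k)}\right),
\qquad k = 1, 2.
\]

Note that in snapshot compressive imaging $\mathbf{\Phi}\mathbf{\Phi}^{\top}$ is diagonal, so the projection reduces to a few element-wise operations; the masks enter only through $\mathbf{\Phi}$ and involve no learned weights tied to a specific mask pattern or spatial size, which is one standard explanation for the mask- and scale-flexibility claimed above.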