We address the challenging problem of jointly inferring the 3D flow and volumetric densities of a moving fluid from a monocular input video using a deep neural network. Despite the complexity of this task, we show that the corresponding networks can be trained without any 3D ground truth. In the absence of such ground truth, our model can be trained on observations from real-world capture setups instead of relying on synthetic reconstructions. We make this unsupervised training possible by first generating an initial prototype volume, which is then advected and transported over time without volumetric supervision. Our approach relies purely on image-based losses, an adversarial discriminator network, and regularization. Our method estimates long-term sequences in a stable manner while closely matching targets such as rising smoke plumes.
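The core idea, transporting a density volume with an estimated velocity field and supervising only through rendered 2D images, can be sketched in NumPy. Everything here is an illustrative assumption rather than the paper's implementation: the nearest-neighbor advection stands in for a proper trilinear semi-Lagrangian step, and the axis-sum renderer stands in for a differentiable volume renderer.

```python
import numpy as np

def advect(density, velocity, dt=1.0):
    """Semi-Lagrangian advection: trace each cell back along the
    velocity field and sample the density there (nearest neighbor
    for brevity; a real method would interpolate trilinearly)."""
    D, H, W = density.shape
    grid = np.stack(np.meshgrid(np.arange(D), np.arange(H),
                                np.arange(W), indexing="ij"), axis=-1)
    back = grid - dt * velocity                      # backtraced positions
    idx = np.clip(np.round(back).astype(int), 0,
                  np.array(density.shape) - 1)       # clamp to the grid
    return density[idx[..., 0], idx[..., 1], idx[..., 2]]

def render(density):
    """Toy renderer: integrate density along the depth axis to get a
    2D image (stand-in for a differentiable volume renderer)."""
    return density.sum(axis=0)

def image_loss(density, target_frame):
    """Image-space L2 loss: the only supervision signal available,
    since no 3D ground truth exists."""
    return np.mean((render(density) - target_frame) ** 2)

# Tiny example: a prototype volume transported one step, then
# compared against an observed 2D frame.
proto = np.zeros((8, 8, 8)); proto[3:5, 3:5, 3:5] = 1.0
vel = np.zeros((8, 8, 8, 3)); vel[..., 2] = 1.0      # uniform drift
moved = advect(proto, vel)
loss = image_loss(moved, render(proto))              # nonzero: blob shifted
```

In the actual method the velocity field would come from a trained network, and the image loss (plus adversarial and regularization terms) would be backpropagated through a differentiable renderer to update it; this sketch only illustrates the advect-then-compare-in-image-space structure.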