In this work we introduce a fully connected graph structure in the Deep Gaussian Conditional Random Field (G-CRF) model. To do so, we express the pairwise interactions between pixels as inner products of low-dimensional embeddings, delivered by a new subnetwork of a deep architecture. We efficiently minimize the resulting energy by solving the associated low-rank linear system with conjugate gradients, and we derive an analytic expression for the gradient with respect to our embeddings, which allows us to train them end-to-end with backpropagation. We demonstrate the merit of our approach by achieving state-of-the-art results on benchmarks for three challenging computer vision tasks: semantic segmentation, human part segmentation, and saliency estimation. Our implementation is fully GPU based, built on top of the Caffe library, and will be made publicly available.
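
To make the low-rank solve concrete, here is a minimal NumPy sketch, not the Caffe/GPU implementation described above. It assumes a simplified single-channel energy E(x) = ½ xᵀ(λI + ΦΦᵀ)x − bᵀx, where Φ is an N×D matrix of pixel embeddings, b collects the unary scores, and λ > 0 keeps the system positive definite. This exact form, and the names `phi`, `lam`, `low_rank_matvec`, and `conjugate_gradient`, are illustrative assumptions rather than the paper's formulation. The point is that conjugate gradients only needs matrix-vector products, so the N×N pairwise matrix never has to be formed.

```python
import numpy as np


def low_rank_matvec(phi, lam, v):
    """Compute (lam * I + phi @ phi.T) @ v in O(N * D) time.

    The N x N pairwise matrix phi @ phi.T is never materialized.
    """
    return lam * v + phi @ (phi.T @ v)


def conjugate_gradient(phi, lam, b, tol=1e-10, max_iter=200):
    """Solve (lam * I + phi @ phi.T) x = b with plain conjugate gradients."""
    x = np.zeros_like(b)
    r = b - low_rank_matvec(phi, lam, x)  # initial residual
    p = r.copy()                          # initial search direction
    rs_old = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs_old) < tol:         # residual norm small enough
            break
        q = low_rank_matvec(phi, lam, p)
        alpha = rs_old / (p @ q)          # step length along p
        x += alpha * p
        r -= alpha * q
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p     # next conjugate direction
        rs_old = rs_new
    return x


# Toy problem: 500 "pixels" with 8-dimensional embeddings (hypothetical sizes).
rng = np.random.default_rng(0)
n_pixels, embed_dim = 500, 8
phi = rng.standard_normal((n_pixels, embed_dim))
b = rng.standard_normal(n_pixels)
lam = 1.0

x = conjugate_gradient(phi, lam, b)

# Check against the dense system, which is only affordable at toy scale.
dense = lam * np.eye(n_pixels) + phi @ phi.T
residual = np.linalg.norm(dense @ x - b)
```

Under this assumed form the system matrix has at most D + 1 distinct eigenvalues, so in exact arithmetic conjugate gradients terminates in at most D + 1 iterations. That is one way to see why a low-rank pairwise term keeps inference cheap even when every pixel is connected to every other.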