Compositional Scene Representation Learning via Reconstruction: A Survey

Visual scene representation learning is an important research problem in computer vision. Performance on vision tasks can improve if more suitable representations are learned for visual scenes. Complex visual scenes are compositions of relatively simple visual concepts and are subject to combinatorial explosion. Compared with directly representing the entire visual scene, extracting compositional scene representations can better cope with the diverse combinations of backgrounds and objects. Because compositional scene representations abstract the concept of objects, visual scene analysis and understanding based on these representations could be easier and more interpretable. Moreover, learning compositional scene representations via reconstruction can greatly reduce the need for training data annotations. Therefore, compositional scene representation learning via reconstruction has important research significance. In this survey, we first discuss representative methods that learn from either a single viewpoint or multiple viewpoints without object-level supervision, then the applications of compositional scene representations, and finally future directions on this topic.



Representation learning


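Representation learning uses training data to learn vector representations, which overcomes the limitations of hand-crafted approaches. It is usually divided into two broad categories: unsupervised and supervised representation learning. Most unsupervised methods use the latent variables of an autoencoder (such as a denoising autoencoder or a sparse autoencoder) as the representation. The more recent variational autoencoders tolerate noise and outliers better. However, exactly inferring the latent structure of the given data is nearly impossible, so several approximate inference strategies are used instead. In addition, some unsupervised methods aim to approximate a specific similarity measure. One unsupervised similarity-preserving framework uses matrix factorization to preserve pairwise dynamic time warping (DTW) similarities. Another approach learns DTW-preserving shapelets, so that the Euclidean distance in the transformed space approximates the true DTW distance between the original data. Supervised representation learning methods can exploit label information to better capture the semantic structure of the data. Siamese networks and triplet networks are two currently popular models; their objective is to maximize the distance between classes and minimize the distance within each class.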
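The triplet objective mentioned above (pull same-class samples together, push different-class samples apart) can be illustrated with a minimal sketch. It assumes the embeddings have already been computed by some network; the function names `euclidean` and `triplet_loss`, the margin value, and the toy vectors are hypothetical choices for illustration, not part of any particular method cited here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on precomputed embeddings.

    The loss is zero once the negative (different class) is at least
    `margin` farther from the anchor than the positive (same class).
    The margin of 1.0 is an assumed default for this sketch.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings (hypothetical values).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]        # same class, distance 1 from the anchor
far_negative = [3.0, 4.0]    # different class, distance 5 from the anchor
near_negative = [1.0, 0.0]   # different class, distance 1 from the anchor

loss_far = triplet_loss(anchor, positive, far_negative)    # constraint satisfied
loss_near = triplet_loss(anchor, positive, near_negative)  # constraint violated
```

In this sketch the far negative already satisfies the margin, so it contributes no loss, while the near negative is penalized. During training, the gradient of that penalty is what moves same-class embeddings closer and different-class embeddings apart.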
