Recent approaches in computer vision rely on deep learning (DL) methods because they perform well when the training and testing domains follow the same underlying data distribution. However, it has been shown that minor variations in the images that occur when these methods are used in the real world can lead to unpredictable errors. Transfer learning is the area of machine learning that tries to prevent these errors. In particular, approaches that augment image data using auxiliary knowledge encoded in language embeddings or knowledge graphs (KGs) have achieved promising results in recent years. This survey focuses on visual transfer learning approaches using KGs. KGs can represent auxiliary knowledge either in an underlying graph-structured schema or in a vector-based knowledge graph embedding. To enable the reader to solve visual transfer learning problems with the help of specific KG-DL configurations, we start with a description of relevant KG modeling structures of varying expressiveness, such as directed labeled graphs, hypergraphs, and hyper-relational graphs. We explain the notion of a feature extractor, referring specifically to visual and semantic features. We provide a broad overview of knowledge graph embedding methods and describe several joint training objectives suitable for combining them with high-dimensional visual embeddings. The main section introduces four categories of how a KG can be combined with a DL pipeline: 1) Knowledge Graph as a Reviewer; 2) Knowledge Graph as a Trainee; 3) Knowledge Graph as a Trainer; and 4) Knowledge Graph as a Peer. To help researchers find evaluation benchmarks, we provide an overview of generic KGs and a set of image processing datasets and benchmarks that include various types of auxiliary knowledge. Finally, we summarize related surveys and give an outlook on challenges and open issues for future research.