Artificial intelligence for graphs (graph AI) has achieved remarkable success in modeling complex systems, ranging from dynamic networks in biology to interacting particle systems in physics. However, increasingly heterogeneous graph datasets call for multimodal methods that can combine different inductive biases: the sets of assumptions that algorithms use to make predictions for inputs they have not encountered during training. Learning on multimodal graph datasets presents fundamental challenges because the inductive biases can vary by data modality and graphs might not be explicitly given in the input. To address these challenges, multimodal graph AI methods combine different modalities while leveraging cross-modal dependencies. Here, we survey 145 studies in graph AI and find that diverse datasets are increasingly combined using graphs and fed into sophisticated multimodal methods, which we categorize as image-intensive, knowledge-grounded and language-intensive models. Using this categorization, we introduce a blueprint for multimodal graph AI that can be used to study existing methods and guide the design of future ones.
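To make the core idea concrete, the following is a minimal sketch, not the survey's blueprint or any specific method it reviews, of a multimodal graph layer that encodes two modalities (here, hypothetical image and text node features) with modality-specific encoders, fuses them to capture cross-modal dependencies, and propagates the fused representations over a shared graph. All class names, dimensions and the fusion scheme are illustrative assumptions.

```python
# Illustrative sketch of a multimodal graph layer (assumed design, not from
# the survey): per-modality encoders -> cross-modal fusion -> message passing.
import torch
import torch.nn as nn


class MultimodalGraphLayer(nn.Module):
    def __init__(self, img_dim: int, txt_dim: int, hidden_dim: int):
        super().__init__()
        # Separate encoders let each modality keep its own inductive bias.
        self.img_proj = nn.Linear(img_dim, hidden_dim)
        self.txt_proj = nn.Linear(txt_dim, hidden_dim)
        # Fusion layer models cross-modal dependencies between node features.
        self.fuse = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, adj: torch.Tensor, x_img: torch.Tensor,
                x_txt: torch.Tensor) -> torch.Tensor:
        # Fuse modality-specific node embeddings into one representation.
        h = torch.relu(self.fuse(torch.cat(
            [self.img_proj(x_img), self.txt_proj(x_txt)], dim=-1)))
        # One round of mean-style message passing over the shared graph:
        # row-normalize the adjacency by degree, then aggregate neighbors.
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        return (adj / deg) @ h


# Toy usage: 4 nodes on a ring graph with random modality features.
adj = torch.tensor([[0., 1., 0., 1.],
                    [1., 0., 1., 0.],
                    [0., 1., 0., 1.],
                    [1., 0., 1., 0.]])
layer = MultimodalGraphLayer(img_dim=8, txt_dim=5, hidden_dim=16)
out = layer(adj, torch.randn(4, 8), torch.randn(4, 5))
print(out.shape)  # torch.Size([4, 16])
```

Fusing before propagation is only one design choice; methods surveyed here may instead propagate each modality separately and fuse afterwards, or infer the graph itself when it is not explicitly given in the input.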