Artificial intelligence on graphs (graph AI) has achieved remarkable success in modeling complex systems, ranging from dynamical systems in biology to interacting particle systems in physics. Increasingly heterogeneous graph datasets call for multimodal graph AI algorithms that combine multiple inductive biases -- the sets of assumptions an algorithm uses to predict outputs for inputs it has not yet encountered. Learning on multimodal graph datasets presents fundamental challenges because inductive biases can vary by data modality and because graphs might not be explicitly given in the input. To address these challenges, multimodal graph AI methods combine multiple modalities while leveraging cross-modal dependencies. Here, we survey 142 studies in graph AI and find that diverse datasets are increasingly combined using graphs and fed into sophisticated multimodal models. These models stratify into image-, language-, and knowledge-grounded multimodal graph AI methods. Using this categorization of state-of-the-art methods, we put forward an algorithmic blueprint for multimodal graph AI, which we use to analyze existing methods and standardize the design of future methods for highly complex systems.
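The general pattern shared by the methods surveyed here can be sketched in a few lines: per-modality encoders produce node features, the features are fused into a single representation, and the fused representations are propagated over the graph. The sketch below is illustrative only and is not drawn from any specific surveyed method; all function names, the concatenation-based fusion, and the mean-aggregation update are simplifying assumptions.

```python
# Illustrative sketch of the multimodal graph learning pattern:
# (1) encode each modality, (2) fuse per node, (3) message-pass over edges.
# All names and choices here are hypothetical simplifications.

def fuse(image_feat, text_feat):
    # Fusion by concatenation; real methods typically learn cross-modal weights.
    return image_feat + text_feat  # list concatenation

def message_pass(features, edges):
    # One round of mean aggregation over neighbors (a basic GNN-style step),
    # followed by an additive update of each node's own features.
    out = {}
    for node, feat in features.items():
        neigh = [features[dst] for src, dst in edges if src == node]
        if not neigh:
            out[node] = feat
            continue
        agg = [sum(vals) / len(neigh) for vals in zip(*neigh)]
        out[node] = [f + a for f, a in zip(feat, agg)]
    return out

# Toy graph: two nodes, each with an image-derived and a text-derived embedding.
features = {
    "a": fuse([1.0, 0.0], [0.5]),
    "b": fuse([0.0, 1.0], [0.5]),
}
edges = [("a", "b"), ("b", "a")]
updated = message_pass(features, edges)
```

In a trained model, the fusion and aggregation steps would be parameterized and learned end-to-end; the sketch only makes the data flow concrete.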