With advances in imaging, sequencing, and profiling technologies, multi-omics data are becoming increasingly available and hold promise for many healthcare applications, such as cancer diagnosis and treatment. Multimodal learning for integrative multi-omics analysis can help researchers and practitioners gain deep insights into human diseases and improve clinical decisions. However, several challenges hinder progress in this area, including the limited availability of easily accessible open-source tools. This survey provides an up-to-date overview of the data challenges, fusion approaches, datasets, and software tools from several new perspectives. We identify and investigate various omics data challenges to better understand the field. We categorize fusion approaches comprehensively to cover existing methods in this area. We collect existing open-source tools to facilitate their broader use and development. We explore a broad range of omics data modalities and compile a list of accessible datasets. Finally, we summarize future directions that could address existing gaps and answer the pressing need to advance multimodal learning for multi-omics data analysis.