Empirically observed time series in physics, biology, or medicine are commonly generated by some underlying dynamical system (DS) which is the target of scientific interest. There is increasing interest in harnessing machine learning methods to reconstruct this latent DS in a completely data-driven, unsupervised way. In many areas of science it is common to sample time series observations from many data modalities simultaneously, e.g., electrophysiological and behavioral time series in a typical neuroscience experiment. However, current machine learning tools for reconstructing DSs usually focus on just one data modality. Here we propose a general framework for multi-modal data integration for the purpose of nonlinear DS identification and cross-modal prediction. This framework is based on dynamically interpretable recurrent neural networks as general approximators of nonlinear DSs, coupled to sets of modality-specific decoder models from the class of generalized linear models. Both an expectation-maximization and a variational inference algorithm for model training are advanced and compared. We show on nonlinear DS benchmarks that our algorithms can efficiently compensate for noisy or missing information in one data channel by exploiting other channels, and demonstrate on experimental neuroscience data how the algorithm learns to link different data domains to the underlying dynamics.
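To make the architectural idea concrete, the following is a minimal generative sketch of the kind of model the abstract describes: a piecewise-linear latent RNN approximating the dynamics, with modality-specific generalized linear decoders (here a Gaussian channel with identity link and a Poisson count channel with log link). All dimensions, parameter values, and variable names are illustrative assumptions, not the authors' actual implementation or training code.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Hypothetical dimensions (illustrative only) ---
T, dz = 200, 5                 # time steps, latent dimension
dx_gauss, dx_count = 10, 8     # continuous channel dim, count channel dim

# Latent piecewise-linear RNN parameters (random values for illustration)
A = np.diag(rng.uniform(0.5, 0.9, dz))     # diagonal auto-regression weights
W = 0.1 * rng.standard_normal((dz, dz))    # off-diagonal connection weights
np.fill_diagonal(W, 0.0)
h = 0.1 * rng.standard_normal(dz)          # bias term

# Modality-specific GLM decoder parameters
B_gauss = rng.standard_normal((dx_gauss, dz))  # Gaussian channel, identity link
B_count = rng.standard_normal((dx_count, dz))  # Poisson channel, log link

# Simulate a latent trajectory and both observation channels
z = np.zeros((T, dz))
x_gauss = np.zeros((T, dx_gauss))
x_count = np.zeros((T, dx_count), dtype=int)
for t in range(1, T):
    # latent dynamics: linear part + ReLU nonlinearity + process noise
    z[t] = (A @ z[t - 1] + W @ np.maximum(z[t - 1], 0.0) + h
            + 0.01 * rng.standard_normal(dz))
    # continuous observations (e.g., electrophysiology-like signals)
    x_gauss[t] = B_gauss @ z[t] + 0.1 * rng.standard_normal(dx_gauss)
    # count observations (e.g., discrete behavioral events), rate via log link
    x_count[t] = rng.poisson(np.exp(np.clip(B_count @ z[t], -10.0, 5.0)))
```

In the actual framework the latent parameters and decoder weights would be fitted jointly from the multi-modal observations (via EM or variational inference, as the abstract states); this sketch only illustrates how one shared latent DS can drive several observation channels with different noise models.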