Identifying systems with high-dimensional inputs and outputs, such as systems measured by video streams, is a challenging problem with numerous applications in robotics, autonomous vehicles, and medical imaging. In this paper, we propose a novel non-linear state-space identification method that works directly from high-dimensional input and output data. The method combines several computational and conceptual advances to handle the high dimensionality of the data. An encoder function, represented by a neural network, is introduced to learn a reconstructability map that estimates the model states from past inputs and outputs. This encoder function is learned jointly with the dynamics. Furthermore, several computational improvements, such as an improved reformulation of multiple shooting and batch optimization, are proposed to keep the computation time manageable for large, high-dimensional datasets. We apply the proposed method to a video stream of a simulated environment of a controllable ball in a unit box. The simulation study shows that the model obtained with the proposed method achieves low simulation error and excellent long-term prediction.
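To make the encoder idea concrete, the following is a minimal sketch of how a loss over batched, short simulation subsections could be evaluated, with each subsection's initial state estimated by an encoder from past inputs and outputs. It is an illustration under stated assumptions, not the implementation used in the paper: the dimensions, the single-layer tanh maps standing in for the neural networks, the names `encoder`, `f`, `h`, `subsection_loss`, and `batch_loss`, and the random placeholder data are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (assumptions, not from the paper): state, input and
# output dimensions (e.g. a flattened 8x8 frame), encoder lag, and the
# length T of each simulated subsection.
nx, nu, ny, n_lag, T = 4, 2, 64, 3, 10

# Single-layer maps stand in for the neural networks (assumption).
W_enc = 0.1 * rng.standard_normal((nx, n_lag * (nu + ny)))
W_f = 0.1 * rng.standard_normal((nx, nx + nu))
W_h = 0.1 * rng.standard_normal((ny, nx))

def encoder(u_past, y_past):
    """Estimate the initial state from the last n_lag inputs and outputs."""
    z = np.concatenate([u_past.ravel(), y_past.ravel()])
    return np.tanh(W_enc @ z)

def f(x, u):
    """State transition x[k+1] = f(x[k], u[k])."""
    return np.tanh(W_f @ np.concatenate([x, u]))

def h(x):
    """Output map y[k] = h(x[k])."""
    return W_h @ x

def subsection_loss(u, y, t0):
    """Mean squared simulation error over T steps starting at index t0."""
    x = encoder(u[t0 - n_lag:t0], y[t0 - n_lag:t0])
    err = 0.0
    for k in range(T):
        err += np.mean((h(x) - y[t0 + k]) ** 2)
        x = f(x, u[t0 + k])
    return err / T

def batch_loss(u, y, batch_size):
    """Average the subsection loss over randomly drawn start indices."""
    starts = rng.integers(n_lag, len(u) - T + 1, size=batch_size)
    return np.mean([subsection_loss(u, y, t0) for t0 in starts])

# Random placeholder data, only to show that the shapes fit together.
N = 200
u_data = rng.standard_normal((N, nu))
y_data = rng.standard_normal((N, ny))
loss = batch_loss(u_data, y_data, batch_size=16)
```

In a training loop, the parameters of `encoder`, `f`, and `h` would be updated together by a gradient-based optimizer on this batch loss, which is one way to read "jointly learned with the dynamics". Because each subsection starts from an encoder estimate rather than from a stored state, subsections can be drawn independently and evaluated in batches.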