When modeling dynamical systems from real-world data samples, the distribution of the data often changes according to the environment in which they are captured, and the dynamics of the system itself vary from one environment to another. Generalizing across environments thus challenges conventional frameworks. The classical settings suggest either treating the data as i.i.d. and learning a single model to cover all situations, or learning environment-specific models. Both are sub-optimal: the former disregards the discrepancies between environments, leading to biased solutions, while the latter does not exploit their potential commonalities and is prone to data-scarcity problems. We propose LEADS, a novel framework that leverages the commonalities and discrepancies among known environments to improve model generalization. This is achieved with a tailored training formulation that captures common dynamics within a shared model while additional terms capture environment-specific dynamics. We ground our approach in theory, exhibiting a decrease in sample complexity, and corroborate these results empirically by instantiating the framework for linear dynamics. Moreover, we concretize this framework for neural networks and evaluate it experimentally on representative families of nonlinear dynamics. We show that this new setting can exploit knowledge extracted from environment-dependent data and improves generalization for both known and novel environments.
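The core decomposition described above can be sketched numerically. Below is a minimal, hypothetical illustration (not the paper's implementation) in which the dynamics for each environment e are modeled as a shared term plus an environment-specific residual, f_e = f + g_e, using linear dynamics for concreteness; all names, shapes, and magnitudes are assumptions for illustration only.

```python
import numpy as np

# Hypothetical sketch of the shared + environment-specific decomposition:
# dx/dt in environment e is modeled as f(x) + g_e(x). Linear maps stand in
# for the learned models; the small scale of G_env mimics the idea that
# environment-specific terms should stay small relative to the shared part.

rng = np.random.default_rng(0)
d = 3        # state dimension (assumed)
n_envs = 2   # number of known environments (assumed)

F_shared = rng.normal(size=(d, d))                               # shared dynamics f
G_env = [0.1 * rng.normal(size=(d, d)) for _ in range(n_envs)]   # per-environment g_e

def dynamics(x, env):
    """Vector field for environment `env`: shared part plus env-specific part."""
    return F_shared @ x + G_env[env] @ x

x0 = rng.normal(size=d)
dx_env0 = dynamics(x0, 0)
dx_env1 = dynamics(x0, 1)

# The shared term is common to all environments, so predictions differ
# only through the environment-specific terms.
print(np.allclose(dx_env0 - dx_env1, (G_env[0] - G_env[1]) @ x0))
```

In a learned setting, F_shared would be fit on data pooled across environments while each G_env is fit only on its own environment's data, which is what lets the shared component benefit from all samples.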