Data-driven particle dynamics: Structure-preserving coarse-graining for emergent behavior in non-equilibrium systems

Multiscale systems are ubiquitous in science and technology but are notoriously challenging to simulate, because short spatiotemporal scales must be appropriately linked to emergent bulk physics. When expensive high-dimensional dynamical systems are coarse-grained into low-dimensional models, the entropic loss of information leads to emergent physics that is dissipative, history-dependent, and stochastic. To learn coarse-grained dynamics from time-series observations of particle trajectories, we propose a machine learning framework based on the metriplectic bracket formalism that preserves these properties by construction. Most notably, the framework guarantees discrete notions of the first and second laws of thermodynamics, conservation of momentum, and a discrete fluctuation-dissipation balance crucial for capturing non-equilibrium statistics. We introduce the mathematical framework abstractly before specializing to a particle discretization. Because labels are generally unavailable for entropic state variables, we introduce a novel self-supervised learning strategy to identify emergent structural variables. We validate the method on benchmark systems and demonstrate its utility on two demanding examples: (1) coarse-graining star polymers at challenging levels of coarse-graining while preserving non-equilibrium statistics, and (2) learning models from high-speed video of colloidal suspensions that capture the coupling between local rearrangement events and emergent stochastic dynamics. We provide open-source implementations in both PyTorch and LAMMPS, enabling large-scale inference and extensibility to diverse particle-based systems.
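
To make the phrase "by construction" concrete, the following is a minimal sketch of the generic metriplectic structure on a toy problem, not the paper's learned particle model. It assumes the standard form dx/dt = L grad(E) + M grad(S), with L antisymmetric, M symmetric positive semidefinite, and the degeneracy conditions L grad(S) = 0 and M grad(E) = 0. The system (a damped oscillator with one entropy variable), the parameter values, and all function names are hypothetical choices made for illustration; the stochastic term required for fluctuation-dissipation balance is omitted.

```python
import numpy as np

# Hypothetical toy parameters (illustrative only, not taken from the paper).
MASS, STIFFNESS, TEMPERATURE, FRICTION = 1.0, 1.0, 1.0, 0.5

# Antisymmetric Poisson matrix acting on the state x = (q, p, s).
# Its third row and column are zero, so POISSON @ grad_entropy(x) = 0.
POISSON = np.array([[0.0, 1.0, 0.0],
                    [-1.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0]])


def grad_energy(x):
    """Gradient of E(q, p, s) = p^2 / (2 m) + k q^2 / 2 + T s."""
    q, p, _ = x
    return np.array([STIFFNESS * q, p / MASS, TEMPERATURE])


def grad_entropy(x):
    """Gradient of S(q, p, s) = s."""
    return np.array([0.0, 0.0, 1.0])


def friction_matrix(x):
    """Rank-one M = (gamma / T) v v^T, with v chosen orthogonal to grad_energy.

    Symmetric and positive semidefinite by construction, and
    M @ grad_energy(x) = 0 because v . grad_energy(x) = 0.
    """
    _, p, _ = x
    v = np.array([0.0, TEMPERATURE, -p / MASS])
    return (FRICTION / TEMPERATURE) * np.outer(v, v)


def rhs(x):
    """Deterministic metriplectic vector field: reversible plus irreversible part."""
    return POISSON @ grad_energy(x) + friction_matrix(x) @ grad_entropy(x)


def energy_rate(x):
    """dE/dt along the flow; zero up to round-off (first law)."""
    return float(grad_energy(x) @ rhs(x))


def entropy_rate(x):
    """dS/dt along the flow; non-negative (second law)."""
    return float(grad_entropy(x) @ rhs(x))
```

In this sketch the momentum equation reduces to the familiar damped oscillator, dp/dt = -k q - gamma p / m, while the dissipated kinetic energy reappears as entropy production, ds/dt = gamma (p / m)^2 / T. Both laws hold for any state because of the algebraic structure of L and M, not because of any tuning, which is the sense in which a structure-preserving parameterization provides guarantees.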
