Recurrent neural network language models such as long short-term memory (LSTM) networks have been used to model and predict the long-term dynamics of complex stochastic molecular systems. Recent successful examples of learning slow dynamics with LSTMs used simulation data projected onto low-dimensional reaction coordinates. In this report, however, we show that three key factors significantly affect the performance of such language-model learning: the dimensionality of the reaction coordinates, the temporal resolution, and the state partition. When applying recurrent neural networks to high-dimensional molecular dynamics simulation trajectories, we find that rare events corresponding to the slow dynamics can be obscured by the faster dynamics of the system and cannot be learned efficiently. Under such conditions, we find that coarse-graining the conformational space into metastable states and removing recrossing events when estimating transition probabilities between states greatly improve the accuracy of learning slow dynamics from molecular dynamics data. We also explore other models, such as the Transformer, which does not show superior performance to the LSTM in overcoming these issues. Therefore, to learn rare events of slow molecular dynamics with LSTMs or Transformers, it is critical to choose a proper temporal resolution (i.e., the saving interval of the MD simulation trajectories) and state partition for high-resolution data, since deep neural network models may not automatically disentangle slow dynamics from fast dynamics when both are present in the data and influence each other.
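The coarse-graining and recrossing-removal step described above can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes a one-dimensional projected trajectory and hypothetical "core set" boundaries for each metastable state. Frames outside all cores keep their previous state label, so brief boundary recrossings do not register as transitions, and a row-normalized transition matrix is then estimated at a chosen lag time:

```python
import numpy as np

def core_set_assign(x, cores):
    """Assign frames to metastable states via core sets: a frame keeps
    the previous state's label until the trajectory fully enters another
    state's core region, filtering out fast recrossings at boundaries.
    `cores` is a list of (lo, hi) intervals, one per state (assumed here
    for illustration)."""
    labels = np.full(len(x), -1, dtype=int)
    current = -1  # -1 marks frames before the first core is entered
    for t, xt in enumerate(x):
        for s, (lo, hi) in enumerate(cores):
            if lo <= xt <= hi:
                current = s
                break
        labels[t] = current
    return labels

def transition_matrix(labels, n_states, lag=1):
    """Row-normalized transition count matrix at the given lag time."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(labels[:-lag], labels[lag:]):
        if a >= 0 and b >= 0:  # skip unassigned leading frames
            counts[a, b] += 1
    rows = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, rows,
                     out=np.zeros_like(counts), where=rows > 0)

# Toy double-well trajectory: the brief excursion to -0.1 is a
# recrossing attempt and does not count as a state-to-state transition.
x = np.array([-1.0, -0.9, -0.2, 0.9, 1.0, 0.95, -0.1, 0.9])
cores = [(-1.5, -0.5), (0.5, 1.5)]
labels = core_set_assign(x, cores)
T = transition_matrix(labels, n_states=2, lag=1)
```

In this sketch, the excursion at frame 6 keeps the label of state 1, so the estimated transition probabilities reflect only genuine metastable-state changes, which is the effect the abstract attributes to removing recrossing events.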