The success of Convolutional Neural Networks (CNNs) in computer vision is mainly driven by their strong inductive bias, which is powerful enough to let CNNs solve vision-related tasks with random weights, i.e., without any learning. Similarly, Long Short-Term Memory (LSTM) has a strong inductive bias towards storing information over time. However, many real-world systems are governed by conservation laws, which lead to the redistribution of particular quantities -- e.g. in physical and economic systems. Our novel Mass-Conserving LSTM (MC-LSTM) adheres to these conservation laws by extending the inductive bias of LSTM to model the redistribution of those stored quantities. MC-LSTMs set a new state-of-the-art for neural arithmetic units at learning arithmetic operations, such as addition tasks, which obey a strong conservation law since the sum is constant over time. Further, we apply MC-LSTM to traffic forecasting, pendulum modelling, and a large benchmark dataset in hydrology, where it sets a new state-of-the-art for predicting peak flows. In the hydrology example, we show that MC-LSTM states correlate with real-world processes and are therefore interpretable.
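For intuition, the following is a minimal sketch of how a mass-conserving cell update can be built; it is not the paper's exact formulation. The illustrative assumptions are: the input gate is a softmax over cells (so all incoming mass is assigned somewhere), the redistribution matrix is column-stochastic (so mass moved between cells is neither created nor destroyed), and a sigmoid output gate splits each cell's mass into a retained part and an outgoing part. In the actual MC-LSTM these gates are computed from inputs and states; here the gate logits are fixed random parameters for brevity.

```python
import numpy as np

def mc_lstm_step(c_prev, x_t, W_i, W_r, W_o):
    """One step of a simplified mass-conserving cell (illustrative sketch only).

    c_prev : (n,)   stored mass in each cell
    x_t    : float  mass entering the system at this time step
    W_i    : (n,)   logits of the normalized input gate (hypothetical parameters)
    W_r    : (n, n) logits of the redistribution matrix (hypothetical parameters)
    W_o    : (n,)   logits of the output gate (hypothetical parameters)
    """
    # Input gate: a distribution over cells, so the incoming mass is fully assigned.
    i = np.exp(W_i) / np.exp(W_i).sum()

    # Redistribution matrix: each column sums to one (column-stochastic),
    # so moving mass between cells conserves the total.
    R = np.exp(W_r) / np.exp(W_r).sum(axis=0, keepdims=True)

    # Output gate: fraction of each cell's mass that leaves the system.
    o = 1.0 / (1.0 + np.exp(-W_o))  # sigmoid

    m = R @ c_prev + i * x_t   # total mass after redistribution and input
    h_t = o * m                # mass leaving the system (the output)
    c_t = (1.0 - o) * m        # mass retained in the cells
    return c_t, h_t

# Sanity check: stored mass plus cumulative output equals cumulative input.
rng = np.random.default_rng(0)
n = 4
c = np.zeros(n)
out_sum, in_sum = 0.0, 0.0
for _ in range(100):
    x = rng.random()
    c, h = mc_lstm_step(c, x, rng.normal(size=n), rng.normal(size=(n, n)), rng.normal(size=n))
    out_sum += h.sum()
    in_sum += x
assert np.isclose(c.sum() + out_sum, in_sum)
```

The final assertion illustrates the conservation property the abstract refers to: whatever mass enters the system is, at every step, either still stored in the cell states or has already left through the outputs.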