The fruits of science are relationships made comprehensible, often by way of approximation. While deep learning is an extremely powerful way to find relationships in data, its use in science has been hindered by the difficulty of understanding the learned relationships. The Information Bottleneck (IB) is an information-theoretic framework for understanding a relationship between an input and an output in terms of a trade-off between the fidelity and complexity of approximations to the relationship. Here we show that a crucial modification -- distributing bottlenecks across multiple components of the input -- opens fundamentally new avenues for interpretable deep learning in science. The Distributed Information Bottleneck throttles the downstream complexity of interactions between the components of the input, deconstructing a relationship into meaningful approximations found through deep learning without requiring custom-made datasets or neural network architectures. Applied to a complex system, the approximations illuminate aspects of the system's nature by restricting -- and monitoring -- the information about different components incorporated into the approximation. We demonstrate the Distributed IB's explanatory utility in systems drawn from applied mathematics and condensed matter physics. In the former, we deconstruct a Boolean circuit into approximations that isolate the most informative subsets of input components without requiring exhaustive search. In the latter, we localize information about future plastic rearrangement in the static structure of a sheared glass, and find the information to be more or less diffuse depending on the system's preparation. By way of a principled scheme of approximations, the Distributed IB brings much-needed interpretability to deep learning and enables unprecedented analysis of information flow through a system.
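The idea of restricting and monitoring the information taken from each input component can be illustrated with a minimal sketch. The abstract does not specify the objective, so the code below assumes a common variational treatment: each component gets its own Gaussian encoder, the KL divergence from each encoder's output to a standard normal prior serves as an upper bound on the information retained about that component, and a weight `beta` trades the summed KL terms against a prediction loss. The function names, the Gaussian form, and `beta` are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in nats, one value per sample.

    mu, log_var: arrays of shape (batch, latent_dim).
    """
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)


def distributed_ib_loss(mus, log_vars, prediction_loss, beta):
    """Hypothetical Distributed IB style objective (an assumed form).

    mus, log_vars: lists with one (batch, latent_dim) array per input
        component, i.e. the outputs of that component's own encoder.
    prediction_loss: scalar fidelity term (e.g. a cross-entropy).
    beta: weight on the summed per-component information terms.

    Returns (total_loss, kl_per_component). The second value is the
    quantity to monitor: it shows how much information the approximation
    draws from each component.
    """
    kl_per_component = np.array([
        gaussian_kl_to_standard_normal(mu, log_var).mean()
        for mu, log_var in zip(mus, log_vars)
    ])
    total_loss = prediction_loss + beta * kl_per_component.sum()
    return total_loss, kl_per_component
```

Under these assumptions, sweeping `beta` from large to small would trace out approximations of increasing complexity, and the per-component KL values would indicate which components each approximation relies on.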