Quantifying the information content of a neural network model is essentially estimating the model's Kolmogorov complexity. The recent success of prequential coding on neural networks points to a promising path toward deriving an efficient description length for a model. We propose a practical measure of the generalizable information in a neural network model based on prequential coding, which we term Information Transfer ($L_{IT}$). Theoretically, $L_{IT}$ is an estimate of the generalizable part of a model's information content. In experiments, we show that $L_{IT}$ correlates consistently with generalizable information and can be used as a measure of the patterns, or "knowledge," in a model or a dataset. Consequently, $L_{IT}$ can serve as a useful analysis tool in deep learning. In this paper, we apply $L_{IT}$ to compare and dissect the information in datasets, to evaluate representation models in transfer learning, and to analyze catastrophic forgetting and continual learning algorithms. $L_{IT}$ provides an information-theoretic perspective that helps us discover new insights into neural network learning.
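For concreteness, $L_{IT}$ builds on the prequential (predictive-sequential) code length. A minimal sketch, assuming the standard formulation from the prequential-coding literature (the exact construction of $L_{IT}$ is defined in the body of the paper): a model is trained incrementally on a dataset $D = \{(x_i, y_i)\}_{i=1}^{n}$, and the description length is the accumulated cost of predicting each example before training on it,
$$ L_{\mathrm{preq}}(D) = -\sum_{i=1}^{n} \log_2 p_{\hat{\theta}_{i-1}}(y_i \mid x_i), $$
where $\hat{\theta}_{i-1}$ denotes the model parameters after training on the first $i-1$ examples. Under this scheme, data whose later examples become cheap to encode after the model has seen earlier ones is precisely data containing learnable, generalizable patterns.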