In the past decade, deep neural networks have seen unparalleled improvements that continue to impact every aspect of today's society. With the development of high-performance GPUs and the availability of vast amounts of data, the learning capabilities of machine learning (ML) systems have skyrocketed, progressing from classifying digits in an image to defeating world champions at games with superhuman performance. However, even as ML models reach new frontiers, their practical success has been hindered by the lack of a deep theoretical understanding of their inner workings. Fortunately, an information-theoretic framework known as the information bottleneck (IB) theory has emerged as a promising approach to better understand the learning dynamics of neural networks. In principle, IB theory models learning as a trade-off between compressing the input data and retaining the information relevant to the prediction task. The goal of this survey is to provide a comprehensive review of IB theory, covering its information-theoretic roots and the recently proposed applications for understanding deep learning models.
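To make this trade-off concrete, the standard IB formulation from the literature can be written as a Lagrangian over a stochastic representation of the input; the following is a minimal sketch in which the symbols $X$ (input), $Y$ (target), $T$ (learned representation), and $\beta$ follow the usual convention rather than notation defined in this survey:
\[
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta\, I(T;Y),
\]
where $I(\cdot\,;\cdot)$ denotes mutual information and the Lagrange multiplier $\beta > 0$ controls the balance: minimizing $I(X;T)$ compresses the input into $T$, while maximizing $I(T;Y)$ retains the information in $T$ that is predictive of $Y$.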