Each year, deep learning demonstrates new and improved empirical results with deeper and wider neural networks. Meanwhile, with existing theoretical frameworks, it is difficult to analyze networks deeper than two layers without resorting to counting parameters or encountering sample complexity bounds that are exponential in depth. It may therefore be fruitful to analyze modern machine learning through a different lens. In this paper, we propose a novel information-theoretic framework with its own notions of regret and sample complexity for analyzing the data requirements of machine learning. We use this framework to study the sample complexity of learning from data generated by deep ReLU neural networks and by deep networks that are infinitely wide but have a bounded sum of weights. We establish that the sample complexity of learning under these data-generating processes is at most linear and quadratic, respectively, in network depth.