ReduNet: A White-box Deep Network from the Principle of Maximizing Rate Reduction

From arXiv. This paper integrates two previous manuscripts, arXiv:2006.08558 and arXiv:2010.14765, with significantly improved organization and presentation, and new results.

This work attempts to provide a plausible theoretical framework that aims to interpret modern deep (convolutional) networks from the principles of data compression and discriminative representation. We show that for high-dimensional multi-class data, the optimal linear discriminative representation maximizes the coding rate difference between the whole dataset and the average of all the subsets. We show that the basic iterative gradient ascent scheme for optimizing the rate reduction objective naturally leads to a multi-layer deep network, named ReduNet, that shares common characteristics of modern deep networks. The deep layered architectures, linear and nonlinear operators, and even parameters of the network are all explicitly constructed layer-by-layer via forward propagation, instead of learned via back propagation. All components of the so-obtained "white-box" network have precise optimization, statistical, and geometric interpretations. Moreover, all linear operators of the so-derived network naturally become multi-channel convolutions when we enforce classification to be rigorously shift-invariant. The derivation also indicates that such a deep convolutional network is significantly more efficient to construct and learn in the spectral domain. Our preliminary simulations and experiments clearly verify the effectiveness of both the rate reduction objective and the associated ReduNet. All code and data are available at https://github.com/Ma-Lab-Berkeley.
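To make the "coding rate difference between the whole dataset and the average of all the subsets" concrete, the following is a minimal NumPy sketch, not the authors' released code. It assumes the log-det coding rate commonly used in the rate reduction literature, R(Z) = 1/2 logdet(I + d/(n eps^2) Z Z^T), for a feature matrix Z with d rows and n unit-norm columns. The function names, the value of eps, the step size eta, and the toy data are all illustrative assumptions. The gradient step only hints at the iterative gradient ascent scheme mentioned above; it is not claimed to reproduce a full ReduNet layer.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """Assumed coding rate: 1/2 * logdet(I + d/(n*eps^2) * Z Z^T), Z of shape (d, n)."""
    d, n = Z.shape
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (n * eps**2)) * (Z @ Z.T))
    return 0.5 * logdet

def rate_reduction(Z, labels, eps=0.5):
    """Rate of the whole set minus the size-weighted average rate of the class subsets."""
    n = Z.shape[1]
    subset_avg = 0.0
    for c in np.unique(labels):
        Zc = Z[:, labels == c]
        subset_avg += (Zc.shape[1] / n) * coding_rate(Zc, eps)
    return coding_rate(Z, eps) - subset_avg

def rate_reduction_grad(Z, labels, eps=0.5):
    """Gradient of rate_reduction with respect to Z (same shape as Z)."""
    d, n = Z.shape
    alpha = d / (n * eps**2)
    # Derivative of the whole-set term: alpha * (I + alpha Z Z^T)^{-1} Z
    grad = alpha * np.linalg.solve(np.eye(d) + alpha * (Z @ Z.T), Z)
    for c in np.unique(labels):
        mask = labels == c
        Zc = Z[:, mask]
        nc = Zc.shape[1]
        alpha_c = d / (nc * eps**2)
        # Derivative of the class term only touches that class's columns.
        grad[:, mask] -= (nc / n) * alpha_c * np.linalg.solve(
            np.eye(d) + alpha_c * (Zc @ Zc.T), Zc
        )
    return grad

def ascent_step(Z, labels, eta=0.5, eps=0.5):
    """One gradient ascent step, then re-normalize columns to unit length."""
    Z_new = Z + eta * rate_reduction_grad(Z, labels, eps)
    return Z_new / np.linalg.norm(Z_new, axis=0, keepdims=True)

# Toy data in R^2: 50 samples per class.
e1, e2 = np.eye(2)
labels = np.repeat([0, 1], 50)
separated = np.where(labels == 0, e1[:, None], e2[:, None])  # classes on orthogonal axes
collapsed = np.tile(e1[:, None], (1, 100))                   # both classes on one axis

print(rate_reduction(separated, labels))  # about 0.511, i.e. log(5/3)
print(rate_reduction(collapsed, labels))  # about 0
```

In this toy case the representation that places the two classes on orthogonal directions attains a strictly larger rate reduction than the one that collapses them onto a single direction, which is the sense in which the objective rewards discriminative features.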
