A butterfly network consists of logarithmically many layers, each with a linear number of non-zero weights (pre-specified). The fast Johnson-Lindenstrauss transform (FJLT) can be represented as a butterfly network followed by a projection onto a random subset of the coordinates. Moreover, a random matrix based on the FJLT approximates, with high probability, the action of any fixed matrix on a vector. Motivated by these facts, we propose to replace a dense linear layer in any neural network with an architecture based on the butterfly network. The proposed architecture reduces the quadratic number of weights required in a standard dense layer to nearly linear, with little compromise in the expressiveness of the resulting operator. In a wide variety of experiments, including supervised prediction on both NLP and vision data, we show that this not only produces results that match, and at times outperform, existing well-known architectures, but also offers faster training and prediction in deployment. To understand the optimization problems posed by neural networks containing a butterfly network, we also study the optimization landscape of an encoder-decoder network in which the encoder is replaced by a butterfly network followed by a dense linear layer in a smaller dimension. The theoretical results presented in the paper explain why the training speed and outcome are not compromised by our proposed approach.
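To make the structure concrete, here is a minimal sketch of how a butterfly network acts on a vector and how the FJLT arises as a special case. The function and parameter names (`butterfly_apply`, `twiddles`) are illustrative, not from the paper; the sketch assumes the input dimension is a power of two and that each layer mixes coordinate pairs at doubling strides, so every layer carries only a linear number of non-zero weights.

```python
import numpy as np

def butterfly_apply(x, twiddles):
    """Apply an L-layer butterfly network to x (length n = 2**L).

    twiddles: list of L arrays, each of shape (n // 2, 2, 2). Layer l
    mixes coordinate pairs at stride 2**l with a learned 2x2 block per
    pair, so each layer has 2n non-zero weights -- linear in n.
    """
    n = x.shape[0]
    L = int(np.log2(n))
    y = x.astype(float).copy()
    for l in range(L):
        stride = 1 << l
        p = 0  # running pair index into twiddles[l]
        for start in range(0, n, 2 * stride):
            for k in range(stride):
                i, j = start + k, start + k + stride
                (a, b), (c, d) = twiddles[l][p]
                y[i], y[j] = a * y[i] + b * y[j], c * y[i] + d * y[j]
                p += 1
    return y

# The normalized Hadamard transform is the special case where every 2x2
# block equals [[1, 1], [1, -1]] / sqrt(2); the FJLT then pre-multiplies
# by a random diagonal sign matrix D and keeps a random coordinate
# subset P, i.e. it computes P @ H @ D @ x.
rng = np.random.default_rng(0)
n, L = 8, 3
hadamard_block = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
twiddles = [np.tile(hadamard_block, (n // 2, 1, 1)) for _ in range(L)]
x = rng.standard_normal(n)                       # input vector
d = rng.choice([-1.0, 1.0], size=n)              # random diagonal D
keep = rng.choice(n, size=4, replace=False)      # random coordinate subset P
sketch = butterfly_apply(d * x, twiddles)[keep]
```

In the proposed architecture, the fixed Hadamard blocks above are replaced by trainable 2x2 blocks, so the whole stack stores O(n log n) parameters instead of the O(n^2) of a dense layer while retaining the FJLT-style mixing that motivates the construction.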