We consider the problem of training a multi-layer over-parametrized neural network to minimize the empirical risk induced by a loss function. In the typical over-parametrized setting, the network width $m$ is much larger than the data dimension $d$ and the number of training samples $n$ (i.e., $m=\mathrm{poly}(n,d)$), which induces a prohibitively large weight matrix $W\in \mathbb{R}^{m\times m}$ per layer. Naively, one has to pay $O(m^2)$ time to read the weight matrix and evaluate the neural network in both the forward and backward computation. In this work, we show how to reduce the training cost per iteration. Specifically, we propose a framework that pays an $O(m^2)$ cost only in the initialization phase and achieves a truly subquadratic cost per iteration in terms of $m$, i.e., $m^{2-\Omega(1)}$ per iteration. To obtain this result, we make use of various techniques, including a shifted-ReLU-based sparsifier, a lazy low-rank maintenance data structure, fast rectangular matrix multiplication, tensor-based sketching techniques, and preconditioning.
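To give intuition for the shifted-ReLU-based sparsifier, the following is a minimal sketch (not the paper's implementation) of how shifting the ReLU threshold makes only a small fraction of neurons fire at random Gaussian initialization, so per-iteration work can be restricted to the active rows of $W$. The width, dimension, and threshold scale below are illustrative assumptions; the actual framework also identifies the active set with dedicated data structures rather than the full matrix-vector product shown here.

```python
import numpy as np

# Illustrative sketch of shifted-ReLU sparsity at random initialization.
m, d = 4096, 64                    # width m >> d in the over-parametrized regime
rng = np.random.default_rng(0)
W = rng.normal(0.0, 1.0, (m, d))   # random Gaussian weights at initialization
x = rng.normal(0.0, 1.0, d)
x /= np.linalg.norm(x)             # unit-norm input, so each pre-activation ~ N(0, 1)

b = np.sqrt(0.4 * np.log(m))       # shift parameter (illustrative scale)
pre = W @ x                        # pre-activations (the framework avoids this full product)
active = pre > b                   # shifted ReLU: sigma_b(t) = max(t - b, 0)

# Only the active rows contribute to the output, so subsequent forward and
# backward computation can touch just these rows instead of all m of them.
y = np.zeros(m)
y[active] = pre[active] - b

print(f"active neurons: {active.sum()} out of {m} ({active.mean():.1%} of the layer)")
```

With these illustrative parameters only a few percent of the $m$ neurons are active, which is the source of the subquadratic per-iteration cost once the active set can be maintained cheaply across iterations.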