Over the last decade, deep neural networks have transformed our society, and they are already widely applied in various machine learning applications. State-of-the-art deep neural networks are becoming larger in size every year to deliver increasing model accuracy, and as a result, model training consumes substantial computing resources and will only consume more in the future. Using current training methods, in each iteration, to process a data point $x \in \mathbb{R}^d$ in a layer, we need to spend $\Theta(md)$ time to evaluate all the $m$ neurons in the layer. This means processing the entire layer takes $\Theta(nmd)$ time for $n$ data points. Recent work [Song, Yang and Zhang, NeurIPS 2021] reduces this time per iteration to $o(nmd)$ but requires exponential time to preprocess either the data or the neural network weights, making it unlikely to be of practical use. In this work, we present a new preprocessing method that simply stores the weight-data correlation in a tree data structure in order to quickly and dynamically detect which neurons fire at each iteration. Our method requires only $O(nmd)$ time in preprocessing and still achieves $o(nmd)$ time per iteration. We complement our new algorithm with a lower bound, proving that, assuming a popular conjecture from complexity theory, one could not substantially speed up our algorithm for dynamic detection of firing neurons.
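To make the data-structure idea concrete, the sketch below is a minimal illustration (not the paper's exact construction) of how stored weight-data correlations can support dynamic detection of firing neurons: for a single data point $x$, a max tree over the pre-activations $\langle w_r, x \rangle$ is built in $O(md)$ time, the neurons whose pre-activation exceeds a threshold $b$ (as with a shifted ReLU) are reported by descending only into subtrees whose maximum exceeds $b$, and a single weight update touches one leaf and its $O(\log m)$ ancestors. The class name `CorrelationTree` and the threshold parameter `b` are illustrative assumptions.

```python
import numpy as np


class CorrelationTree:
    """Max tree over the pre-activations <w_r, x> of one data point.

    Sketch of the weight-data correlation idea: preprocessing costs O(m d)
    per data point, reporting the k firing neurons costs O(k log m), and
    updating one neuron's weight costs O(d + log m).
    """

    def __init__(self, W: np.ndarray, x: np.ndarray):
        self.x = x
        self.m = W.shape[0]
        self.size = 1
        while self.size < self.m:
            self.size *= 2
        # Leaves hold <w_r, x>; each internal node holds the max of its children.
        self.tree = np.full(2 * self.size, -np.inf)
        self.tree[self.size:self.size + self.m] = W @ x
        for i in range(self.size - 1, 0, -1):
            self.tree[i] = max(self.tree[2 * i], self.tree[2 * i + 1])

    def firing(self, b: float) -> list:
        """Return all r with <w_r, x> > b, skipping subtrees whose max is <= b."""
        out, stack = [], [1]
        while stack:
            node = stack.pop()
            if self.tree[node] <= b:
                continue
            if node >= self.size:            # leaf node
                out.append(node - self.size)
            else:
                stack.extend((2 * node, 2 * node + 1))
        return out

    def update(self, r: int, w_new: np.ndarray) -> None:
        """After a gradient step changes neuron r, refresh its leaf and ancestors."""
        node = self.size + r
        self.tree[node] = w_new @ self.x
        node //= 2
        while node >= 1:
            self.tree[node] = max(self.tree[2 * node], self.tree[2 * node + 1])
            node //= 2


if __name__ == "__main__":
    # Tiny usage check: the tree reports exactly the neurons above the threshold.
    rng = np.random.default_rng(0)
    W = rng.standard_normal((8, 4))      # m = 8 neurons, d = 4
    x = rng.standard_normal(4)
    T = CorrelationTree(W, x)
    fired = T.firing(b=0.5)
    assert sorted(fired) == np.flatnonzero(W @ x > 0.5).tolist()
```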