NN-LUT: 高效变压器推断的非皮下操作神经近似 (NN-LUT: Neural Approximation of Non-Linear Operations for Efficient Transformer Inference) - 专知论文

会员服务 ·

0

近似 · 变换 · 通用近似器 · 推断 · 层规范化 ·

2021 年 12 月 3 日

NN-LUT: Neural Approximation of Non-Linear Operations for Efficient Transformer Inference

翻译：NN-LUT: 高效变压器推断的非皮下操作神经近似

Joonsang Yu,Junki Park,Seongmin Park,Minsoo Kim,Sihwa Lee,Dong Hyun Lee,Jungwook Choi

from arxiv, 7 pages, 3 figures

Non-linear operations such as GELU, Layer normalization, and Softmax are essential yet costly building blocks of Transformer models. Several prior works simplified these operations with look-up tables or integer computations, but such approximations suffer inferior accuracy or considerable hardware cost with long latency. This paper proposes an accurate and hardware-friendly approximation framework for efficient Transformer inference. Our framework employs a simple neural network as a universal approximator with its structure equivalently transformed into a LUT. The proposed framework called NN-LUT can accurately replace all the non-linear operations in popular BERT models with significant reductions in area, power consumption, and latency.

翻译：非线性操作,如GELU、图层正常化和Softmax等,是变异模型的基本但成本高昂的构件。前几部工程用查看表或整数计算方法简化了这些操作,但这类近似值的准确性低,或硬件成本高,且长期潜伏。本文件为高效变异器推断提出了一个准确和硬件友好的近似框架。我们的框架使用一个简单的神经网络作为通用近似器,其结构也相应转化为LUT。拟议的称为NN-LUT的框架可以准确地取代流行的BERT模型中的所有非线性操作,在面积、电力消耗和延缓度方面大幅减少。

1

相关内容

生成式对抗网络异常检测，GANs for Anomaly Detection

专知会员服务

34+阅读 · 2021年9月16日

深度概率图模型，Deep Probabilistic Models

专知会员服务

29+阅读 · 2021年8月2日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【变分推断课件】Lectures on Variational Inference： Approximate Bayesian Inference in Machine Learning（附带pdf）

【变分推断课件】Lectures on Variational Inference： Approximate Bayesian Inference in Machine Learning（附带pdf）

专知会员服务

35+阅读 · 2019年11月30日

【O'Reilly TensorFlow Conference 2019】MLIR：加速人工智能（MLIR: Accelerating AI）

【O'Reilly TensorFlow Conference 2019】MLIR：加速人工智能（MLIR: Accelerating AI）

专知会员服务

7+阅读 · 2019年11月14日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

AAAI2020 图相关论文集

AAAI2020 图相关论文集

图与推荐

11+阅读 · 2020年7月15日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

计算机视觉的不同任务

计算机视觉的不同任务

专知

5+阅读 · 2018年8月27日

语音顶级会议Interspeech2018接受论文列表！

语音顶级会议Interspeech2018接受论文列表！

专知

6+阅读 · 2018年6月10日

已删除

将门创投

4+阅读 · 2017年12月12日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

pNLP-Mixer: an Efficient all-MLP Architecture for Language

Arxiv

0+阅读 · 2022年2月9日

A Hybrid Inference System for Improved Curvature Estimation in the Level-Set Method Using Machine Learning

Arxiv

0+阅读 · 2022年2月9日

Improved Convergence Rates for Sparse Approximation Methods in Kernel-Based Learning

Arxiv

0+阅读 · 2022年2月8日

Efficient approximation of cardiac mechanics through reduced order modeling with deep learning-based operator approximation

Arxiv

0+阅读 · 2022年2月8日

On Using Transformers for Speech-Separation

Arxiv

1+阅读 · 2022年2月6日

EcoFlow: Efficient Convolutional Dataflows for Low-Power Neural Network Accelerators

Arxiv

0+阅读 · 2022年2月4日

Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond

Arxiv

21+阅读 · 2021年9月2日

A Survey of Quantization Methods for Efficient Neural Network Inference

Arxiv

22+阅读 · 2021年6月21日

Improved Deep Embeddings for Inferencing with Multi-Layered Networks

Improved Deep Embeddings for Inferencing with Multi-Layered Networks

Arxiv

3+阅读 · 2019年3月1日

Deep Structured Prediction with Nonlinear Output Transformations

Arxiv

4+阅读 · 2018年11月1日

VIP会员

文章信息

相关主题

通用近似器

相关VIP内容

生成式对抗网络异常检测，GANs for Anomaly Detection

专知会员服务

34+阅读 · 2021年9月16日

深度概率图模型，Deep Probabilistic Models

专知会员服务

29+阅读 · 2021年8月2日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【变分推断课件】Lectures on Variational Inference： Approximate Bayesian Inference in Machine Learning（附带pdf）

【变分推断课件】Lectures on Variational Inference： Approximate Bayesian Inference in Machine Learning（附带pdf）

专知会员服务

35+阅读 · 2019年11月30日

【O'Reilly TensorFlow Conference 2019】MLIR：加速人工智能（MLIR: Accelerating AI）

【O'Reilly TensorFlow Conference 2019】MLIR：加速人工智能（MLIR: Accelerating AI）

专知会员服务

7+阅读 · 2019年11月14日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

《毁灭算法：解析以色列在加沙的AI军事行动》

【COLT 2025最新教程】语言生成

以机器速度锁定目标：人工智能的能力与局限

【ICML2025】通过在线世界模型规划的持续强化学习

相关资讯

AAAI2020 图相关论文集

AAAI2020 图相关论文集

图与推荐

11+阅读 · 2020年7月15日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

计算机视觉的不同任务

计算机视觉的不同任务

专知

5+阅读 · 2018年8月27日

语音顶级会议Interspeech2018接受论文列表！

语音顶级会议Interspeech2018接受论文列表！

专知

6+阅读 · 2018年6月10日

已删除

将门创投

4+阅读 · 2017年12月12日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

相关论文

pNLP-Mixer: an Efficient all-MLP Architecture for Language

Arxiv

0+阅读 · 2022年2月9日

A Hybrid Inference System for Improved Curvature Estimation in the Level-Set Method Using Machine Learning

Arxiv

0+阅读 · 2022年2月9日

Improved Convergence Rates for Sparse Approximation Methods in Kernel-Based Learning

Arxiv

0+阅读 · 2022年2月8日

Efficient approximation of cardiac mechanics through reduced order modeling with deep learning-based operator approximation

Arxiv

0+阅读 · 2022年2月8日

On Using Transformers for Speech-Separation

Arxiv

1+阅读 · 2022年2月6日

EcoFlow: Efficient Convolutional Dataflows for Low-Power Neural Network Accelerators

Arxiv

0+阅读 · 2022年2月4日

Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond

Arxiv

21+阅读 · 2021年9月2日

A Survey of Quantization Methods for Efficient Neural Network Inference

Arxiv

22+阅读 · 2021年6月21日

Improved Deep Embeddings for Inferencing with Multi-Layered Networks

Improved Deep Embeddings for Inferencing with Multi-Layered Networks

Arxiv

3+阅读 · 2019年3月1日

Deep Structured Prediction with Nonlinear Output Transformations

Arxiv

4+阅读 · 2018年11月1日

微信扫码咨询专知VIP会员