Recent industrial inference engines, such as FasterTransformer and TurboTransformers, have verified that half-precision floating point (FP16) and 8-bit integer (INT8) quantization can greatly improve model inference speed. However, existing FP16 and INT8 quantization methods are too complicated, and improper usage can severely degrade performance. In this paper, we develop a toolkit that lets users easily quantize their models for inference, in which a Self-Adaptive Mixed-Precision (SAMP) method is proposed to automatically control the quantization rate through a mixed-precision architecture, balancing efficiency and performance. Experimental results show that our SAMP toolkit achieves higher speedup than PyTorch and FasterTransformer while meeting the required performance. In addition, SAMP is based on a modular design that decouples the tokenizer, embedding, encoder, and target layers, which allows users to handle various downstream tasks, and it can be seamlessly integrated into PyTorch.
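To make the mixed-precision idea concrete, the following is a minimal, hypothetical sketch of how a quantization rate could map to a per-layer precision plan. The function name, the layer-ordering heuristic, and the "fraction of encoder layers in INT8" interpretation of the quantization rate are all illustrative assumptions, not SAMP's actual implementation.

```python
def assign_precision(num_layers, quant_rate):
    """Assign a precision ("int8" or "fp16") to each encoder layer.

    quant_rate is interpreted here as the fraction of layers quantized
    to INT8; the remaining layers stay in FP16. This is a hypothetical
    sketch of a mixed-precision plan, not SAMP's actual policy.
    """
    if not 0.0 <= quant_rate <= 1.0:
        raise ValueError("quant_rate must be in [0, 1]")
    num_int8 = round(num_layers * quant_rate)
    # Keep the earlier layers in FP16 and quantize the later ones:
    # early layers are often more sensitive to quantization error
    # (an assumption for illustration, not a claim from the paper).
    return ["fp16"] * (num_layers - num_int8) + ["int8"] * num_int8


# Example: a 12-layer encoder at a 0.5 quantization rate runs
# 6 layers in FP16 and 6 layers in INT8.
plan = assign_precision(12, 0.5)
```

Sweeping `quant_rate` from 0 to 1 then trades accuracy (all-FP16) against latency (all-INT8), which is the efficiency-performance balance the abstract describes.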