Evaluating Spatial Accelerator Architectures with Tiled Matrix-Matrix Multiplication


June 19, 2021


Gordon E. Moon, Hyoukjun Kwon, Geonhwa Jeong, Prasanth Chatarasi, Sivasankaran Rajamanickam, Tushar Krishna

There is a growing interest in custom spatial accelerators for machine learning applications. These accelerators employ a spatial array of processing elements (PEs) interacting via custom buffer hierarchies and networks-on-chip. The efficiency of these accelerators comes from employing optimized dataflow (i.e., spatial/temporal partitioning of data across the PEs and fine-grained scheduling) strategies to optimize data reuse. The focus of this work is to evaluate these accelerator architectures using a tiled general matrix-matrix multiplication (GEMM) kernel. To do so, we develop a framework that finds optimized mappings (dataflow and tile sizes) for a tiled GEMM for a given spatial accelerator and workload combination, leveraging an analytical cost model for runtime and energy. Our evaluations over five spatial accelerators demonstrate that the tiled GEMM mappings systematically generated by our framework achieve high performance on various GEMM workloads and accelerators.
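To make the idea of a tiled GEMM "mapping" concrete, here is a minimal sketch in plain Python. It is not the authors' framework or cost model. The function names (`tiled_gemm`, `toy_traffic`, `best_tiling`), the traffic formula, and the buffer-capacity constraint are all illustrative assumptions: the sketch assumes one fixed loop order, counts only off-chip element transfers, and ignores PE arrays, networks-on-chip, and energy.

```python
import itertools


def tiled_gemm(A, B, tm, tn, tk):
    """Multiply A (M x K) by B (K x N), visiting the work in tm x tn x tk tiles."""
    M, K, N = len(A), len(B), len(B[0])
    C = [[0] * N for _ in range(M)]
    for i0 in range(0, M, tm):              # tile over rows of C
        for j0 in range(0, N, tn):          # tile over columns of C
            for k0 in range(0, K, tk):      # tile over the reduction dimension
                for i in range(i0, min(i0 + tm, M)):
                    for j in range(j0, min(j0 + tn, N)):
                        acc = C[i][j]
                        for k in range(k0, min(k0 + tk, K)):
                            acc += A[i][k] * B[k][j]
                        C[i][j] = acc
    return C


def toy_traffic(M, N, K, tm, tn, tk):
    """Hypothetical cost: elements moved to or from off-chip memory.

    Assumes A is re-read once per column tile of C, B is re-read once per
    row tile of C, and C is written once. tk does not change this count;
    it only affects how much buffer space a tile needs.
    """
    tiles_m = -(-M // tm)  # ceiling division
    tiles_n = -(-N // tn)
    return M * K * tiles_n + K * N * tiles_m + M * N


def best_tiling(M, N, K, capacity, candidates=(1, 2, 4, 8, 16, 32, 64)):
    """Exhaustively pick the tile sizes with the lowest toy_traffic.

    A tiling is feasible if one tile each of A, B, and C fits in `capacity`
    elements (an assumed on-chip buffer size). Returns (cost, (tm, tn, tk)),
    or None if no candidate fits.
    """
    best = None
    for tm, tn, tk in itertools.product(candidates, repeat=3):
        if tm * tk + tk * tn + tm * tn > capacity:
            continue
        cost = toy_traffic(M, N, K, tm, tn, tk)
        if best is None or cost < best[0]:
            best = (cost, (tm, tn, tk))
    return best
```

For example, `best_tiling(64, 64, 64, capacity=512)` returns the lowest-traffic feasible tiling under this toy model. A real mapper would also choose the dataflow (loop order and which dimensions are spread across PEs) and would score candidates with a much richer runtime and energy model.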

