There is often variation in the shape and size of input data used for deep learning. In many cases, such data can be represented using tensors with non-uniform shapes, or ragged tensors. Due to limited and non-portable support for efficient execution on ragged tensors, current deep learning frameworks generally use techniques such as padding and masking to make the data shapes uniform and then offload the computations to optimized kernels for dense tensor algebra. Such techniques can, however, lead to a lot of wasted computation and, therefore, a loss in performance. This paper presents CoRa, a tensor compiler that allows users to easily generate efficient code for ragged tensor operators targeting a wide range of CPUs and GPUs. Evaluating CoRa on a variety of operators on ragged tensors as well as on an encoder layer of the transformer model, we find that CoRa (i) performs competitively with hand-optimized implementations of the operators and the transformer encoder and (ii) achieves, over PyTorch, a 1.6X geomean speedup for the encoder on an Nvidia GPU and a 1.86X geomean speedup for the multi-head attention module used in transformers on an ARM CPU.