LLAMA: The Low Level Abstraction For Memory Access


June 8, 2021


Bernhard Manfred Gruber, Guilherme Amadio, Jakob Blomer, Alexander Matthes, René Widera, Michael Bussmann

From arXiv: 32 pages, 7 figures, 10 listings

The performance gap between CPU and memory widens continuously. Choosing the best memory layout for each hardware architecture is increasingly important as more and more programs become memory bound. For portable codes that run across heterogeneous hardware architectures, the choice of memory layout for data structures is therefore ideally decoupled from the rest of the program. This can be accomplished via a zero-runtime-overhead abstraction layer, underneath which memory layouts can be freely exchanged. We present the C++ library LLAMA, which provides such a data structure abstraction layer with example implementations for multidimensional arrays of nested, structured data. LLAMA provides fully C++-compliant methods for defining and switching custom memory layouts for user-defined data types. Using two close-to-life examples, we show that the LLAMA-generated AoS (Array of Structs) and SoA (Struct of Arrays) layouts produce identical code with the same performance characteristics as manually written data structures. LLAMA's layout-aware copy routines can significantly speed up transfer and reshuffling of data between layouts compared with naive element-wise copying. The library is fully extensible with third-party allocators and allows users to support their own memory layouts with custom mappings.
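To make the idea of exchanging the memory layout underneath unchanged program code more concrete, here is a minimal C++ sketch. It does not use LLAMA and is not LLAMA's API: the names `Field`, `AoSMapping`, `SoAMapping`, `View`, `advance`, and `copyElementWise` are hypothetical and chosen only for illustration, and the record is assumed to hold three `float` fields. The sketch shows a mapping type that turns a (record index, field) pair into a storage offset, a kernel that is written once and works with either layout, and the kind of naive element-wise copy that the abstract uses as the baseline for layout-aware copying.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record with three float fields (illustrative, not from the paper).
enum Field : std::size_t { PosX = 0, VelX = 1, Mass = 2, FieldCount = 3 };

// Array of Structs: the fields of one record are adjacent in memory.
struct AoSMapping {
    std::size_t count;
    constexpr std::size_t offset(std::size_t i, std::size_t f) const {
        return i * FieldCount + f;
    }
};

// Struct of Arrays: all values of one field are adjacent in memory.
struct SoAMapping {
    std::size_t count;
    constexpr std::size_t offset(std::size_t i, std::size_t f) const {
        return f * count + i;
    }
};

// A view owns flat storage and delegates index computation to the mapping.
// The mapping is a template parameter, so the offset arithmetic is resolved
// at compile time and can be inlined by the compiler.
template <typename Mapping>
class View {
public:
    explicit View(std::size_t count)
        : mapping_{count}, storage_(count * FieldCount) {}

    float& operator()(std::size_t i, Field f) {
        assert(i < mapping_.count && f < FieldCount);
        return storage_[mapping_.offset(i, f)];
    }
    float operator()(std::size_t i, Field f) const {
        assert(i < mapping_.count && f < FieldCount);
        return storage_[mapping_.offset(i, f)];
    }

    std::size_t size() const { return mapping_.count; }
    const std::vector<float>& storage() const { return storage_; }

private:
    Mapping mapping_;
    std::vector<float> storage_;
};

// Layout-agnostic kernel: the same source works for any View<Mapping>.
template <typename ViewT>
void advance(ViewT& particles, float dt) {
    for (std::size_t i = 0; i < particles.size(); ++i) {
        particles(i, PosX) += dt * particles(i, VelX);
    }
}

// Naive element-wise copy between two views with possibly different layouts.
template <typename SrcView, typename DstView>
void copyElementWise(const SrcView& src, DstView& dst) {
    assert(src.size() == dst.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        for (std::size_t f = 0; f < FieldCount; ++f) {
            dst(i, static_cast<Field>(f)) = src(i, static_cast<Field>(f));
        }
    }
}
```

Switching a program from `View<AoSMapping>` to `View<SoAMapping>` changes only the type of the view; `advance` stays untouched. A layout-aware copy could, for example, recognize that source and destination share a mapping and copy the flat storage in one block, which is the kind of optimization the abstract contrasts with the element-wise loop above.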

