20 April 2023

Backporting RISC-V Vector assembly


Joseph K. L. Lee, Maurice Jamieson, Nick Brown

From arXiv. Preprint of a paper accepted to the First International Workshop on RISC-V for HPC (2023).

Leveraging vectorisation, the ability for a CPU to apply operations to multiple elements of data concurrently, is critical for high performance workloads. However, at the time of writing, commercially available physical RISC-V hardware that provides the RISC-V vector extension (RVV) only supports version 0.7.1, which is incompatible with the latest ratified version 1.0. The challenge is that upstream compiler toolchains, such as Clang, only target the ratified v1.0 and do not support the older v0.7.1. Because v1.0 is not compatible with v0.7.1, the only way to program vectorised code is to use a vendor-provided, older compiler. In this paper we introduce the rvv-rollback tool which translates assembly code generated by the compiler using vector extension v1.0 instructions to v0.7.1. We utilise this tool to compare vectorisation performance of the vendor-provided GNU 8.4 compiler (supports v0.7.1) against LLVM 15.0 (supports only v1.0), where we found that the LLVM compiler is capable of auto-vectorising more computational kernels, and delivers greater performance than GNU in most, but not all, cases. We also tested LLVM vectorisation with vector length agnostic and specific settings, and observed cases with significant difference in performance.

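To make the idea of assembly-level backporting concrete, here is a minimal Python sketch of the kind of text rewrite such a translation involves. It is not the rvv-rollback implementation, and the helper name `rollback_line` is hypothetical. It covers only two syntactic differences between the specifications: v1.0 `vsetvli` accepts tail/mask policy flags (`ta`, `tu`, `ma`, `mu`) that v0.7.1 does not, and v1.0 unit-stride loads and stores carry the element width in the mnemonic (for example `vle32.v`), whereas v0.7.1 uses `vle.v` / `vse.v` with the width taken from the current SEW. The sketch assumes that the width in the mnemonic already matches the SEW set by the preceding `vsetvli`; a real translator would have to check this, and would have to handle many further instruction and encoding differences.

```python
import re

# Policy flags accepted by vsetvli in RVV v1.0 but absent from v0.7.1.
_POLICY_FLAGS = {"ta", "tu", "ma", "mu"}

# v1.0 unit-stride loads/stores such as vle32.v or vse64.v.
_UNIT_STRIDE = re.compile(r"^(vle|vse)(8|16|32|64)\.v$")


def rollback_line(line: str) -> str:
    """Rewrite one line of v1.0 assembly towards v0.7.1 syntax.

    Hypothetical helper for illustration only; lines it does not
    recognise are returned unchanged.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return line

    indent = line[: len(line) - len(line.lstrip())]
    parts = stripped.split(None, 1)  # mnemonic, then operands (if any)
    mnemonic = parts[0]
    operands = parts[1].strip() if len(parts) > 1 else ""

    if mnemonic == "vsetvli":
        # Drop the tail/mask policy flags and keep every other operand.
        kept = [op.strip() for op in operands.split(",")
                if op.strip() not in _POLICY_FLAGS]
        return f"{indent}vsetvli {', '.join(kept)}"

    match = _UNIT_STRIDE.match(mnemonic)
    if match:
        # Assumption: the current SEW equals the width in the mnemonic.
        return f"{indent}{match.group(1)}.v {operands}"

    return line


if __name__ == "__main__":
    source = [
        "    vsetvli t0, a0, e32, m1, ta, ma",
        "    vle32.v v8, (a1)",
        "    vadd.vv v8, v8, v8",
        "    vse32.v v8, (a2)",
    ]
    for asm_line in source:
        print(rollback_line(asm_line))
```

Working on the compiler's assembly output, rather than on the compiler itself, is what lets an unmodified upstream toolchain be used for hardware that only implements the older specification.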
