May 12, 2022

Vectorized and performance-portable Quicksort


Mark Blacher, Joachim Giesen, Peter Sanders, Jan Wassenberg

From arXiv, 21 pages

Recent works showed that implementations of Quicksort using vector CPU instructions can outperform the non-vectorized algorithms in widespread use. However, these implementations are typically single-threaded, implemented for a particular instruction set, and restricted to a small set of key types. We lift these three restrictions: our proposed 'vqsort' algorithm integrates into the state-of-the-art parallel sorter 'ips4o', with a geometric mean speedup of 1.59. The same implementation works on seven instruction sets (including SVE and RISC-V V) across four platforms. It also supports floating-point and 16-128 bit integer keys. To the best of our knowledge, this is the fastest sort for non-tuple keys on CPUs, up to 20 times as fast as the sorting algorithms implemented in standard libraries. This paper focuses on the practical engineering aspects enabling the speed and portability, which we have not yet seen demonstrated for a Quicksort implementation. Furthermore, we introduce compact and transpose-free sorting networks for in-register sorting of small arrays, and a vector-friendly pivot sampling strategy that is robust against adversarial input.
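Two of the building blocks named in the abstract, sorting networks for small arrays and sampling-based pivot selection, can be illustrated with a minimal scalar sketch. This is not the paper's vectorized implementation: the names `CompareExchange`, `SortNetwork4`, `SampledPivot`, and `SketchSort` are hypothetical, the median-of-three sampling is a simplified stand-in for the paper's strategy, and the sketch assumes totally ordered keys such as integers (no NaN handling).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>

// Compare-exchange: afterwards a <= b. Written with min/max rather than a
// branch, since min/max is what vector instructions apply lane-wise.
template <typename T>
inline void CompareExchange(T& a, T& b) {
  const T lo = std::min(a, b);
  const T hi = std::max(a, b);
  a = lo;
  b = hi;
}

// Five-comparator sorting network for exactly four keys.
template <typename T>
void SortNetwork4(T* v) {
  CompareExchange(v[0], v[1]);
  CompareExchange(v[2], v[3]);
  CompareExchange(v[0], v[2]);
  CompareExchange(v[1], v[3]);
  CompareExchange(v[1], v[2]);
}

// Hypothetical pivot choice: median of three randomly sampled keys.
// Random sampling makes it hard to construct an input that always
// yields a bad pivot.
template <typename T, typename Rng>
T SampledPivot(const T* v, std::size_t n, Rng& rng) {
  assert(n > 0);
  std::uniform_int_distribution<std::size_t> pick(0, n - 1);
  T s[3] = {v[pick(rng)], v[pick(rng)], v[pick(rng)]};
  CompareExchange(s[0], s[1]);
  CompareExchange(s[1], s[2]);
  CompareExchange(s[0], s[1]);
  return s[1];
}

// Quicksort that hands arrays of at most four keys to sorting networks.
template <typename T, typename Rng>
void SketchSort(T* v, std::size_t n, Rng& rng) {
  while (n > 4) {
    const T pivot = SampledPivot(v, n, rng);
    // Three-way partition: [< pivot | == pivot | > pivot]. The middle
    // range is never empty because the pivot is one of the keys, so
    // both remaining subarrays are strictly smaller than n.
    T* mid = std::partition(v, v + n,
                            [&](const T& x) { return x < pivot; });
    T* hi = std::partition(mid, v + n,
                           [&](const T& x) { return !(pivot < x); });
    SketchSort(v, static_cast<std::size_t>(mid - v), rng);  // left part
    n = static_cast<std::size_t>((v + n) - hi);             // loop on right
    v = hi;
  }
  if (n == 4) {
    SortNetwork4(v);
  } else if (n == 3) {
    CompareExchange(v[0], v[1]);
    CompareExchange(v[1], v[2]);
    CompareExchange(v[0], v[1]);
  } else if (n == 2) {
    CompareExchange(v[0], v[1]);
  }
}
```

A caller would pass a pointer, a length, and a random engine, for example `std::mt19937 rng(42); SketchSort(keys.data(), keys.size(), rng);`. The fixed comparator sequence in the base case is what makes sorting networks attractive for vector code: the same comparisons run regardless of the data, so they map onto lane-wise min/max instructions without data-dependent branches.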

