Weighted Random Sampling on GPUs - 专知 Papers



May 23, 2022

Weighted Random Sampling on GPUs


Hans-Peter Lehmann, Lorenz Hübschle-Schneider, Peter Sanders

An alias table is a data structure that supports drawing weighted random samples in constant time and can be constructed in linear time. The PSA algorithm by Hübschle-Schneider and Sanders constructs alias tables in parallel on the CPU. In this report, we transfer the PSA algorithm to the GPU. Our construction algorithm achieves a speedup of 17 on a consumer GPU compared with the PSA method on a 16-core high-end desktop CPU. For sampling, we achieve up to 24 times higher throughput. Both operations also require several times less energy than on the CPU. The adaptations that help achieve this include changing memory access patterns so that accesses are coalesced. Where this is not possible, we first copy the data to the faster shared memory using coalesced access. We also enhance a generalization of binary search that searches for a range of items in parallel. Besides naive sampling, we also give improved batched sampling algorithms.
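To make the data structure concrete, here is a minimal sequential sketch of an alias table in Python. It uses the classic sequential (Vose-style) construction, not the PSA algorithm or the GPU variant described in the abstract. The names `build_alias_table` and `sample` are chosen for illustration only. Construction takes linear time, and each sample costs one random bucket choice plus one comparison.

```python
import random


def build_alias_table(weights):
    """Build an alias table sequentially in O(n) time (illustrative sketch)."""
    n = len(weights)
    total = sum(weights)
    if n == 0 or total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    # Scale the weights so that an average bucket holds exactly 1.0.
    scaled = [w * n / total for w in weights]
    prob = [1.0] * n        # chance that bucket i returns item i itself
    alias = list(range(n))  # item returned by bucket i otherwise
    light = [i for i, s in enumerate(scaled) if s < 1.0]
    heavy = [i for i, s in enumerate(scaled) if s >= 1.0]
    while light and heavy:
        l = light.pop()
        h = heavy.pop()
        # Bucket l keeps its own weight and is topped up by heavy item h.
        prob[l] = scaled[l]
        alias[l] = h
        scaled[h] -= 1.0 - scaled[l]
        (light if scaled[h] < 1.0 else heavy).append(h)
    # Any leftover buckets hold 1.0 up to rounding, so prob stays at 1.0.
    return prob, alias


def sample(prob, alias, rng=random):
    """Draw one item index in O(1): pick a bucket, then the item or its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


if __name__ == "__main__":
    prob, alias = build_alias_table([1, 3, 4, 2])
    counts = [0, 0, 0, 0]
    for _ in range(100_000):
        counts[sample(prob, alias)] += 1
    print(counts)  # counts should be roughly proportional to 1 : 3 : 4 : 2
```

Each bucket stores at most two items, which is what keeps the sampling step constant-time regardless of how skewed the weights are.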

