Efficient Computation of Positional Population Counts Using SIMD Instructions - 专知 (Zhuanzhi) Papers


One-hot · Statistics · Bit · Bioinformatics · Generalization theory

March 17, 2021

Efficient Computation of Positional Population Counts Using SIMD Instructions


Marcus D. R. Klarqvist, Wojciech Muła, Daniel Lemire

In several fields such as statistics, machine learning, and bioinformatics, categorical variables are frequently represented as one-hot encoded vectors. For example, given 8 distinct values, we map each value to a byte where only a single bit has been set. We are motivated to quickly compute statistics over such encodings. Given a stream of k-bit words, we seek to compute k distinct sums corresponding to bit values at indexes 0, 1, 2, ..., k-1. If the k-bit words are one-hot encoded then the sums correspond to a frequency histogram. This multiple-sum problem is a generalization of the population-count problem where we seek the sum of all bit values. Accordingly, we refer to the multiple-sum problem as a positional population-count. Using SIMD (Single Instruction, Multiple Data) instructions from recent Intel processors, we describe algorithms for computing the 16-bit positional population count using less than half of a CPU cycle per 16-bit word. Our best approach uses up to 400 times fewer instructions and is up to 50 times faster than baseline code using only regular (non-SIMD) instructions, for sufficiently large inputs.
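To make the problem statement concrete, here is a minimal scalar sketch of the positional population count defined in the abstract. It is not the authors' SIMD algorithm. It only illustrates the kind of plain, non-SIMD baseline that the definition implies, written in Python for readability. The function name `positional_popcount` and its parameters are hypothetical and do not come from the paper.

```python
from typing import Iterable, List


def positional_popcount(words: Iterable[int], k: int = 16) -> List[int]:
    """Return k sums: counts[i] is the number of words whose bit i is set.

    Hypothetical helper for illustration only (scalar, non-SIMD).
    """
    counts = [0] * k
    for w in words:
        for i in range(k):
            # Extract bit i of the word and add it to the i-th running sum.
            counts[i] += (w >> i) & 1
    return counts


# One-hot example from the abstract: 8 distinct values, each mapped to a
# byte in which exactly one bit is set.
values = [0, 3, 3, 7, 1, 3]
encoded = [1 << v for v in values]

# For one-hot input the k sums form a frequency histogram of the values.
histogram = positional_popcount(encoded, k=8)
print(histogram)  # [1, 1, 0, 3, 0, 0, 0, 1]

# Adding up the k positional sums gives the ordinary population count of the
# whole stream, which is why the abstract calls this a generalization.
assert sum(histogram) == sum(bin(w).count("1") for w in encoded)
```

The inner loop does k shift-and-mask steps per word, so the work grows with both the stream length and the word width. This per-bit cost is what the SIMD approaches described in the abstract aim to reduce.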


