Expressions that involve matrices and vectors, known as linear algebra expressions, are commonly evaluated through a sequence of invocations of highly optimised kernels provided by libraries such as BLAS and LAPACK. A sequence of kernels represents an algorithm, and in general, because of associativity, algebraic identities, and the availability of multiple kernels, one expression can be evaluated via many different algorithms. These algorithms are all mathematically equivalent (i.e., in exact arithmetic, they all compute the same result), but often differ noticeably in execution time. When faced with such a choice, high-level languages, libraries, and tools such as Julia, Armadillo, and Linnea select the algorithm that minimises the FLOP count. In this paper, we test the validity of the FLOP count as a discriminant for dense linear algebra algorithms by analysing "anomalies": problem instances for which the fastest algorithm does not perform the fewest FLOPs. To do so, we focused on relatively simple expressions and analysed when and why anomalies occurred. We found that anomalies exist and tend to cluster into large contiguous regions. For one expression anomalies were rare, whereas for the other they were abundant. We conclude that the FLOP count is not a sufficiently dependable discriminant, even when building algorithms from highly optimised kernels. Moreover, most of the anomalies remained as such even after filtering out inter-kernel cache effects. We conjecture that combining FLOP counts with kernel performance models will significantly improve our ability to choose optimal algorithms.