The randomized singular value decomposition (RSVD) is by now a well-established technique for efficiently computing an approximate singular value decomposition of a matrix. Building on the ideas that underpin the RSVD, the recently proposed algorithm "randUTV" computes a full factorization of a given matrix that provides low-rank approximations with near-optimal error. Because the bulk of randUTV is cast in terms of communication-efficient operations such as matrix-matrix multiplication and unpivoted QR factorizations, it is faster than competing rank-revealing factorization methods such as column-pivoted QR in most high-performance computing settings. In this article, optimized randUTV implementations are presented for both shared-memory and distributed-memory computing environments. For shared memory, randUTV is redesigned as an "algorithm-by-blocks" that, together with a runtime task scheduler, eliminates the bottlenecks caused by data synchronization points and thus accelerates the standard "blocked algorithm," which is based on a purely fork-join approach. The distributed-memory implementation is built on the ScaLAPACK library. The performance of our new codes compares favorably with that of competing factorizations available on both shared-memory and distributed-memory architectures.
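For readers unfamiliar with the basic RSVD on which randUTV builds, the following is a minimal NumPy sketch of the standard procedure (Gaussian sketching, optional power iterations, and a small dense SVD). The function name `rsvd` and the parameters `k`, `p`, and `q` are illustrative choices for this sketch and are not part of the implementations described in this article.

```python
import numpy as np

def rsvd(A, k, p=10, q=2):
    """Basic randomized SVD sketch (illustrative, not the article's code).

    A : (m, n) matrix to approximate
    k : target rank
    p : oversampling parameter
    q : number of power iterations
    Returns U, s, Vt with A approximately equal to U @ np.diag(s) @ Vt.
    """
    m, n = A.shape
    # Draw a Gaussian test matrix and sample the range of A.
    G = np.random.standard_normal((n, k + p))
    Y = A @ G
    # Optional power iterations sharpen the captured range
    # (a simplified variant without intermediate re-orthonormalization).
    for _ in range(q):
        Y = A @ (A.T @ Y)
    # Orthonormal basis for the sampled range via an unpivoted QR.
    Q, _ = np.linalg.qr(Y)
    # Project A onto the basis and take a small dense SVD.
    B = Q.T @ A
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub
    return U[:, :k], s[:k], Vt[:k, :]
```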