A Statically and Dynamically Scalable Soft GPGPU

January 8, 2024

Martin Langhammer, George A. Constantinides

Current soft processor architectures for FPGAs do not exploit the massive parallelism available. FPGAs now support many thousands of embedded floating-point operators and have computational densities similar to those of GPGPUs. Several soft GPGPU or SIMT processors have been published, but their reported large areas and modest Fmax make widespread use in commercial designs unlikely. In this paper we take an alternative approach, building the soft GPU microarchitecture around the FPGA resource mix available. We demonstrate a statically scalable soft GPGPU processor (where both parameters and feature set can be determined at configuration time) that always closes timing at the peak speed of the slowest embedded component in the FPGA (DSP or hard memory), with a completely unconstrained compile into a current Intel Agilex FPGA. We also show dynamic scalability, where a subset of the thread space can be specified on an instruction-by-instruction basis. For one example core type, we show a logic footprint of 4k to 10k ALMs, depending on the configuration, along with 24 to 32 DSP Blocks and 50 to 250 M20K memories. All of these instances close timing at 771 MHz, a performance level limited only by the DSP Blocks. We describe our methodology for reliably achieving this clock rate by matching the processor pipeline structure to the physical structure of the FPGA fabric. We also benchmark several algorithms across a range of data sizes and compare the results to a commercial soft RISC processor.
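
To make the idea of dynamic scalability more concrete, the following is a minimal software sketch, not the authors' hardware or instruction set. It assumes a hypothetical SIMT machine in which each instruction carries its own active-thread count, and uses a tree reduction as an example workload where the useful thread count halves at every step. The function name `tree_sum` and the returned schedule are illustrative assumptions only.

```python
from typing import List, Tuple


def tree_sum(values: List[float]) -> Tuple[float, List[int]]:
    """Sum `values` with a tree reduction on a simulated SIMT machine.

    Assumption (not from the paper): each "instruction" names how many
    threads are active, so later reduction steps occupy fewer lanes.
    The input length must be a power of two.
    """
    n = len(values)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("length must be a non-zero power of two")

    regs = list(values)          # one register per thread
    schedule: List[int] = []     # active-thread count used by each step
    active = n // 2
    while active >= 1:
        # One SIMT instruction: only threads 0..active-1 execute.
        # Thread t reads lane t + active, which no active thread writes,
        # so the sequential loop matches lock-step parallel execution.
        for t in range(active):
            regs[t] = regs[t] + regs[t + active]
        schedule.append(active)
        active //= 2
    return regs[0], schedule


if __name__ == "__main__":
    total, schedule = tree_sum([1, 2, 3, 4, 5, 6, 7, 8])
    print(total, schedule)
```

In this sketch the per-step thread counts shrink from half the data size down to one, which is the kind of workload where specifying the thread subset per instruction could avoid issuing work to idle lanes.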
