Vector similarity search has become a critical component of AI-driven applications such as large language models (LLMs). To achieve high recall and low latency, GPUs are used to exploit massive parallelism for faster query processing. However, as the number of vectors grows, the graph index quickly exceeds the memory capacity of a single GPU, making it infeasible to store and process the entire index on one device. Recent work adopts CPU-GPU architectures that keep vectors in CPU memory or on SSDs, but the loading step stalls GPU computation. We present Fantasy, an efficient system that pipelines vector search and data transfer across a GPU cluster using GPUDirect Async. By overlapping computation with network communication, Fantasy significantly improves search throughput on large graphs and supports large query batch sizes.
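The overlap of computation and data transfer described above can be illustrated with a minimal double-buffering sketch. This is not Fantasy's implementation: `fetch_partition` and `search_partition` are hypothetical stand-ins for a GPUDirect transfer and a GPU search kernel, and the thread pool plays the role of an asynchronous transfer engine that prefetches the next graph partition while the current one is being searched.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for a GPUDirect Async transfer and a GPU search kernel.
def fetch_partition(i):
    """Simulate loading graph partition i from a remote node."""
    return list(range(i * 4, i * 4 + 4))

def search_partition(data):
    """Simulate a distance computation over one partition."""
    return sum(data)

def pipelined_search(num_partitions):
    """Overlap the transfer of partition i+1 with the search of partition i."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as io:
        pending = io.submit(fetch_partition, 0)      # prefetch first partition
        for i in range(num_partitions):
            data = pending.result()                  # wait for current partition
            if i + 1 < num_partitions:
                pending = io.submit(fetch_partition, i + 1)  # start next transfer
            results.append(search_partition(data))   # compute while transfer runs
    return results

print(pipelined_search(3))  # [6, 22, 38]
```

With real GPU kernels and network transfers, the same double-buffered loop hides transfer latency behind computation, which is the source of the throughput gains claimed above.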