OOD-DiskANN: Efficient and Scalable Graph ANNS for Out-of-Distribution Queries (专知 Papers)


November 30, 2022


Shikhar Jaiswal, Ravishankar Krishnaswamy, Ankit Garg, Harsha Vardhan Simhadri, Sheshansh Agrawal

State-of-the-art algorithms for Approximate Nearest Neighbor Search (ANNS), such as DiskANN, FAISS-IVF, and HNSW, build data-dependent indices that offer substantially better accuracy and search efficiency than data-agnostic indices by overfitting to the index data distribution. When the query data is drawn from a different distribution (e.g., when the index holds image embeddings and the queries are text embeddings), such algorithms lose much of this performance advantage. On a variety of datasets, for a fixed recall target, latency is worse by an order of magnitude or more for Out-Of-Distribution (OOD) queries than for In-Distribution (ID) queries. The question we address in this work is whether ANNS algorithms can be made efficient for OOD queries if index construction is given access to a small sample set of these queries. We answer positively by presenting OOD-DiskANN, which uses a sparing sample (1% of the index set size) of OOD queries and provides up to 40% improvement in mean query latency over SoTA algorithms with a similar memory footprint. OOD-DiskANN is scalable and has the efficiency of graph-based ANNS indices. Some of our contributions can improve query efficiency for ID queries as well.
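The abstract compares latency at a fixed recall target and sizes the query sample at 1% of the index set. As a minimal illustration of those two quantities only (this is not the OOD-DiskANN algorithm, and the function names `exact_knn`, `recall_at_k`, and `sample_size` are hypothetical), recall@k might be measured against brute-force ground truth along these lines:

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def exact_knn(index: List[Vector], query: Vector, k: int) -> List[int]:
    """Ids of the k index points closest to the query (brute force, L2 distance)."""
    order = sorted(range(len(index)), key=lambda i: math.dist(index[i], query))
    return order[:k]


def recall_at_k(approx_ids: Sequence[int], exact_ids: Sequence[int]) -> float:
    """Fraction of the true k nearest neighbors present in the approximate result."""
    if not exact_ids:
        return 0.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


def sample_size(index_size: int, fraction: float = 0.01) -> int:
    """Number of sample queries for a given fraction of the index size (1% by default)."""
    return max(1, round(index_size * fraction))


# Toy data: four 2-D index points and one query (illustrative values only).
index = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
truth = exact_knn(index, (0.1, 0.0), k=2)
# Pretend an approximate index returned ids [0, 2] for the same query.
print(truth, recall_at_k([0, 2], truth), sample_size(1_000_000))
```

An ANNS benchmark would typically tune the search parameters of each index until the average of `recall_at_k` over a query set reaches the target, and then compare latency at that setting, once for ID queries and once for OOD queries.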

