Learned k-NN Distance Estimation - 专知 Papers

Estimation/Estimators · Analysis · Learning · Nearest Neighbors · Inference

August 29, 2022

Learned k-NN Distance Estimation

Daichi Amagata,Yusuke Arai,Sumio Fujita,Takahiro Hara

Big data mining is well known to be an important task in data science, because it can reveal useful observations and new knowledge hidden in large datasets. Proximity-based data analysis in particular is used in many real-life applications. Such analysis usually relies on the distances to the k nearest neighbors, so its main bottleneck is data retrieval. Much effort has been made to improve the efficiency of these analyses. However, they still incur large costs, because they essentially need many data accesses. To avoid this issue, we propose a machine-learning technique that quickly and accurately estimates the k-NN distances (i.e., distances to the k nearest neighbors) of a given query. We train a fully connected neural network model and utilize pivots to achieve accurate estimation. Our model is designed to have useful advantages: it infers the distances to all k nearest neighbors at once, and its inference time is O(1) (no data accesses are incurred), yet it maintains high accuracy. Our experimental results and case studies on real datasets demonstrate the efficiency and effectiveness of our solution.
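The abstract does not give the network architecture, the pivot-selection rule, or the training procedure, so the following is only a minimal sketch of the general idea under stated assumptions. It shows (1) the exact brute-force k-NN distances that such a model would be trained to approximate, (2) pivot-distance features for a query, and (3) a forward pass through a small fully connected network whose cost depends on the model size and the number of pivots but not on the dataset size. The function names, the choice to concatenate raw coordinates with pivot distances, the random pivot selection, and the layer sizes are all hypothetical, and the network here is untrained, so its outputs are not meaningful estimates.

```python
import math
import random


def knn_distances(query, data, k):
    """Exact distances from `query` to its k nearest points in `data`.

    Brute force: touches every point, which is the data-access cost
    a learned estimator is meant to avoid at query time.
    """
    return sorted(math.dist(query, p) for p in data)[:k]


def pivot_features(query, pivots):
    """Distances from `query` to a small, fixed set of pivot points."""
    return [math.dist(query, p) for p in pivots]


def dense(x, weights, biases, relu=True):
    """One fully connected layer; `weights` holds one row per output unit."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out


def estimate(query, pivots, layers):
    """Forward pass producing one value per neighbor rank (k outputs).

    Assumption: the input is the query coordinates followed by its
    pivot distances. The last layer is linear (no ReLU).
    """
    h = list(query) + pivot_features(query, pivots)
    for i, (w, b) in enumerate(layers):
        h = dense(h, w, b, relu=(i < len(layers) - 1))
    return h


def random_layer(n_in, n_out, rng):
    """Hypothetical initializer: small random weights, zero biases."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


# Illustrative setup on synthetic 2-D data (all values are assumptions).
rng = random.Random(0)
data = [(rng.random(), rng.random()) for _ in range(1000)]
pivots = rng.sample(data, 4)   # assumed: pivots drawn at random from the data
k = 3

# Training pairs would map a query's features to its exact k-NN distances.
queries = [(rng.random(), rng.random()) for _ in range(5)]
targets = [knn_distances(q, data, k) for q in queries]

# Untrained model: 2 coordinates + 4 pivot distances -> 16 hidden -> k outputs.
layers = [random_layer(2 + len(pivots), 16, rng), random_layer(16, k, rng)]
prediction = estimate(queries[0], pivots, layers)
print(len(prediction), len(targets[0]))
```

In a full pipeline the weights in `layers` would be fitted by regressing `estimate` outputs onto the `targets` computed offline; only `estimate` would run at query time, which is why inference does not need to access the dataset.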
