Although the retrieval effectiveness of different queries is mutually independent, the evaluation of query performance prediction (QPP) systems has been carried out by measuring rank correlation over an entire set of queries. Such a listwise approach has a number of disadvantages, notably that it does not support the common requirement of assessing QPP for individual queries. In this paper, we propose a pointwise QPP framework that allows us to evaluate the quality of a QPP system for individual queries by measuring the deviation between each predicted value and the corresponding true value, and then aggregating the results over a set of queries. Our experiments demonstrate that this new approach leads to smaller variance in QPP evaluation across a range of different target metrics and retrieval models.
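The contrast between the conventional listwise evaluation and the pointwise framework described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's actual protocol: the sample scores, the use of absolute error as the per-query deviation, and mean aggregation are all assumptions made here for illustration.

```python
from scipy.stats import kendalltau

# Hypothetical per-query data: QPP predictions and the true values of the
# target retrieval metric (e.g. average precision) for five queries.
predicted = [0.42, 0.18, 0.77, 0.33, 0.60]
true_ap   = [0.39, 0.25, 0.70, 0.30, 0.65]

# Listwise evaluation: a single rank correlation computed over the whole
# query set, so no score is available for any individual query.
tau, _ = kendalltau(predicted, true_ap)
print(f"listwise Kendall's tau over all queries: {tau:.3f}")

# Pointwise evaluation: a per-query deviation (absolute error here, which is
# an assumption; other deviation measures could be plugged in) that can be
# inspected query by query and then aggregated over the set.
per_query_dev = [abs(p - t) for p, t in zip(predicted, true_ap)]
mean_dev = sum(per_query_dev) / len(per_query_dev)
print("per-query deviations:", [round(d, 3) for d in per_query_dev])
print(f"aggregated (mean) deviation: {mean_dev:.3f}")
```

In this sketch the pointwise view exposes which individual queries are poorly predicted, whereas the listwise correlation only summarizes agreement over the whole set.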