When asked, current large language models (LLMs) like ChatGPT claim that they can assist us with relevance judgments. Many researchers think this would not lead to credible IR research. In this perspective paper, we discuss possible ways for LLMs to assist human experts along with concerns and issues that arise. We devise a human-machine collaboration spectrum that allows categorizing different relevance judgment strategies, based on how much the human relies on the machine. For the extreme point of "fully automated assessment", we further include a pilot experiment on whether LLM-based relevance judgments correlate with judgments from trained human assessors. We conclude the paper by providing two opposing perspectives - for and against the use of LLMs for automatic relevance judgments - and a compromise perspective, informed by our analyses of the literature, our preliminary experimental evidence, and our experience as IR researchers. We hope to start a constructive discussion within the community to avoid a stalemate during review, where work is damned if it uses LLMs for evaluation and damned if it doesn't.