维基为什么:回答和解释原因和影响问题 (WikiWhy: Answering and Explaining Cause-and-Effect Questions) - 专知论文

会员服务 ·

0

自动问答 · 情景 · 语言模型化 · 知识 (knowledge) · 端到端 ·

2022 年 11 月 30 日

WikiWhy: Answering and Explaining Cause-and-Effect Questions

翻译：维基为什么:回答和解释原因和影响问题

Matthew Ho,Aditya Sharma,Justin Chang,Michael Saxon,Sharon Levy,Yujie Lu,William Yang Wang

As large language models (LLMs) grow larger and more sophisticated, assessing their "reasoning" capabilities in natural language grows more challenging. Recent question answering (QA) benchmarks that attempt to assess reasoning are often limited by a narrow scope of covered situations and subject matters. We introduce WikiWhy, a QA dataset built around a novel auxiliary task: explaining why an answer is true in natural language. WikiWhy contains over 9,000 "why" question-answer-rationale triples, grounded on Wikipedia facts across a diverse set of topics. Each rationale is a set of supporting statements connecting the question to the answer. WikiWhy serves as a benchmark for the reasoning capabilities of LLMs because it demands rigorous explicit rationales for each answer to demonstrate the acquisition of implicit commonsense knowledge, which is unlikely to be easily memorized. GPT-3 baselines achieve only 38.7% human-evaluated correctness in the end-to-end answer & explain condition, leaving significant room for future improvements.

翻译：随着大型语言模型(LLMs)的扩大和更加复杂,评估其自然语言“理性”能力的挑战性越来越大。最近的问题回答(QA)基准试图评估推理,往往受到范围狭窄的覆盖情况和主题事项的限制。我们引入了围绕新颖辅助任务的QA数据集WikiWatter:解释答案为何在自然语言中正确。WikiWiwhy基于维基百科事实,包含9,000多个“为什么”问答三重题。每个理由都是将问题与答案联系起来的一套支持性声明。WikiHewyes是LLMs推理能力的一个基准,因为它要求为每项答案提供严格的明确理由,以证明获得隐含的普通知识,而这种知识不可能轻易被记忆化。 GPT-3 基线在端对端回答和解释条件中只达到38.7%的人经过评价的正确度,为将来的改进留下很大空间。

0

相关内容

自动问答

自动问答（Question Answering, QA）是指利用计算机自动回答用户所提出的问题以满足用户知识需求的任务。不同于现有搜索引擎，问答系统是信息服务的一种高级形式，系统返回用户的不再是基于关键词匹配排序的文档列表，而是精准的自然语言答案。近年来，随着人工智能的飞速发展，自动问答已经成为倍受关注且发展前景广泛的研究方向。

知识荟萃

精品入门和进阶教程、论文和代码整理等

更多

查看相关VIP内容、论文、资讯等

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

20篇「ACL2020」最新论文抢先看！看自然语言处理2020在研究什么？

20篇「ACL2020」最新论文抢先看！看自然语言处理2020在研究什么？

专知会员服务

97+阅读 · 2020年4月10日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

专知

20+阅读 · 2018年6月29日

Ca2+依赖的蛋白酶Calpain对突触后谷氨酸受体的调控机制

国家自然科学基金

0+阅读 · 2015年12月31日

线粒体calpain-1调控活性氧产生在糖尿病心肌病发生中的作用

国家自然科学基金

0+阅读 · 2014年12月31日

Persephin在急性肾损伤中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

microRNA调节肿瘤抑制因子Caliban应答DNA损伤的机制

国家自然科学基金

1+阅读 · 2012年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

缺氧诱导NHE1参与calpain介导ABCA1降解及胆固醇逆转运障碍

国家自然科学基金

0+阅读 · 2012年12月31日

PGE2通过EP1-Snail信号转导通路增强肝癌细胞侵袭性的分子机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

HIF-1α20171;导急性肝衰竭信号转导途径及红景天苷干预机制的研究

国家自然科学基金

0+阅读 · 2009年12月31日

Ter94在Hedgehog信号转导途径中的作用机理

国家自然科学基金

0+阅读 · 2009年12月31日

NO3-胁迫下黄瓜CsNMAPK介导的信号转导途径研究

国家自然科学基金

0+阅读 · 2009年12月31日

Towards Answering Open-ended Ethical Quandary Questions

Arxiv

0+阅读 · 2023年2月1日

Large Language Models are Versatile Decomposers: Decompose Evidence and Questions for Table-based Reasoning

Arxiv

1+阅读 · 2023年1月31日

An Objective Metric for Explainable AI: How and Why to Estimate the Degree of Explainability

Arxiv

0+阅读 · 2023年1月31日

A psychological theory of explainability

Arxiv

16+阅读 · 2022年5月17日

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

Arxiv

15+阅读 · 2021年9月22日

Explainable Deep Learning: A Field Guide for the Uninitiated

Arxiv

51+阅读 · 2021年9月13日

HopRetriever: Retrieve Hops over Wikipedia to Answer Complex Questions

HopRetriever: Retrieve Hops over Wikipedia to Answer Complex Questions

Arxiv

10+阅读 · 2020年12月31日

Machine Reasoning Explainability

Arxiv

14+阅读 · 2020年9月1日

Directions for Explainable Knowledge-Enabled Systems

Directions for Explainable Knowledge-Enabled Systems

Arxiv

26+阅读 · 2020年3月17日

VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions

Arxiv

17+阅读 · 2018年3月20日

VIP会员

文章信息

相关主题

语言模型化

知识 (knowledge)

相关VIP内容

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

20篇「ACL2020」最新论文抢先看！看自然语言处理2020在研究什么？

20篇「ACL2020」最新论文抢先看！看自然语言处理2020在研究什么？

专知会员服务

97+阅读 · 2020年4月10日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

GPT-5如何对齐？从硬性拒绝到安全完成：走向以输出为中心的安全训练

【伯克利博士论文】超越人类监督的视觉智能

【ICCV2025】SO(3) 上连续非保守动力系统的预测

2025年中国数据要素行业发展研究报告

相关资讯

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

专知

20+阅读 · 2018年6月29日

相关论文

Towards Answering Open-ended Ethical Quandary Questions

Arxiv

0+阅读 · 2023年2月1日

Large Language Models are Versatile Decomposers: Decompose Evidence and Questions for Table-based Reasoning

Arxiv

1+阅读 · 2023年1月31日

An Objective Metric for Explainable AI: How and Why to Estimate the Degree of Explainability

Arxiv

0+阅读 · 2023年1月31日

A psychological theory of explainability

Arxiv

16+阅读 · 2022年5月17日

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

Arxiv

15+阅读 · 2021年9月22日

Explainable Deep Learning: A Field Guide for the Uninitiated

Arxiv

51+阅读 · 2021年9月13日

HopRetriever: Retrieve Hops over Wikipedia to Answer Complex Questions

HopRetriever: Retrieve Hops over Wikipedia to Answer Complex Questions

Arxiv

10+阅读 · 2020年12月31日

Machine Reasoning Explainability

Arxiv

14+阅读 · 2020年9月1日

Directions for Explainable Knowledge-Enabled Systems

Directions for Explainable Knowledge-Enabled Systems

Arxiv

26+阅读 · 2020年3月17日

VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions

Arxiv

17+阅读 · 2018年3月20日

相关基金

Ca2+依赖的蛋白酶Calpain对突触后谷氨酸受体的调控机制

国家自然科学基金

0+阅读 · 2015年12月31日

线粒体calpain-1调控活性氧产生在糖尿病心肌病发生中的作用

国家自然科学基金

0+阅读 · 2014年12月31日

Persephin在急性肾损伤中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

microRNA调节肿瘤抑制因子Caliban应答DNA损伤的机制

国家自然科学基金

1+阅读 · 2012年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

缺氧诱导NHE1参与calpain介导ABCA1降解及胆固醇逆转运障碍

国家自然科学基金

0+阅读 · 2012年12月31日

PGE2通过EP1-Snail信号转导通路增强肝癌细胞侵袭性的分子机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

HIF-1α20171;导急性肝衰竭信号转导途径及红景天苷干预机制的研究

国家自然科学基金

0+阅读 · 2009年12月31日

Ter94在Hedgehog信号转导途径中的作用机理

国家自然科学基金

0+阅读 · 2009年12月31日

NO3-胁迫下黄瓜CsNMAPK介导的信号转导途径研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员