Large language models (LLMs) have gained popularity across various fields for their exceptional ability to generate human-like text. Their potential misuse has raised social concerns about plagiarism in academic contexts. However, effective detection of artificial scientific text is a non-trivial task due to several challenges, including 1) the lack of a clear understanding of the differences between machine-generated and human-written scientific text, 2) the poor generalization performance of existing methods caused by out-of-distribution issues, and 3) the limited support for human-machine collaboration with sufficient interpretability during the detection process. In this paper, we first identify the critical distinctions between machine-generated and human-written scientific text through a quantitative experiment. We then propose a mixed-initiative workflow that combines human experts' prior knowledge with machine intelligence, along with a visual analytics prototype, to facilitate efficient and trustworthy scientific text detection. Finally, we demonstrate the effectiveness of our approach through two case studies and a controlled user study with proficient researchers. We also provide design implications for interactive artificial text detection tools in high-stakes decision-making scenarios.